World-class pytest engineer for Python: write/refactor tests, fix flakiness, design fixtures/markers, add coverage, speed up suites (collection/runtime), and optimize CI (GitHub Actions sharding, xdist parallelism, caching). Use when asked about pytest best practices, pytest 9.x features (subtests, strict mode, TOML config), pytest plugins (xdist/cov/asyncio/mock/httpx), or test performance/CI tuning.
After installing, this skill will be available to your AI coding assistant.
Verify installation:
npx agent-skills-cli list
Skill Instructions
pytest-dev
Produce high-signal, low-flake, fast pytest suites and CI configs, with an explicit focus on measurable wins (runtime, flake rate, coverage quality).
Default workflow
- Classify the tests
  - Unit: pure functions, no I/O (preferred)
  - Integration: DB/filesystem/multiprocess, slower but valuable
  - System/E2E: external services or UI, keep minimal and well-gated
- Identify boundaries
  - Time/clock, randomness, network, filesystem, DB, env vars, global state
- Pick the lightest seam
  - Prefer fakes/stubs over deep mocks; prefer dependency injection over patching internals
- Make it deterministic
  - Control time, seeds, tmp dirs; avoid order dependencies
- Measure before optimizing
  - Collection time vs runtime; quantify with `--durations` + a single baseline
- Harden for CI
  - Enforce marker discipline, strict config, timeouts, isolation for parallel
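As a rough illustration of "pick the lightest seam" and "make it deterministic", the sketch below injects a clock and a seeded RNG rather than patching `time` or `random`. `TokenIssuer` and `FakeClock` are hypothetical names invented for this example, not part of pytest or of this skill.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass
class TokenIssuer:
    """Hypothetical class under test: issues tokens that expire after `ttl` seconds."""

    now: Callable[[], float]  # injected clock (seam for time)
    rng: random.Random        # injected RNG (seam for randomness)
    ttl: float = 60.0

    def issue(self) -> dict:
        return {"id": self.rng.randrange(10**6), "expires_at": self.now() + self.ttl}

    def is_valid(self, token: dict) -> bool:
        return self.now() < token["expires_at"]


class FakeClock:
    """Hand-rolled fake: time moves only when the test advances it."""

    def __init__(self, start: float = 1_000.0) -> None:
        self.t = start

    def __call__(self) -> float:
        return self.t

    def advance(self, seconds: float) -> None:
        self.t += seconds


def test_token_expires_after_ttl():
    clock = FakeClock()
    issuer = TokenIssuer(now=clock, rng=random.Random(1234), ttl=60.0)
    token = issuer.issue()
    assert issuer.is_valid(token)
    clock.advance(59.9)  # still inside the TTL window
    assert issuer.is_valid(token)
    clock.advance(0.2)   # now past the TTL
    assert not issuer.is_valid(token)


def test_same_seed_gives_same_token():
    # Two issuers with the same seed and start time produce identical tokens.
    a = TokenIssuer(now=FakeClock(), rng=random.Random(1234)).issue()
    b = TokenIssuer(now=FakeClock(), rng=random.Random(1234)).issue()
    assert a == b
```

Because the fake is a plain object, neither test touches wall-clock time or global random state, so they should be order-independent and safe to run in parallel.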
Quick commands
Use python3 by default. If the project uses uv, prefer uv run python.
- Smallest repro: `python3 -m pytest path/to/test_file.py -q`
- First failure only: `python3 -m pytest -x --maxfail=1`
- Find slow tests: `python3 -m pytest --durations=20 --durations-min=0.5`
- Emit JUnit for CI: `python3 -m pytest --junitxml=reports/junit.xml`
- Parallelize on one machine (xdist): `python3 -m pytest -n auto --dist load`
Optimization playbook (high ROI)
- Reduce collection scope (`testpaths`, `norecursedirs`, avoid importing heavy modules at import time).
- Fix fixture scoping (move expensive setup up-scope; ensure isolation).
- Eliminate sleeps and retries (poll with timeouts; mock time).
- Parallelize safely (xdist; isolate worker resources: tmp/db ports).
- Shard in CI (split test files by historical timings; keep shards balanced).
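One possible shape for "poll with timeouts; mock time" from the playbook above is sketched below. `wait_until` is a hypothetical helper, not a pytest API; its `clock` and `sleep` parameters are an assumed design so that tests can substitute fake time and finish instantly.

```python
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.05,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds elapse."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def test_wait_until_returns_as_soon_as_ready():
    # Fake time: "sleeping" only advances a counter, so no wall-clock delay.
    now = [0.0]

    def fake_sleep(seconds: float) -> None:
        now[0] += seconds

    calls = []

    def ready() -> bool:
        calls.append(now[0])
        return len(calls) >= 3  # becomes ready on the third poll

    assert wait_until(ready, timeout=1.0, interval=0.1,
                      clock=lambda: now[0], sleep=fake_sleep)
    assert len(calls) == 3


def test_wait_until_times_out():
    now = [0.0]

    def fake_sleep(seconds: float) -> None:
        now[0] += seconds

    assert not wait_until(lambda: False, timeout=0.3, interval=0.1,
                          clock=lambda: now[0], sleep=fake_sleep)
```

In production code the defaults use `time.monotonic` and `time.sleep`; replacing a fixed `time.sleep(2)` with a bounded poll like this usually shortens the happy path while keeping an explicit upper limit.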
Use the bundled references
Read these when needed (keep SKILL.md lean):
- references/pytest_core.md: fixtures, markers, parametrization, strict mode, TOML config, subtests (pytest 9.x).
- references/plugins.md: plugin selection + usage patterns.
- references/performance.md: collection/runtime profiling and speedups.
- references/ci_github_actions.md: sharding, artifacts, caching, concurrency.
Use the bundled scripts
- scripts/junit_slowest.py: report slowest tests/files from JUnit XML.
- scripts/junit_split.py: split test files into N shards using JUnit timings.
- scripts/run_pytest_filelist.py: run pytest for a list of test files.
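To make the idea of timing-based sharding concrete, here is a minimal sketch of a greedy "longest file first" split. It is not the implementation of the bundled `scripts/junit_split.py`, which is not shown here; `split_into_shards` and the example file names are hypothetical, and the per-file timings are assumed to have been aggregated already from an earlier JUnit report.

```python
from __future__ import annotations

import heapq


def split_into_shards(timings: dict[str, float], shard_count: int) -> list[list[str]]:
    """Assign each file, longest first, to the shard with the smallest total so far."""
    if shard_count < 1:
        raise ValueError("shard_count must be >= 1")
    shards: list[list[str]] = [[] for _ in range(shard_count)]
    heap = [(0.0, i) for i in range(shard_count)]  # (total seconds, shard index)
    heapq.heapify(heap)
    # Sort by duration descending; the path breaks ties so output is deterministic.
    for path, seconds in sorted(timings.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        shards[idx].append(path)
        heapq.heappush(heap, (total + seconds, idx))
    return shards


# Hypothetical timings in seconds, keyed by test file.
example_timings = {
    "tests/test_a.py": 30.0,
    "tests/test_b.py": 20.0,
    "tests/test_c.py": 15.0,
    "tests/test_d.py": 10.0,
    "tests/test_e.py": 5.0,
}
example_shards = split_into_shards(example_timings, 2)
# Both shards total 40 seconds:
# [["tests/test_a.py", "tests/test_d.py"],
#  ["tests/test_b.py", "tests/test_c.py", "tests/test_e.py"]]
```

The greedy rule is a heuristic, not an optimal partition, but it tends to keep shards reasonably balanced. Files with no recorded timing would need a default estimate, which this sketch leaves out.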
Quality gates
- Tests pass in a clean environment (no hidden dependency on local state).
- No network/time dependency without explicit control.
- Parallel-safe or explicitly marked/serialized.
- CI emits machine-readable artifacts when relevant (JUnit, coverage).
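For the "parallel-safe" gate, one common approach is to derive per-worker resource names from the `PYTEST_XDIST_WORKER` environment variable, which pytest-xdist sets to values such as `gw0` and `gw1` in worker processes. The helper names, the base port, and the `test_db` prefix below are hypothetical choices for illustration only.

```python
import os
from typing import Mapping, Optional


def worker_index(env: Optional[Mapping[str, str]] = None) -> int:
    """Return 0 when xdist is not in use or for gw0, 1 for gw1, and so on."""
    env = os.environ if env is None else env
    worker = env.get("PYTEST_XDIST_WORKER", "")
    suffix = worker[2:]
    if worker.startswith("gw") and suffix.isdigit():
        return int(suffix)
    return 0


def worker_port(base_port: int = 15000, env: Optional[Mapping[str, str]] = None) -> int:
    """Give each worker its own port by offsetting an assumed base port."""
    return base_port + worker_index(env)


def worker_db_name(prefix: str = "test_db", env: Optional[Mapping[str, str]] = None) -> str:
    """Give each worker its own database name; 'main' means no xdist worker."""
    env = os.environ if env is None else env
    return f"{prefix}_{env.get('PYTEST_XDIST_WORKER', 'main')}"
```

A session-scoped fixture could call these helpers when starting a local server or creating a database, so that workers do not collide on shared ports or schemas. Tests that cannot be isolated this way are better marked and run serially.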
