Lands L3 of the agent-first coverage architecture (ADR-020) — the
mutation-testing layer. Stryker on entities + use-cases (the pure
business-logic surface) catches the third dimension of test quality:
tests that exist + execute the code but assert nothing.
Deps (root devDependencies):
- @stryker-mutator/core ^8.7.0
- @stryker-mutator/vitest-runner ^8.7.0
Shared base: packages/core-testing/stryker.base.json
- testRunner: vitest (uses each feature's vitest.config.ts)
- mutate: src/entities/** + src/application/use-cases/** (excludes
tests, factories, contracts)
- thresholds: high 90 / low 80 / break 80
- reporters: progress + html + json (reports/mutation/{index.html,
mutation.json})
- incremental mode enabled, concurrency 4, timeout 10s
- exposed via @repo/core-testing/stryker.base.json subpath export
Per-feature config: packages/auth/stryker.config.json
- 4-line file that extends the shared base
- Proof-of-concept; other features get a config when L0 unification
closes their existing test gaps
Driver: scripts/coverage/mutate.mjs (zero-dep Node ESM)
- discoverStrykerConfigs: walks packages/* and apps/* for
stryker.config.json
- Supports --filter <name>, --since <ref> (incremental), --json
- Runs Stryker per-feature via node_modules/.bin/stryker run
- Surfaces per-package pass/fail summary; exits 1 on any failure
- Tests: scripts/coverage/mutate.test.mjs (3 tests, all green)
CI: .github/workflows/mutation-nightly.yml
- Cron at 02:30 UTC + workflow_dispatch with filter input
- Uploads reports/mutation/** as artifact (30-day retention)
- On failure, opens a tracking issue labelled mutation-testing
- permissions: contents: read, issues: write
- 60-min timeout (Stryker is slow by design)
Generator: turbo gen feature now scaffolds stryker.config.json from
turbo/generators/templates/feature/stryker.config.json.hbs — new
features ship mutation-ready out of the box.
Guide: docs/guides/coverage.md L3 section fleshed out with run
syntax, config shape, base config inventory, CI behavior, and a
"what you're looking for" primer on mutation scores.
Lockfile churn: pnpm regenerated the lockfile for the new deps;
~5K-line net reduction is collateral (pnpm version drift) but
mechanical.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Coverage
Architecture: ADR-020. Glossary: docs/glossary.md → Coverage.
The agent-first coverage architecture has four layers. This guide is the day-to-day reference for working with them.
The four layers at a glance
| Layer | Question it answers | Command |
|---|---|---|
| L0 Per-layer vitest thresholds | "Did the last test run meet the declared bands?" | pnpm test -- --coverage |
| L1 Diff coverage | "Did this PR/slice cover its own changed lines?" | pnpm coverage:diff |
| L2 Aggregate trend | "How is coverage trending across the repo?" | pnpm coverage:aggregate → coverage/summary.json |
| L3 Mutation testing | "Do my tests actually assert anything?" | pnpm mutate (opt-in, not in default pnpm test) |
Each layer answers a distinct question. They compose; none replaces the others.
Single source of truth: feature.manifest.ts
Every feature declares its coverage expectations once, in feature.manifest.ts:
export const myFeatureManifest = defineFeature({
  // ...
  coverage: {
    bands: {
      baseline: { statements: 80, branches: 75, functions: 80, lines: 80 },
      entities: { statements: 100, branches: 100, functions: 100, lines: 100 },
      "use-cases": {
        statements: 100,
        branches: 95,
        functions: 100,
        lines: 100,
      },
      controllers: {
        statements: 100,
        branches: 95,
        functions: 100,
        lines: 100,
      },
    },
    mutationTargets: ["entities", "use-cases"],
  },
} as const);
Three readers pick this up:
- vitest.config.ts uses vitestThresholdsFromBands(DEFAULT_COVERAGE_BANDS) from @repo/core-shared/conformance/coverage. Most features import DEFAULT_COVERAGE_BANDS directly (the manifest's coverage section matches the defaults). For features with custom bands, override at the vitest config too.
- assertFeatureConformance reads the manifest's bands at app boot and asserts the produced lcov meets them. (Boot wiring lands in the next story.)
- pnpm coverage:diff uses the bands for per-path expectations against the merged lcov.
Edit the manifest. The other readers pick up the change.
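To make the vitest reader more concrete, here is a minimal sketch of how a bands object could be turned into a Vitest coverage.thresholds value, with each non-baseline layer becoming a glob-keyed entry. The function name toVitestThresholds and the LAYER_GLOBS paths (the controllers path in particular) are assumptions made for this example. The real helper is vitestThresholdsFromBands and may work differently.

```javascript
// Hypothetical layer-to-glob map; the repo's real paths may differ.
const LAYER_GLOBS = {
  entities: "src/entities/**",
  "use-cases": "src/application/use-cases/**",
  controllers: "src/interface-adapters/controllers/**", // assumed path
};

// Sketch: baseline becomes the global thresholds, every other band
// becomes a glob-keyed threshold block (a shape Vitest accepts).
function toVitestThresholds(bands) {
  const { baseline, ...layers } = bands;
  const thresholds = { ...baseline };
  for (const [layer, band] of Object.entries(layers)) {
    const glob = LAYER_GLOBS[layer];
    if (!glob) throw new Error(`No glob known for layer "${layer}"`);
    thresholds[glob] = { ...band };
  }
  return thresholds;
}

const thresholds = toVitestThresholds({
  baseline: { statements: 80, branches: 75, functions: 80, lines: 80 },
  entities: { statements: 100, branches: 100, functions: 100, lines: 100 },
});
console.log(thresholds["src/entities/**"].branches); // 100
```

Failing loudly on an unknown layer is a deliberate choice in this sketch: a silently ignored band would look enforced while gating nothing.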
Daily workflow
Before pushing
pnpm test -- --coverage # L0 — per-package thresholds enforced
pnpm coverage:aggregate # L2 — produce coverage/lcov.info + summary.json
pnpm coverage:diff # L1 — fails if changed lines aren't covered
The diff coverage step compares against origin/main by default. To compare against a different base:
pnpm coverage:diff -- --base HEAD~1
pnpm coverage:diff -- --base origin/release
For machine consumption (e.g., the agent dispatch loop):
pnpm coverage:diff -- --json | jq .uncovered
Reading a failure
On failure, pnpm coverage:diff exits with code 1 and writes a human-readable summary to stderr and a JSON report to stdout:
stderr (human):
[coverage:diff] FAIL — 3 uncovered hit(s) across 2 file(s):
packages/blog/src/application/use-cases/publish-article.use-case.ts
uncovered lines: 47, 48
packages/auth/src/entities/models/session.ts
uncovered lines: 22
stdout (JSON, also written for the dispatch loop):
{
  "status": "fail",
  "summary": {
    "filesChanged": 4,
    "filesGated": 2,
    "uncoveredCount": 3
  },
  "fileSummaries": [...],
  "uncovered": [
    { "file": "...", "line": 47, "kind": "uncovered" },
    { "file": "...", "line": 48, "kind": "uncovered" },
    { "file": "...", "line": 22, "kind": "uncovered" }
  ]
}
kind is one of:
- uncovered: line is executable per lcov, execution count is 0
- no-coverage-data: entire file isn't in lcov (likely a new untested file)
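For a dispatch loop or a quick script, the report can be regrouped by file. The sketch below assumes only the uncovered array shown above. The helper name groupUncovered and the sample file names are made up for illustration, and it is an assumption that a no-coverage-data hit may carry no line number.

```javascript
// Sketch: group the "uncovered" array of the diff report by file.
function groupUncovered(report) {
  const byFile = new Map();
  for (const hit of report.uncovered ?? []) {
    const entry = byFile.get(hit.file) ?? { lines: [], missingData: false };
    if (hit.kind === "no-coverage-data") {
      entry.missingData = true; // whole file is absent from lcov
    } else {
      entry.lines.push(hit.line);
    }
    byFile.set(hit.file, entry);
  }
  return byFile;
}

// Sample input shaped like the report above; file names are hypothetical.
const sample = {
  status: "fail",
  uncovered: [
    { file: "a.ts", line: 47, kind: "uncovered" },
    { file: "a.ts", line: 48, kind: "uncovered" },
    { file: "b.ts", kind: "no-coverage-data" },
  ],
};
const grouped = groupUncovered(sample);
console.log(grouped.get("a.ts").lines); // [ 47, 48 ]
console.log(grouped.get("b.ts").missingData); // true
```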
Fixing an uncovered slice
- Read the JSON. For each uncovered hit, navigate to <file>:<line>.
- Identify which test would have exercised that line. Usually it's missing a branch case or an error path.
- Add the test (TDD: write failing test → make it green).
- Re-run pnpm test --coverage --filter @repo/<feature> to verify.
- Re-run pnpm coverage:diff to confirm exit 0.
For no-coverage-data hits, write the sibling test file. The ESLint conformance rule usecase-must-have-test-file will start failing anyway if you don't.
What's exempt (the allowlist)
The diff coverage gate skips:
- Test files (*.test.ts, *.test.tsx, *.test.mjs)
- Fixtures, factories, contracts, seeds (__fixtures__/, __factories__/, __contracts__/, __seeds__/)
- Config files (*.config.{ts,js,mjs,cjs}, package.json, tsconfig.*.json, turbo.json)
- Docs and data (*.md, *.json, *.yaml, .gitignore, .npmrc)
- Shell scripts (*.sh, *.bash)
- Dev tooling under scripts/ and turbo/generators/
- Per-feature excludes mirrored from vitest (di/bind-production.ts, application/repositories/**, application/services/**, integrations/cms/**, ui/**, *.interface.ts, index.ts barrels)
- Build artifacts (dist/, .next/, .turbo/, node_modules/, coverage/)
The allowlist lives in scripts/coverage/diff.mjs and is unit-tested.
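As a rough picture of what such an allowlist check can look like, the sketch below encodes a subset of the categories above as regular expressions. The names EXEMPT_PATTERNS and isExempt are hypothetical; the actual implementation in scripts/coverage/diff.mjs may be structured quite differently.

```javascript
// Hypothetical subset of the allowlist, expressed as regular expressions.
const EXEMPT_PATTERNS = [
  /\.test\.(ts|tsx|mjs)$/, // test files
  /(^|\/)__(fixtures|factories|contracts|seeds)__\//, // test support dirs
  /\.config\.(ts|js|mjs|cjs)$/, // config files
  /\.(md|json|yaml|sh|bash)$/, // docs, data, shell scripts
  /^(scripts|turbo\/generators)\//, // dev tooling
  /(^|\/)(dist|\.next|\.turbo|node_modules|coverage)\//, // build artifacts
  /(^|\/)index\.ts$/, // barrels
  /\.interface\.ts$/, // interface-only files
];

// Returns true when the changed file should be skipped by the diff gate.
function isExempt(path) {
  const normalized = path.replaceAll("\\", "/"); // tolerate Windows separators
  return EXEMPT_PATTERNS.some((pattern) => pattern.test(normalized));
}

console.log(isExempt("packages/auth/src/entities/models/session.test.ts")); // true
console.log(isExempt("packages/auth/src/entities/models/session.ts")); // false
```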
Adjusting bands
To raise the bar on a feature
Edit packages/<feature>/src/feature.manifest.ts:
coverage: {
  bands: {
    baseline: { statements: 90, branches: 85, functions: 90, lines: 90 }, // tighter
    entities: { statements: 100, branches: 100, functions: 100, lines: 100 },
    "use-cases": { statements: 100, branches: 100, functions: 100, lines: 100 }, // bumped branches
    controllers: { statements: 100, branches: 95, functions: 100, lines: 100 },
  },
}
If the new bands are stricter than the defaults, also update packages/<feature>/vitest.config.ts to use vitestThresholdsFromManifest(myFeatureManifest) instead of DEFAULT_COVERAGE_BANDS. (Note: importing the manifest from a vitest config has tooling constraints, so keep DEFAULT_COVERAGE_BANDS as the default path and switch only when the bands actually diverge.)
To skip a layer
Omit it from bands. The layer falls through to baseline:
coverage: {
  bands: {
    baseline: { ... },
    entities: { ... },
    // controllers omitted -> matches baseline
  },
}
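The fall-through rule can be pictured as a one-line lookup. This is a minimal sketch; resolveBand is a hypothetical name, not a helper the repo is known to export.

```javascript
// Sketch: pick the band for a layer, falling back to baseline when omitted.
function resolveBand(bands, layer) {
  return bands[layer] ?? bands.baseline;
}

const bands = {
  baseline: { statements: 80, branches: 75, functions: 80, lines: 80 },
  entities: { statements: 100, branches: 100, functions: 100, lines: 100 },
  // controllers omitted on purpose
};
console.log(resolveBand(bands, "controllers").branches); // 75
console.log(resolveBand(bands, "entities").branches); // 100
```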
CI behavior
.github/workflows/ci.yml (validate job) runs three coverage steps after the test step:
- Test with coverage: produces per-package coverage/lcov.info
- Coverage — aggregate (L2): merges to root coverage/lcov.info + coverage/summary.json
- Coverage — diff (L1): only on pull requests, diffs against origin/<base-ref>
On merge to main, .github/workflows/coverage-snapshot.yml re-aggregates and commits the updated coverage/summary.json back to main. Trend history accumulates via git log -- coverage/summary.json.
Reading the trend
git log --oneline --follow -- coverage/summary.json | head -10
git show <sha> -- coverage/summary.json | grep -E '"statements"|"branches"'
coverage/summary.json is the only committed coverage artifact. Each snapshot includes:
- generatedAt: ISO timestamp
- commit: short SHA
- repo: repo-wide percentages + raw counts
- byPackage: per-package percentages, keyed by @repo/<name>
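Two snapshots can be compared to see the direction of travel. The sketch below assumes each metric lives at snapshot.repo[metric].pct. That nesting is a guess for illustration only, since the exact layout of the repo object is not spelled out here, and repoDelta is a hypothetical helper.

```javascript
const METRICS = ["statements", "branches", "functions", "lines"];

// Sketch: percentage-point change per metric between two snapshots.
// Assumes snapshot.repo[metric].pct exists; the real field names may differ.
function repoDelta(previous, current) {
  const delta = {};
  for (const metric of METRICS) {
    const diff = current.repo[metric].pct - previous.repo[metric].pct;
    delta[metric] = Math.round(diff * 100) / 100; // keep two decimals
  }
  return delta;
}

// Made-up snapshots, trimmed to the fields the sketch reads.
const pct = (s, b, f, l) => ({
  repo: {
    statements: { pct: s },
    branches: { pct: b },
    functions: { pct: f },
    lines: { pct: l },
  },
});
const delta = repoDelta(pct(90.5, 80, 95, 90), pct(91.25, 80, 95, 90));
console.log(delta.statements); // 0.75
```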
Mutation testing (L3)
Stryker runs mutation testing on entities/ and application/use-cases/, the pure business-logic surface. It is not part of pnpm test (too slow); it runs on demand and nightly via GitHub Actions.
Running
pnpm mutate # every feature with a stryker.config.json
pnpm mutate -- --filter @repo/auth # one feature
pnpm mutate -- --since main # incremental against base ref
pnpm mutate -- --json # machine-readable summary
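For orientation, the three flags above could be parsed along these lines. This is only a sketch: parseMutateArgs is a hypothetical name, and the actual parsing in scripts/coverage/mutate.mjs may differ.

```javascript
// Sketch of flag parsing for the three documented options.
function parseMutateArgs(argv) {
  const options = { filter: null, since: null, json: false };
  for (let i = 0; i < argv.length; i += 1) {
    const arg = argv[i];
    if (arg === "--json") {
      options.json = true;
    } else if (arg === "--filter" || arg === "--since") {
      const value = argv[i + 1];
      if (value === undefined || value.startsWith("--")) {
        throw new Error(`${arg} requires a value`);
      }
      options[arg.slice(2)] = value; // "--filter" -> "filter"
      i += 1; // skip the consumed value
    } else {
      throw new Error(`Unknown argument: ${arg}`);
    }
  }
  return options;
}

const parsed = parseMutateArgs(["--filter", "@repo/auth", "--json"]);
console.log(parsed); // { filter: '@repo/auth', since: null, json: true }
```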
Configuration
Each feature has a slim stryker.config.json that extends the shared base:
{
  "$schema": "../../node_modules/@stryker-mutator/core/schema/stryker-schema.json",
  "extends": "@repo/core-testing/stryker.base.json"
}
The base lives at packages/core-testing/stryker.base.json and defines:
- Test runner: vitest (uses each feature's vitest.config.ts)
- Scope: src/entities/**/*.ts and src/application/use-cases/**/*.ts (excludes tests/factories/contracts)
- Thresholds: high 90 / low 80 / break 80 (break is the fail threshold)
- Reporters: progress, html (reports/mutation/index.html), json (reports/mutation/mutation.json)
- Incremental mode: enabled (subsequent runs skip mutants whose source + tests haven't changed)
- Concurrency: 4 workers
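The three thresholds play different roles: high and low only colour the report, while break decides the exit code. A minimal sketch of that interpretation follows, using the base config's values as defaults. classifyScore is a hypothetical helper written for illustration, not part of Stryker.

```javascript
// Sketch of how the three thresholds are commonly interpreted.
function classifyScore(score, thresholds = { high: 90, low: 80, break: 80 }) {
  let band = "danger"; // below low
  if (score >= thresholds.high) band = "ok";
  else if (score >= thresholds.low) band = "warning";
  // Only "break" affects the exit code; null disables the gate.
  const failsBuild = thresholds.break !== null && score < thresholds.break;
  return { band, failsBuild };
}

console.log(classifyScore(92)); // { band: 'ok', failsBuild: false }
console.log(classifyScore(85)); // { band: 'warning', failsBuild: false }
console.log(classifyScore(79.9)); // { band: 'danger', failsBuild: true }
```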
To override per feature (rare), add fields to the feature's stryker.config.json:
{
  "extends": "@repo/core-testing/stryker.base.json",
  "thresholds": { "high": 95, "low": 85, "break": 85 },
  "mutate": ["src/entities/**/*.ts"]
}
CI: nightly run + on-demand
.github/workflows/mutation-nightly.yml runs Stryker across every feature nightly at 02:30 UTC and on workflow_dispatch. The dispatch UI accepts a filter input (e.g. @repo/auth) for targeted reruns. Reports are uploaded as the mutation-reports artifact (30-day retention). When a run fails (for example, a feature's score falls below its break threshold), it opens a tracking issue labelled mutation-testing.
What you're looking for
Stryker's mutation.json records the status of every mutant per file, which yields the mutation score (roughly killed mutants / total). A surviving mutant means the mutator changed the source code (e.g., < → <=, && → ||, or a removed line), Stryker reran the tests, and they still passed. That points to a test that exists and executes the code but doesn't actually assert the behavior.
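A small, self-contained illustration of a surviving boundary mutant follows. The isExpired helper and the session shape are hypothetical, and the mutant is written out by hand to show what a relational-operator mutation would look like.

```javascript
// Hypothetical entity helper: a session is expired once now reaches expiresAt.
function isExpired(session, now) {
  return now >= session.expiresAt;
}

// A mutant of the kind Stryker might generate: ">=" replaced with ">".
function isExpiredMutant(session, now) {
  return now > session.expiresAt;
}

const session = { expiresAt: 1000 };

// Weak check: only exercises a value far past the boundary, so the
// original and the mutant agree and the mutant survives.
const weakKills = isExpired(session, 5000) !== isExpiredMutant(session, 5000);

// Strong check: probes the exact boundary, where the two disagree, so a
// test asserting isExpired(session, 1000) === true would kill the mutant.
const strongKills = isExpired(session, 1000) !== isExpiredMutant(session, 1000);

console.log({ weakKills, strongKills }); // { weakKills: false, strongKills: true }
```

Both checks give the same line coverage, which is why L0 to L2 cannot tell them apart and L3 can.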
Fix: read the surviving mutant's diff in reports/mutation/index.html, identify the assertion that should have caught it, add the assertion.
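When the HTML report is inconvenient (for example in an agent loop), the same information can be pulled from the JSON. The sketch below assumes the report follows the mutation-testing report schema, with files keyed by path and each mutant carrying mutatorName, status, and location. It computes the score the way Stryker generally defines it: detected mutants (Killed plus Timeout) over valid mutants (detected plus Survived plus NoCoverage). summarizeReport and the sample data are made up for illustration.

```javascript
// Sketch: per-file mutation score and surviving mutants from a report
// object shaped like mutation.json ({ files: { path: { mutants: [...] } } }).
function summarizeReport(report) {
  const summary = {};
  for (const [file, { mutants }] of Object.entries(report.files)) {
    const count = (status) => mutants.filter((m) => m.status === status).length;
    const detected = count("Killed") + count("Timeout");
    const undetected = count("Survived") + count("NoCoverage");
    const valid = detected + undetected;
    summary[file] = {
      // Percentage with one decimal; null when nothing valid was mutated.
      score: valid === 0 ? null : Math.round((detected / valid) * 1000) / 10,
      survivors: mutants
        .filter((m) => m.status === "Survived")
        .map((m) => ({ mutator: m.mutatorName, line: m.location.start.line })),
    };
  }
  return summary;
}

// Made-up report fragment for demonstration.
const at = (line) => ({ start: { line, column: 1 } });
const report = {
  files: {
    "src/entities/session.ts": {
      mutants: [
        { mutatorName: "EqualityOperator", status: "Survived", location: at(22) },
        { mutatorName: "ConditionalExpression", status: "Killed", location: at(22) },
        { mutatorName: "BlockStatement", status: "Killed", location: at(20) },
        { mutatorName: "StringLiteral", status: "Killed", location: at(25) },
      ],
    },
  },
};
const result = summarizeReport(report)["src/entities/session.ts"];
console.log(result.score); // 75
console.log(result.survivors); // [ { mutator: 'EqualityOperator', line: 22 } ]
```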
Troubleshooting
"Cannot find module '@vitest/coverage-v8'" — your feature's package.json is missing @vitest/coverage-v8 as a dev dep. Add it. (This was the issue surfaced for media during the L0 audit.)
"Coverage for lines (X%) does not meet 'src/...' threshold (Y%)" — L0 failure. Real test gap. Either write the missing test or adjust the manifest band downward (rare; band relaxation should be justified).
pnpm coverage:diff says "lcov file not found" — run pnpm test -- --coverage && pnpm coverage:aggregate first. The diff script reads the merged root coverage/lcov.info.
coverage/summary.json differs every commit — expected. It includes generatedAt (ISO timestamp) and commit (SHA). The snapshot workflow only commits it when the underlying numbers change; in local dev, regenerating it shows diff noise.
Diff coverage flags a file I don't think should be gated — check the allowlist in scripts/coverage/diff.mjs. If the file genuinely shouldn't be gated, extend the allowlist (and the tests in diff.test.mjs).
Related
- ADR-020 — full architectural rationale
- ADR-011 — original TDD foundation (the thresholds originated here)
- PRD 2026-05-13-coverage-architecture — implementation seed with audit findings
- docs/glossary.md — canonical vocabulary
- docs/guides/conformance-quickref.md — sibling reference for the 5-gate conformance system