Architecture record for the agent-first coverage initiative seeded by the 2026-05-13 PRD. Captures the durable decisions:

- 4-layer architecture (L0 vitest, L1 diff, L2 aggregate, L3 mutation)
- Manifest-driven coverage band as single source of truth (vitest + assertFeatureConformance + pnpm coverage:diff all read from it)
- Cover-the-diff (changed lines), not cover-the-new-code
- Committed coverage/summary.json (no SaaS), trend via git log
- Mutation testing scoped to entities + use-cases, on-demand only
- Machine-first output format (JSON stdout, human stderr)

Glossary gets a new "Coverage" section with 7 entries (coverage band, L0-L3 layers, diff coverage, mutation testing, mutation score, coverage/summary.json), plus two relationship rows and a flagged ambiguity for "coverage" qualifiers.

prompt-context.sh hook gets a 9th keyword group: when a prompt mentions coverage / uncovered / lcov / mutation / stryker, the relevant ADR + guide path are injected as additional context for the turn.

This is the documentation layer of the coverage epic. Implementation (manifest schema, vitest auto-derive, scripts, boot assertion, mutation tooling) lands in subsequent stories.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ADR-020 — Agent-first coverage architecture (4 layers + manifest-driven thresholds)
- Status: Accepted
- Date: 2026-05-13
- Builds on: ADR-006 (vertical-feature-packages), ADR-011 (TDD foundation)
- PRD: docs/work/prds/2026-05-13-coverage-architecture.prd.md
Context
ADR-011 established the TDD foundation: per-package vitest configs with V8 coverage, a coverage baseline (80/75/80/80) and stricter per-layer bands (100% on entities/, application/use-cases/, interface-adapters/controllers/). The ESLint conformance rule usecase-must-have-test-file enforces that a test file exists. CI runs pnpm test -- --coverage and uploads **/coverage/lcov.info as an artifact.
This leaves five gaps that matter especially in an agent-first repo:
- No diff-coverage gate. A slice can ship without exercising its new lines. The ESLint rule only checks file presence; the per-package thresholds catch only large drops.
- No aggregate visibility. N separate lcov files; no merged view, no trend over time.
- Threshold declarations are duplicated across 5+ vitest.config.ts files. Drift is mechanical to spot (we did it during the 2026-05-13 brainstorm: @repo/media had no coverage: block at all; @repo/navigation failed its declared layer thresholds in entities + controllers).
- 100% coverage with weak assertions is invisible. Coverage doesn't measure whether tests would catch real regressions. Mutation testing, explicitly deferred in ADR-011, is the next signal.
- Coverage data isn't agent-readable. The HTML report serves humans; the dispatch loop has no way to ask "did my slice cover its diff?".
Decision
1. Adopt a 4-layer coverage architecture mirroring the 5-gate conformance philosophy (multi-latency, machine-readable, agent-first):
| Layer | Catches | Latency | Surface |
|---|---|---|---|
| L0 Per-layer vitest thresholds | Drift below declared bands | ~5–30s per package | pnpm test --coverage (existing) |
| L1 Diff coverage | Changed line not exercised | ~5s after L0 | pnpm coverage:diff; CI gate; dispatch post-task |
| L2 Aggregate trend | Drift across the codebase over time | ~10s | pnpm coverage:aggregate; committed coverage/summary.json |
| L3 Mutation testing | Tests that exist + execute the code + assert nothing | Minutes | pnpm mutate; on-demand, not default pnpm test |
Each layer answers a distinct question; none replaces the others.
2. Make feature.manifest.ts the single source of truth for coverage expectations. A new coverage: section per feature:
coverage: {
  bands: {
    "entities": { statements: 100, branches: 100, functions: 100, lines: 100 },
    "use-cases": { statements: 100, branches: 95, functions: 100, lines: 100 },
    "controllers": { statements: 100, branches: 95, functions: 100, lines: 100 },
    baseline: { statements: 80, branches: 75, functions: 80, lines: 80 },
  },
  mutationTargets: ["entities", "use-cases"],
}
Three readers consume the manifest:
- Vitest. vitest.config.ts imports the manifest's coverage and emits its thresholds. The duplicated block in 5 per-feature vitest configs goes away.
- assertFeatureConformance reads coverage/lcov.info for the package at boot and asserts each band. Graceful degradation in USE_DEV_SEED=true (warns rather than throws when lcov is absent).
- pnpm coverage:diff uses baseline for uncategorized files; stricter layer bands override per matching path glob.
This eliminates the duplication that caused the @repo/media drift and centralizes one decision in one place per feature.
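As a rough illustration of the Vitest reader, the sketch below turns a manifest coverage section into a thresholds object. It assumes the per-glob form of Vitest's coverage.thresholds option and a hypothetical layerGlobs mapping from band names to path globs (this ADR does not state how layers map to paths). The names CoverageManifest and toVitestThresholds are illustrative, not the repo's actual API.

```typescript
// Sketch only: type names, function names, and globs are assumptions.
type Band = { statements: number; branches: number; functions: number; lines: number };

interface CoverageManifest {
  bands: { baseline: Band } & Record<string, Band>;
  mutationTargets: string[];
}

// Hypothetical mapping from band name to the path glob it governs.
const layerGlobs: Record<string, string> = {
  entities: "src/entities/**",
  "use-cases": "src/application/use-cases/**",
  controllers: "src/interface-adapters/controllers/**",
};

// Baseline metrics become the top-level thresholds; each layer band is keyed by its glob.
function toVitestThresholds(
  coverage: CoverageManifest,
  globs: Record<string, string>,
): Record<string, number | Band> {
  const { baseline, ...layers } = coverage.bands;
  const thresholds: Record<string, number | Band> = { ...baseline };
  for (const [layer, band] of Object.entries(layers)) {
    const glob = globs[layer];
    if (glob === undefined) {
      throw new Error(`No path glob declared for band "${layer}"`);
    }
    thresholds[glob] = band;
  }
  return thresholds;
}

const manifestCoverage: CoverageManifest = {
  bands: {
    entities: { statements: 100, branches: 100, functions: 100, lines: 100 },
    "use-cases": { statements: 100, branches: 95, functions: 100, lines: 100 },
    controllers: { statements: 100, branches: 95, functions: 100, lines: 100 },
    baseline: { statements: 80, branches: 75, functions: 80, lines: 80 },
  },
  mutationTargets: ["entities", "use-cases"],
};

const thresholds = toVitestThresholds(manifestCoverage, layerGlobs);
```

In a per-feature vitest.config.ts, the returned object would presumably be passed as the coverage thresholds value, so that no numeric band is written by hand in the config.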
3. Diff coverage is cover-the-diff, not cover-the-new-code. Every changed executable line must have execution-count > 0 against the merged lcov. Modified-but-not-new lines count too — catches "agent edited code, didn't update the test." Allowlist: *.test.ts, *.config.*, *.md, *.json, *.mjs, plus the per-package exclude lists.
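The cover-the-diff rule can be sketched as a check of changed line numbers against the DA records of an lcov tracefile. This is a minimal sketch, not the planned scripts/coverage/diff.mjs: the function names are hypothetical, the changed-line map is assumed to be computed elsewhere (for example from a git diff), the regular expressions only approximate the allowlist above, and per-package exclude lists are omitted. Treating a changed file that is absent from lcov as fully uncovered is a conservative assumption of this sketch.

```typescript
// Sketch only: names and the missing-file policy are assumptions.
type LineHits = Map<number, number>; // line number -> execution count

// Parse the SF / DA / end_of_record records of an lcov tracefile.
function parseLcov(text: string): Map<string, LineHits> {
  const files = new Map<string, LineHits>();
  let current: LineHits | undefined;
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line.startsWith("SF:")) {
      const path = line.slice(3);
      current = files.get(path) ?? new Map<number, number>();
      files.set(path, current);
    } else if (line.startsWith("DA:") && current) {
      const [lineNo, count] = line.slice(3).split(",");
      const n = Number(lineNo);
      current.set(n, (current.get(n) ?? 0) + Number(count));
    } else if (line === "end_of_record") {
      current = undefined;
    }
  }
  return files;
}

// Approximation of the allowlist: *.test.ts, *.config.*, *.md, *.json, *.mjs.
const EXEMPT = [/\.test\.ts$/, /\.config\.[^/]+$/, /\.md$/, /\.json$/, /\.mjs$/];

function uncoveredChangedLines(
  changed: Map<string, number[]>,
  lcov: Map<string, LineHits>,
): { file: string; line: number }[] {
  const misses: { file: string; line: number }[] = [];
  for (const [file, lines] of changed) {
    if (EXEMPT.some((re) => re.test(file))) continue;
    const hits = lcov.get(file);
    for (const line of lines) {
      if (hits === undefined) {
        // File never appeared in coverage data: report every changed line.
        misses.push({ file, line });
        continue;
      }
      // A line without a DA record is not executable (comment, type, blank) and is skipped.
      if (hits.get(line) === 0) misses.push({ file, line });
    }
  }
  return misses;
}

const lcov = parseLcov("SF:src/a.ts\nDA:1,1\nDA:2,0\nDA:4,3\nend_of_record\n");
const misses = uncoveredChangedLines(
  new Map([
    ["src/a.ts", [2, 3, 4]],
    ["src/a.test.ts", [1]],
  ]),
  lcov,
);
```

Here only line 2 of src/a.ts is reported: line 3 has no DA record, line 4 was executed, and the test file is exempt. Modified and newly added lines are treated identically, which is the point of covering the diff instead of only new code.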
4. Aggregate trend ships in-tree, not via SaaS. coverage/summary.json is committed on merge to main. Trend readable via git log -- coverage/summary.json. No external service dependency; the dispatch loop can read history without a network call.
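One way the aggregate could be produced is by summing the LF (lines found) and LH (lines hit) records of each package's lcov file. The sketch below assumes a summary shape with per-package and total line percentages; the actual schema of coverage/summary.json is not defined in this ADR, and the function names are hypothetical.

```typescript
// Sketch only: the summary shape is an assumption, not the committed schema.
interface PackageSummary {
  linesFound: number;
  linesHit: number;
  linePct: number;
}

// Percentage rounded to two decimals; an empty package counts as fully covered.
function pct(hit: number, found: number): number {
  return found === 0 ? 100 : Math.round((hit / found) * 10000) / 100;
}

// Sum the LF and LH records of one lcov tracefile.
function summarizeLcov(text: string): PackageSummary {
  let linesFound = 0;
  let linesHit = 0;
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line.startsWith("LF:")) linesFound += Number(line.slice(3));
    else if (line.startsWith("LH:")) linesHit += Number(line.slice(3));
  }
  return { linesFound, linesHit, linePct: pct(linesHit, linesFound) };
}

function aggregate(
  lcovByPackage: Record<string, string>,
): { packages: Record<string, PackageSummary>; total: PackageSummary } {
  const packages: Record<string, PackageSummary> = {};
  let found = 0;
  let hit = 0;
  // Sorted package names keep the committed file stable between runs.
  const entries = Object.entries(lcovByPackage).sort(([a], [b]) => a.localeCompare(b));
  for (const [name, text] of entries) {
    const summary = summarizeLcov(text);
    packages[name] = summary;
    found += summary.linesFound;
    hit += summary.linesHit;
  }
  return { packages, total: { linesFound: found, linesHit: hit, linePct: pct(hit, found) } };
}

const summary = aggregate({
  "@repo/navigation":
    "SF:src/a.ts\nLF:10\nLH:8\nend_of_record\nSF:src/b.ts\nLF:10\nLH:10\nend_of_record\n",
  "@repo/media": "SF:src/m.ts\nLF:4\nLH:1\nend_of_record\n",
});
const serialized = JSON.stringify(summary, null, 2) + "\n";
```

Deterministic key order and fixed rounding matter here because the file is committed: a run that changes no coverage should produce no diff, so that git log -- coverage/summary.json shows only real movement.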
5. Mutation testing is opt-in and narrowly scoped. Stryker with @stryker-mutator/vitest-runner, runs on entities/ + application/use-cases/ only. Default mutation-score threshold 80% per feature (tunable per-manifest). Not part of pnpm test. Nightly GH Action surfaces score drift > 5%.
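To make the 80% threshold and the 5% drift check concrete, here is a minimal sketch of the arithmetic. It uses the common definition of mutation score (detected mutants over valid mutants, where timeouts count as detected) and assumes that "drift > 5%" means a drop of more than five percentage points against the previous run. The function names and the shape of MutantCounts are hypothetical and are not Stryker's API.

```typescript
// Sketch only: names and the percentage-point reading of "drift" are assumptions.
interface MutantCounts {
  killed: number;
  timeout: number;
  survived: number;
  noCoverage: number;
}

// score = detected / valid * 100, with detected = killed + timeout
// and valid = detected + survived + noCoverage.
function mutationScore(c: MutantCounts): number {
  const detected = c.killed + c.timeout;
  const valid = detected + c.survived + c.noCoverage;
  return valid === 0 ? 100 : (detected * 100) / valid;
}

// Returns a list of human-readable problems; an empty list means the check passes.
function checkMutation(
  score: number,
  previousScore: number | undefined,
  threshold = 80,
  maxDrop = 5,
): string[] {
  const problems: string[] = [];
  if (score < threshold) {
    problems.push(`score ${score.toFixed(1)} is below threshold ${threshold}`);
  }
  if (previousScore !== undefined && previousScore - score > maxDrop) {
    problems.push(`score dropped ${(previousScore - score).toFixed(1)} points since the last run`);
  }
  return problems;
}

const score = mutationScore({ killed: 40, timeout: 2, survived: 6, noCoverage: 2 });
const problems = checkMutation(score, 90);
```

With 42 of 50 valid mutants detected the score is 84, which clears the 80 threshold but sits six points below the previous run's 90, so only the drift problem is reported.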
6. Output format is machine-first. pnpm coverage:diff emits JSON to stdout; human-readable summary to stderr. The dispatch loop reads stdout; humans read stderr or the HTML report.
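The split between the two streams might look like the sketch below, which builds both payloads from a single report object. The DiffReport shape, the render function, and the exit-code convention are assumptions made for illustration; the ADR fixes only that JSON goes to stdout and the human summary to stderr.

```typescript
// Sketch only: the report shape and exit-code convention are assumptions.
interface DiffReport {
  ok: boolean;
  changedLines: number;
  coveredLines: number;
  misses: { file: string; line: number }[];
}

// Build both payloads from one report so the two views cannot disagree.
function render(report: DiffReport): { stdout: string; stderr: string; exitCode: number } {
  const stdout = JSON.stringify(report);
  const human = [
    `diff coverage: ${report.coveredLines}/${report.changedLines} changed lines covered`,
    ...report.misses.map((m) => `  uncovered ${m.file}:${m.line}`),
  ];
  return { stdout, stderr: human.join("\n"), exitCode: report.ok ? 0 : 1 };
}

const out = render({
  ok: false,
  changedLines: 3,
  coveredLines: 2,
  misses: [{ file: "src/entities/user.ts", line: 42 }],
});
const roundTripped = JSON.parse(out.stdout) as DiffReport;
```

A caller would write out.stdout to standard output and out.stderr to standard error. Keeping stdout free of anything but one JSON document lets the dispatch loop parse it without stripping log lines.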
Alternatives considered
- Codecov / Coveralls SaaS. Polished PR comments and trend dashboards, free for OSS. Rejected as the primary L2 store — adds an external dep, makes the dispatch loop dependent on a network call, and the PR-comment UX targets humans (not the primary consumer of this signal). Can be added later as gold-plating without disturbing the architecture.
- Cover-the-new-code instead of cover-the-diff. Lighter touch; ignores modified lines. Rejected — catches less drift. A slice that edits a use case without updating its test should fail, and cover-the-new-code wouldn't notice.
- Keep thresholds in per-package vitest configs. Status quo. Rejected — the 2026-05-13 audit found drift in 2 of 5 features (media had no block at all; navigation's block diverged subtly from the canonical). Manifest centralization is the only durable fix.
- Run mutation testing in default pnpm test. Rejected: Stryker on entities + use-cases takes minutes. Adding minutes to the default loop violates the constraint ("new gates must not add more than ~30s wall time"). On-demand is the right cadence; nightly catches drift.
- Mutation testing across all layers. Rejected for v1: repository/controller/integration code has too many environmental dependencies to mutate cleanly. Start narrow; expand if signal is high.
- Use ESLint or fallow for diff coverage. Rejected: diff coverage needs runtime data (which lines actually executed), not AST or filesystem state. It belongs alongside pnpm test, not in pnpm lint or pnpm fallow.
- Boot-time coverage assertion is too heavy. Considered. Counter-argument: the assertion is O(features × lcov-file-size), which means small numbers, ~200ms. The graceful degradation in dev mode means contributors aren't blocked. The payoff (coverage drift caught at the same latency as TypeScript brands) justifies the machinery.
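The boot-time assertion and its dev-mode degradation, discussed in the last alternative, can be sketched as a comparison of measured metrics against one band. This is a hypothetical helper, not the real assertFeatureConformance: the name assertBand, the options object, and the choice to always throw on a failed band (degrading only when coverage data is absent) are assumptions of this sketch.

```typescript
// Sketch only: helper name, options, and failure policy are assumptions.
type Metric = "statements" | "branches" | "functions" | "lines";
type Band = Record<Metric, number>;

function assertBand(
  layer: string,
  actual: Band | undefined,
  expected: Band,
  opts: { devSeed: boolean; warn: (msg: string) => void },
): void {
  if (actual === undefined) {
    const msg = `coverage data missing for "${layer}"`;
    // Dev mode degrades to a warning so contributors without lcov are not blocked.
    if (opts.devSeed) {
      opts.warn(msg);
      return;
    }
    throw new Error(msg);
  }
  const failures = (Object.keys(expected) as Metric[])
    .filter((m) => actual[m] < expected[m])
    .map((m) => `${m} ${actual[m]}% < ${expected[m]}%`);
  if (failures.length > 0) {
    throw new Error(`coverage band "${layer}" not met: ${failures.join(", ")}`);
  }
}

const useCaseBand: Band = { statements: 100, branches: 95, functions: 100, lines: 100 };
const warnings: string[] = [];

// Missing lcov in dev mode: warns and continues.
assertBand("use-cases", undefined, useCaseBand, {
  devSeed: true,
  warn: (m) => warnings.push(m),
});

// Measured metrics that satisfy the band: passes silently.
assertBand(
  "use-cases",
  { statements: 100, branches: 96, functions: 100, lines: 100 },
  useCaseBand,
  { devSeed: false, warn: (m) => warnings.push(m) },
);
```

The check itself is a handful of numeric comparisons per band, so the cost of the assertion is dominated by reading and parsing lcov, which is consistent with the cost estimate given in the counter-argument.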
Consequences
Positive:
- Every PR/task is gated on covering its own diff. An agent can no longer get a slice with unexercised changed lines past the CI gate.
- One source of truth per feature for coverage expectations. The @repo/media-style "no coverage block" drift can't recur.
- Trend history lives in the repo. git log -- coverage/summary.json answers "how has coverage moved over the last quarter?" without leaving the editor.
- Mutation testing on the highest-leverage layers (entities + use-cases, the pure-business-logic surface) raises the floor on test quality without slowing the dispatch loop.
- Machine-readable diff-coverage output integrates directly with the dispatch loop's post-task verification, completing the agent-first observability story.
- Coverage joins the 5-gate conformance philosophy as a first-class signal; ADR-020 becomes the row alongside TS brands / ESLint / boot / pnpm conformance / fallow / coverage.
Negative:
- Implementation surface is non-trivial: 6–8 stories spanning manifest schema, vitest auto-derive, two new scripts, boot-time assertion, mutation tooling, ADR + guide + glossary + generator + hook updates.
- The boot-time assertion adds a small dependency on coverage/lcov.info existing. Graceful degradation in dev mode handles this, but the implementation needs care.
- coverage/summary.json committed on merge introduces a small CI permissions surface (contents: write) gated to the main-branch workflow.
- Mutation testing is slow. The nightly cadence is the compromise; on-demand pnpm mutate is opt-in but rare in practice.
Implementation phasing
In order (each landable independently):
- L0 unification: fix @repo/media (missing dep + missing config block) and @repo/navigation (real test gaps) so every feature passes its declared bands today.
- Manifest schema: extend feature.manifest.ts shape (Zod schema in core-shared/conformance/) with the coverage: section.
- Vitest auto-derive: vitest.config.ts per feature imports the manifest and emits coverage.thresholds. Eliminates duplication.
- L1 diff coverage: scripts/coverage/diff.mjs + pnpm coverage:diff script + CI gate.
- L2 aggregate: scripts/coverage/aggregate.mjs + pnpm coverage:aggregate + summary.json + merge-to-main workflow.
- L3 mutation testing: Stryker setup + pnpm mutate + nightly GH Action.
- Boot-time assertFeatureConformance: coverage band read against lcov.
- Docs + generator + hook rollout: this ADR (now landing), docs/guides/coverage.md, glossary entries, pnpm turbo gen feature template update, .claude/hooks/prompt-context.sh keyword group.
Related
- ADR-006: vertical-feature-packages
- ADR-011: TDD foundation
- ADR-018: audit-and-compliance (similar manifest-declared shape pattern)
- ADR-019: sandcastle agent orchestration (the dispatch loop that reads pnpm coverage:diff)
- PRD 2026-05-13-coverage-architecture: implementation seed