# ADR-020: Agent-first coverage architecture (4 layers + manifest-driven thresholds)

**Status:** Accepted
**Date:** 2026-05-13
**Builds on:** ADR-006 (vertical-feature-packages), ADR-011 (TDD foundation)
**PRD:** docs/work/prds/2026-05-13-coverage-architecture.prd.md

## Context

ADR-011 established the TDD foundation: per-package vitest configs with V8 coverage, a coverage baseline (80/75/80/80 for statements/branches/functions/lines) and stricter per-layer bands (100% on `entities/`, `application/use-cases/`, `interface-adapters/controllers/`). The ESLint conformance rule `usecase-must-have-test-file` enforces only _that a test file exists_. CI runs `pnpm test -- --coverage` and uploads `**/coverage/lcov.info` as an artifact.

This leaves five gaps that matter especially in an agent-first repo:

1. **No diff-coverage gate.** A slice can ship without exercising its new lines. The ESLint rule only checks file presence; the per-package thresholds catch only large drops.
2. **No aggregate visibility.** There are N separate lcov files, with no merged view and no trend over time.
3. **Threshold declarations are duplicated** across 5+ `vitest.config.ts` files. Drift is mechanical to spot (we did it during the 2026-05-13 brainstorm: `@repo/media` had no `coverage:` block at all; `@repo/navigation` failed its declared layer thresholds in entities + controllers).
4. **100% coverage with weak assertions is invisible.** Coverage doesn't measure whether tests would catch real regressions. Mutation testing, explicitly deferred in ADR-011, is the next signal.
5. **Coverage data isn't agent-readable.** The HTML report serves humans; the dispatch loop has no way to ask "did my slice cover its diff?".

## Decision

**1. Adopt a 4-layer coverage architecture** mirroring the 5-gate conformance philosophy (multi-latency, machine-readable, agent-first):

| Layer | Catches | Latency | Surface |
| --- | --- | --- | --- |
| **L0** Per-layer vitest thresholds | Drift below declared bands | ~5–30s per package | `pnpm test --coverage` (existing) |
| **L1** Diff coverage | Changed line not exercised | ~5s after L0 | `pnpm coverage:diff`; CI gate; dispatch post-task |
| **L2** Aggregate trend | Drift across the codebase over time | ~10s | `pnpm coverage:aggregate`; committed `coverage/summary.json` |
| **L3** Mutation testing | Tests that exist + execute the code + assert nothing | Minutes | `pnpm mutate`; on-demand, not default `pnpm test` |

Each layer answers a distinct question; none replaces the others.

**2. Make `feature.manifest.ts` the single source of truth for coverage expectations.** A new `coverage:` section per feature:

```ts
coverage: {
  bands: {
    "entities": { statements: 100, branches: 100, functions: 100, lines: 100 },
    "use-cases": { statements: 100, branches: 95, functions: 100, lines: 100 },
    "controllers": { statements: 100, branches: 95, functions: 100, lines: 100 },
    // Fallback for files that match no layer band
    baseline: { statements: 80, branches: 75, functions: 80, lines: 80 },
  },
  // Layers that mutation testing (L3) runs against
  mutationTargets: ["entities", "use-cases"],
}
```

Three readers consume the manifest:

- **Vitest:** `vitest.config.ts` imports the manifest's `coverage` and emits its `thresholds`. The duplicated block in 5 per-feature vitest configs goes away.
- **`assertFeatureConformance`:** reads `coverage/lcov.info` for the package at boot and asserts each band. It degrades gracefully when `USE_DEV_SEED=true` (warns rather than throws when lcov is absent). This reader was deferred at implementation time; see step 10 under Implementation phasing.
- **`pnpm coverage:diff`:** uses `baseline` for uncategorized files; stricter layer bands override it per matching path glob.

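As an illustration of the Vitest reader, the following is a minimal sketch of how a helper might turn the manifest's `coverage.bands` into a thresholds object. It is not the shipped helper: the type names, the `toVitestThresholds` function, and the `LAYER_GLOBS` mapping (including the `src/` prefix) are assumptions made for this example. The sketch relies on Vitest accepting glob-pattern keys inside `coverage.thresholds`, which recent Vitest versions document.

```typescript
// Hypothetical shapes: the ADR shows the manifest fragment but not its types.
type Band = { statements: number; branches: number; functions: number; lines: number };

interface CoverageManifest {
  bands: { baseline: Band } & Record<string, Band>;
  mutationTargets: string[];
}

// Assumed layer-to-glob mapping; real package layouts may differ.
const LAYER_GLOBS: Record<string, string> = {
  entities: "src/entities/**",
  "use-cases": "src/application/use-cases/**",
  controllers: "src/interface-adapters/controllers/**",
};

// Baseline numbers become the top-level thresholds; each stricter
// layer band is keyed by the glob that selects that layer's files.
function toVitestThresholds(coverage: CoverageManifest): Record<string, number | Band> {
  const { baseline, ...layers } = coverage.bands;
  const thresholds: Record<string, number | Band> = { ...baseline };
  for (const [layer, band] of Object.entries(layers)) {
    const glob = LAYER_GLOBS[layer];
    if (glob === undefined) {
      // Fail loudly so a misspelled layer name cannot silently skip a band.
      throw new Error(`No glob registered for coverage layer "${layer}"`);
    }
    thresholds[glob] = band;
  }
  return thresholds;
}

// The manifest fragment from this ADR, used as sample input.
const example: CoverageManifest = {
  bands: {
    entities: { statements: 100, branches: 100, functions: 100, lines: 100 },
    "use-cases": { statements: 100, branches: 95, functions: 100, lines: 100 },
    controllers: { statements: 100, branches: 95, functions: 100, lines: 100 },
    baseline: { statements: 80, branches: 75, functions: 80, lines: 80 },
  },
  mutationTargets: ["entities", "use-cases"],
};

const thresholds = toVitestThresholds(example);
console.log(JSON.stringify(thresholds, null, 2));
```

Under these assumptions, a feature's `vitest.config.ts` would pass the result as `test.coverage.thresholds`, so no threshold number is written by hand in the config. Throwing on an unknown layer is a design choice for the sketch: it keeps a typo in the manifest from quietly removing a gate.
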
This eliminates the duplication that caused the `@repo/media` drift and keeps each feature's coverage expectations in one place.

**3. Diff coverage is cover-the-diff, not cover-the-new-code.** Every changed _executable_ line must have execution-count > 0 against the merged lcov. Modified-but-not-new lines count too, which catches the "agent edited code, didn't update the test" case. Exempt from the gate (allowlist): `*.test.ts`, `*.config.*`, `*.md`, `*.json`, `*.mjs`, plus the per-package exclude lists.

**4. Aggregate trend ships in-tree, not via SaaS.** `coverage/summary.json` is committed on merge to main. The trend is readable via `git log -- coverage/summary.json`. There is no external service dependency; the dispatch loop can read history without a network call.

**5. Mutation testing is opt-in and narrowly scoped.** Stryker with `@stryker-mutator/vitest-runner` runs on `entities/` + `application/use-cases/` only. The default mutation-score threshold is 80% per feature (tunable per manifest). It is not part of `pnpm test`. A nightly GitHub Action surfaces score drift > 5%.

**6. Output format is machine-first.** `pnpm coverage:diff` emits JSON to stdout and a human-readable summary to stderr. The dispatch loop reads stdout; humans read stderr or the HTML report.

## Alternatives considered

- **Codecov / Coveralls SaaS.** Polished PR comments and trend dashboards, free for OSS. Rejected as the _primary_ L2 store: it adds an external dependency, makes the dispatch loop dependent on a network call, and its PR-comment UX targets humans (not the primary consumer of this signal). It can be added later as gold-plating without disturbing the architecture.
- **Cover-the-new-code instead of cover-the-diff.** Lighter touch; ignores modified lines. Rejected because it catches less drift. A slice that edits a use case without updating its test should fail, and cover-the-new-code wouldn't notice.
- **Keep thresholds in per-package vitest configs.** Status quo.
Rejected: the 2026-05-13 audit found drift in 2 of 5 features (media had no block at all; navigation's block diverged subtly from the canonical one). Manifest centralization is the only durable fix.
- **Run mutation testing in default `pnpm test`.** Rejected: Stryker on entities + use-cases takes minutes. Adding minutes to the default loop violates the constraint ("new gates must not add more than ~30s wall time"). On-demand is the right cadence; nightly catches drift.
- **Mutation testing across all layers.** Rejected for v1: repository/controller/integration code has too many environmental dependencies to mutate cleanly. Start narrow; expand if signal is high.
- **Use ESLint or fallow for diff coverage.** Rejected: diff coverage needs runtime data (which lines actually executed), not AST or filesystem state. It belongs alongside `pnpm test`, not in `pnpm lint` or `pnpm fallow`.
- **Boot-time coverage assertion is too heavy.** Considered. Counter-argument: the assertion is `O(features × lcov-file-size)`, which means small numbers, ~200ms. The graceful degradation in dev mode means contributors aren't blocked. The payoff (coverage drift caught at the same latency as TypeScript brands) justifies the machinery.

## Consequences

**Positive:**

- Every PR/task is gated on covering its own diff. An agent shipping an untested slice becomes mechanically impossible at the CI step.
- One source of truth per feature for coverage expectations. The `@repo/media`-style "no coverage block" drift can't recur.
- Trend history lives in the repo. `git log -- coverage/summary.json` answers "how has coverage moved over the last quarter?" without leaving the editor.
- Mutation testing on the highest-leverage layers (entities + use-cases, the pure-business-logic surface) raises the floor on test quality without slowing the dispatch loop.
- Machine-readable diff-coverage output integrates directly with the dispatch loop's post-task verification, completing the agent-first observability story.
- Coverage joins the 5-gate conformance philosophy as a first-class signal; ADR-020 adds the coverage row alongside TS brands / ESLint / boot / `pnpm conformance` / fallow.

**Negative:**

- Implementation surface is non-trivial: 6–8 stories spanning manifest schema, vitest auto-derive, two new scripts, boot-time assertion, mutation tooling, ADR + guide + glossary + generator + hook updates.
- The boot-time assertion adds a small dependency on `coverage/lcov.info` existing. Graceful degradation in dev mode handles this, but the implementation needs care.
- `coverage/summary.json` committed on merge introduces a small CI permissions surface (`contents: write`) gated to the main-branch workflow.
- Mutation testing is slow. The nightly cadence is the compromise; on-demand `pnpm mutate` is opt-in but rare in practice.

## Implementation phasing

Shipped as a single epic over 10 commits on 2026-05-13. Per-step state:

| # | Step | Commit | Status |
| --- | --- | --- | --- |
| 1 | Manifest schema + helper + auth proof-of-concept | `f7baa8b` | ✅ Shipped |
| 2 | Vitest auto-derive (helper + `DEFAULT_COVERAGE_BANDS`) | `f7baa8b` + rollouts | ✅ Shipped (all 5 features wired) |
| 3 | L1 diff coverage (`scripts/coverage/diff.mjs`) | `412d994` | ✅ Shipped |
| 4 | L2 aggregate (`scripts/coverage/aggregate.mjs` + `summary.json`) | `bd5a077` | ✅ Shipped |
| 5 | CI integration (validate gate + snapshot workflow) | `39e33eb` | ✅ Shipped |
| 6 | Helper rollout to blog + marketing-pages | `15db9c4` | ✅ Shipped |
| 7 | Docs + generator + hook rollout | `4dce1df` + `f4254aa` | ✅ Shipped |
| 8 | L3 mutation testing (Stryker + nightly Action) | `6428f10` | ✅ Shipped (auth proof-of-concept; other features can add a `stryker.config.json` with `extends: "@repo/core-testing/stryker.base.json"`) |
| 9 | L0 unification (close test gaps in nav + media + marketing-pages) | `bf0b049` | ✅ Shipped: all 5 features hit declared bands |
| 10 | Boot-time `assertFeatureConformance` coverage check | n/a | ⏸ Deferred. Duplicates L0's structural enforcement when both readers derive from the same manifest source of truth; the drift it was supposed to catch is mechanically impossible. Revisit if a concrete need emerges. |

Repo-wide state at shipping (`coverage/summary.json`): statements 95.87% / branches 88.91% / functions 100% / lines 95.87%. All five features pass their declared 100%/100%/95%/100% bands on entities/use-cases/controllers.

## Related

- ADR-006: vertical-feature-packages
- ADR-011: TDD foundation
- ADR-018: audit-and-compliance (similar manifest-declared shape pattern)
- ADR-019: sandcastle agent orchestration (the dispatch loop that reads `pnpm coverage:diff`)
- PRD `2026-05-13-coverage-architecture`: implementation seed