Closes the staleness gap after the 10-commit coverage epic shipped.

Doc sync (item 1 from the user's choice):
- CLAUDE.md Quick Start: adds pnpm coverage:aggregate / coverage:diff
  / mutate to the command listing
- CLAUDE.md: new "Sibling architecture: coverage (ADR-020)" section
  after the conformance gate table; captures the 4-layer table +
  points at docs/guides/coverage.md + ADR-020 + says agents must run
  coverage:diff before reporting complete
- AGENTS.md preamble: now lists coverage as a parallel multi-latency
  quality system alongside conformance, with the same gate / latency
  framing
- PRD frontmatter: status draft -> shipped + shipped date +
  shipping-commits list (all 10 SHAs anchoring the trace)
- PRD findings table: each row gets a Resolution column citing the
  commit that closed it; conclusion text updated to past tense
- ADR-020 implementation phasing: rewritten as a status table with
  each step linked to the commit that shipped it + Boot-time
  assertFeatureConformance explicitly marked Deferred with rationale
- docs/guides/coverage.md: removed "Boot wiring lands in the next
  story" line; replaced with the deferral rationale + clarified
  that two readers (vitest, coverage:diff) consume the manifest

Sandcastle prompts (item 2 from the user's choice):
- .sandcastle/implementer.prompt.md: new "Coverage gates" section
  after the conformance-gates list, requiring `pnpm test --coverage`,
  `pnpm coverage:aggregate`, and `pnpm coverage:diff` to all pass
  before reporting `complete`. Machine-readable JSON shape of
  coverage:diff documented (status / uncovered[] / kind enum), with
  explicit instructions on how to interpret each kind. Allowlist
  expansion requires justification + test.
- .sandcastle/reviewer.prompt.md: AC coverage relabeled to "AC
  coverage (acceptance criteria, not test coverage)" to disambiguate;
  new check #7 "Coverage gates (ADR-020)" requiring CI's
  "Coverage — diff (L1)" step green + per-layer thresholds met +
  no silent allowlist expansion + manifest band drift detection.

Effect: future agent runs through sandcastle now treat coverage as a
first-class blocking gate, parallel to conformance. PRs no longer
discover coverage failures only via CI; the implementer is required
to check before reporting done, and the reviewer is required to
verify.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Coverage

> **Architecture:** [ADR-020](../decisions/adr-020-coverage-architecture.md). **Glossary:** [docs/glossary.md → Coverage](../glossary.md#coverage).

The agent-first coverage architecture has four layers. This guide is the day-to-day reference for working with them.

## The four layers at a glance

| Layer                              | Question it answers                              | Command                                              |
| ---------------------------------- | ------------------------------------------------ | ---------------------------------------------------- |
| **L0** Per-layer vitest thresholds | "Did the last test run meet the declared bands?" | `pnpm test -- --coverage`                            |
| **L1** Diff coverage               | "Did this PR/slice cover its own changed lines?" | `pnpm coverage:diff`                                 |
| **L2** Aggregate trend             | "How is coverage trending across the repo?"      | `pnpm coverage:aggregate` → `coverage/summary.json`  |
| **L3** Mutation testing            | "Do my tests actually assert anything?"          | `pnpm mutate` _(opt-in, not in default `pnpm test`)_ |

Each layer answers a distinct question. They compose; none replaces the others.

## Single source of truth: `feature.manifest.ts`

Every feature declares its coverage expectations once, in `feature.manifest.ts`:

```ts
export const myFeatureManifest = defineFeature({
  // ...
  coverage: {
    bands: {
      baseline: { statements: 80, branches: 75, functions: 80, lines: 80 },
      entities: { statements: 100, branches: 100, functions: 100, lines: 100 },
      "use-cases": {
        statements: 100,
        branches: 95,
        functions: 100,
        lines: 100,
      },
      controllers: {
        statements: 100,
        branches: 95,
        functions: 100,
        lines: 100,
      },
    },
    mutationTargets: ["entities", "use-cases"],
  },
} as const);
```

Two readers pick this up today:

1. **`vitest.config.ts`**: uses `vitestThresholdsFromBands(DEFAULT_COVERAGE_BANDS)` from `@repo/core-shared/conformance/coverage`. Most features import `DEFAULT_COVERAGE_BANDS` directly (the manifest's `coverage` section matches the defaults). For features with custom bands, override at the vitest config too.
2. **`pnpm coverage:diff`**: uses the bands for per-path expectations against the merged lcov.

(A third reader, a boot-time `assertFeatureConformance` coverage check, was specified in the PRD and explicitly deferred per ADR-020: when both readers above derive from the same manifest, the drift it was supposed to catch is mechanically impossible. The manifest's `coverage:` field remains the declarative source of truth regardless of how many readers consume it.)

**Edit the manifest. The other readers pick up the change.**

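To make the first reader concrete, here is a minimal sketch of how a helper in the spirit of `vitestThresholdsFromBands` could map bands onto vitest's `coverage.thresholds` option, which accepts glob keys alongside the global numbers. The function name `thresholdsFromBands`, the `LAYER_GLOBS` map, and the `controllers` path are assumptions for illustration; the real helper in `@repo/core-shared/conformance/coverage` may be shaped differently.

```typescript
type Band = { statements: number; branches: number; functions: number; lines: number };
type Layer = "entities" | "use-cases" | "controllers";
type Bands = { baseline: Band } & Partial<Record<Layer, Band>>;

// Hypothetical layer-to-glob mapping; the real paths may differ per repo.
const LAYER_GLOBS: Record<Layer, string> = {
  entities: "src/entities/**",
  "use-cases": "src/application/use-cases/**",
  controllers: "src/controllers/**",
};

// Baseline becomes the global thresholds; each declared layer becomes a glob entry.
function thresholdsFromBands(bands: Bands): Record<string, number | Band> {
  const thresholds: Record<string, number | Band> = { ...bands.baseline };
  for (const layer of Object.keys(LAYER_GLOBS) as Layer[]) {
    const band = bands[layer];
    if (band) {
      thresholds[LAYER_GLOBS[layer]] = { ...band };
    }
  }
  return thresholds;
}

const exampleThresholds = thresholdsFromBands({
  baseline: { statements: 80, branches: 75, functions: 80, lines: 80 },
  entities: { statements: 100, branches: 100, functions: 100, lines: 100 },
});
```

In this sketch a layer that is not declared gets no glob entry, so only the global baseline numbers apply to it.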
## Daily workflow

### Before pushing

```bash
pnpm test -- --coverage   # L0: per-package thresholds enforced
pnpm coverage:aggregate   # L2: produce coverage/lcov.info + summary.json
pnpm coverage:diff        # L1: fails if changed lines aren't covered
```

The diff coverage step compares against `origin/main` by default. To compare against a different base:

```bash
pnpm coverage:diff -- --base HEAD~1
pnpm coverage:diff -- --base origin/release
```

For machine consumption (e.g., the agent dispatch loop):

```bash
pnpm coverage:diff -- --json | jq .uncovered
```

### Reading a failure

On failure, `pnpm coverage:diff` exits with code 1 and writes a human-readable summary to stderr and a JSON report to stdout:

**stderr** (human):

```
[coverage:diff] FAIL — 3 uncovered hit(s) across 2 file(s):
  packages/blog/src/application/use-cases/publish-article.use-case.ts
    uncovered lines: 47, 48
  packages/auth/src/entities/models/session.ts
    uncovered lines: 22
```

**stdout** (JSON, also written for the dispatch loop):

```json
{
  "status": "fail",
  "summary": {
    "filesChanged": 4,
    "filesGated": 2,
    "uncoveredCount": 3
  },
  "fileSummaries": [...],
  "uncovered": [
    { "file": "...", "line": 47, "kind": "uncovered" },
    { "file": "...", "line": 48, "kind": "uncovered" },
    { "file": "...", "line": 22, "kind": "uncovered" }
  ]
}
```

`kind` is one of:

- `uncovered`: the line is executable per lcov and its execution count is 0
- `no-coverage-data`: the entire file is absent from lcov (likely a new untested file)

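The two kinds can be illustrated with a small sketch that intersects changed lines with lcov records. It relies only on the standard lcov tracefile records (`SF:`, `DA:<line>,<count>`, `end_of_record`); the function names and the choice to emit one `no-coverage-data` hit per changed line are assumptions, not a description of what `scripts/coverage/diff.mjs` actually does.

```typescript
type Kind = "uncovered" | "no-coverage-data";
interface UncoveredHit { file: string; line: number; kind: Kind }

// Parse lcov text into: file path -> (line number -> execution count).
function parseLcov(lcov: string): Map<string, Map<number, number>> {
  const files = new Map<string, Map<number, number>>();
  let current: Map<number, number> | undefined;
  for (const raw of lcov.split("\n")) {
    const record = raw.trim();
    if (record.startsWith("SF:")) {
      current = new Map<number, number>();
      files.set(record.slice(3), current);
    } else if (record.startsWith("DA:") && current) {
      const [lineNo, count] = record.slice(3).split(",");
      current.set(Number(lineNo), Number(count));
    } else if (record === "end_of_record") {
      current = undefined;
    }
  }
  return files;
}

// Classify each changed line; `changed` maps file path -> changed line numbers.
function findUncovered(lcov: string, changed: Record<string, number[]>): UncoveredHit[] {
  const coverage = parseLcov(lcov);
  const hits: UncoveredHit[] = [];
  for (const [file, lines] of Object.entries(changed)) {
    const fileCoverage = coverage.get(file);
    for (const line of lines) {
      if (!fileCoverage) {
        // Assumption: a file missing from lcov yields one hit per changed line.
        hits.push({ file, line, kind: "no-coverage-data" });
      } else if (fileCoverage.get(line) === 0) {
        // Lines with no DA record (comments, types) are not executable and are skipped.
        hits.push({ file, line, kind: "uncovered" });
      }
    }
  }
  return hits;
}

const sampleLcov = "SF:a.ts\nDA:1,3\nDA:2,0\nend_of_record\n";
const sampleHits = findUncovered(sampleLcov, { "a.ts": [1, 2, 5], "b.ts": [7] });
```

Here `a.ts` line 2 is reported as `uncovered`, line 5 is ignored because lcov has no record for it, and `b.ts` is reported as `no-coverage-data`.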
### Fixing an uncovered slice

1. Read the JSON. For each `uncovered` hit, navigate to `<file>:<line>`.
2. Identify which test would have exercised that line. Usually it's missing a branch case or an error path.
3. Add the test (TDD: write failing test → make it green).
4. Re-run `pnpm test --coverage --filter @repo/<feature>` to verify.
5. Re-run `pnpm coverage:diff` to confirm exit 0.

For `no-coverage-data` hits, write the sibling test file. The ESLint conformance rule `usecase-must-have-test-file` will start failing anyway if you don't.

### What's exempt (the allowlist)

The diff coverage gate skips:

- Test files (`*.test.ts`, `*.test.tsx`, `*.test.mjs`)
- Fixtures, factories, contracts, seeds (`__fixtures__/`, `__factories__/`, `__contracts__/`, `__seeds__/`)
- Config files (`*.config.{ts,js,mjs,cjs}`, `package.json`, `tsconfig.*.json`, `turbo.json`)
- Docs and data (`*.md`, `*.json`, `*.yaml`, `.gitignore`, `.npmrc`)
- Shell scripts (`*.sh`, `*.bash`)
- Dev tooling under `scripts/` and `turbo/generators/`
- Per-feature excludes mirrored from vitest (`di/bind-production.ts`, `application/repositories/**`, `application/services/**`, `integrations/cms/**`, `ui/**`, `*.interface.ts`, `index.ts` barrels)
- Build artifacts (`dist/`, `.next/`, `.turbo/`, `node_modules/`, `coverage/`)

The allowlist lives in `scripts/coverage/diff.mjs` and is unit-tested.

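As a rough illustration of how such an allowlist can be checked, the sketch below expresses a subset of the exemptions above as regular expressions. `EXEMPT_PATTERNS` and `isExempt` are hypothetical names, the pattern list is deliberately incomplete, and the real matcher in `scripts/coverage/diff.mjs` may work differently.

```typescript
// Hypothetical subset of the exemptions listed above, expressed as regexes.
const EXEMPT_PATTERNS: RegExp[] = [
  /\.test\.(ts|tsx|mjs)$/, // test files
  /(^|\/)__(fixtures|factories|contracts|seeds)__\//, // fixtures, factories, contracts, seeds
  /\.config\.(ts|js|mjs|cjs)$/, // config files
  /\.(md|json|yaml|sh|bash)$/, // docs, data, shell scripts
  /(^|\/)(dist|\.next|\.turbo|node_modules)\//, // build artifacts
  /\.interface\.ts$/, // interface-only files
  /(^|\/)index\.ts$/, // barrels
];

// A path is exempt from the diff gate if any pattern matches it.
function isExempt(path: string): boolean {
  return EXEMPT_PATTERNS.some((pattern) => pattern.test(path));
}

const gatedExample = isExempt("packages/auth/src/entities/models/session.ts");
const exemptExample = isExempt("packages/auth/src/entities/models/session.test.ts");
```

With these patterns `gatedExample` is `false` (the entity is gated) and `exemptExample` is `true` (its test file is skipped).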
## Adjusting bands

### To raise the bar on a feature

Edit `packages/<feature>/src/feature.manifest.ts`:

```ts
coverage: {
  bands: {
    baseline: { statements: 90, branches: 85, functions: 90, lines: 90 }, // tighter
    entities: { statements: 100, branches: 100, functions: 100, lines: 100 },
    "use-cases": { statements: 100, branches: 100, functions: 100, lines: 100 }, // bumped branches
    controllers: { statements: 100, branches: 95, functions: 100, lines: 100 },
  },
}
```

If the new bands are stricter than the defaults, also update `packages/<feature>/vitest.config.ts` to use `vitestThresholdsFromManifest(myFeatureManifest)` instead of `DEFAULT_COVERAGE_BANDS`. _(Note: importing the manifest from a vitest config has tooling constraints, so treat the `DEFAULT_COVERAGE_BANDS` route as the default path.)_

### To skip a layer

Omit it from `bands`. The layer falls through to `baseline`:

```ts
coverage: {
  bands: {
    baseline: { ... },
    entities: { ... },
    // controllers omitted -> matches baseline
  },
}
```

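The fall-through rule can be sketched in a few lines. `resolveBand` is a hypothetical helper name used only for illustration; it shows the lookup described above, not the repository's actual implementation.

```typescript
type Band = { statements: number; branches: number; functions: number; lines: number };
type Layer = "entities" | "use-cases" | "controllers";
type Bands = { baseline: Band } & Partial<Record<Layer, Band>>;

// A layer without its own band inherits the baseline band.
function resolveBand(bands: Bands, layer: Layer): Band {
  return bands[layer] ?? bands.baseline;
}

const exampleBands: Bands = {
  baseline: { statements: 80, branches: 75, functions: 80, lines: 80 },
  entities: { statements: 100, branches: 100, functions: 100, lines: 100 },
  // controllers omitted on purpose
};

const controllersBand = resolveBand(exampleBands, "controllers");
const entitiesBand = resolveBand(exampleBands, "entities");
```

Here `controllersBand` is the baseline band, while `entitiesBand` keeps its explicit 100% band.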
## CI behavior

`.github/workflows/ci.yml` (validate job) runs three coverage steps after the test step:

1. **Test with coverage**: produces per-package `coverage/lcov.info`
2. **Coverage — aggregate (L2)**: merges to root `coverage/lcov.info` + `coverage/summary.json`
3. **Coverage — diff (L1)**: only on pull requests, diffs against `origin/<base-ref>`

On merge to main, `.github/workflows/coverage-snapshot.yml` re-aggregates and commits the updated `coverage/summary.json` back to main. Trend history accumulates via `git log -- coverage/summary.json`.

## Reading the trend

```bash
git log --oneline --follow -- coverage/summary.json | head -10
git show <sha> -- coverage/summary.json | grep -E '"statements"|"branches"'
```

`coverage/summary.json` is the only committed coverage artifact. Each snapshot includes:

- `generatedAt`: ISO timestamp
- `commit`: short SHA
- `repo`: repo-wide percentages + raw counts
- `byPackage`: per-package percentages, keyed by `@repo/<name>`

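For scripted trend checks, two snapshots can be compared package by package. The sketch below is illustrative only: the four top-level field names come from the list above, but the inner `{ pct }` shape of each metric and the helper name `packageDeltas` are assumptions, so adjust the types to the real `coverage/summary.json` before relying on it.

```typescript
// Assumed inner shape; only the top-level field names are taken from the guide.
interface Metric { pct: number }
interface Totals { statements: Metric; branches: Metric; functions: Metric; lines: Metric }
interface CoverageSnapshot {
  generatedAt: string;
  commit: string;
  repo: Totals;
  byPackage: Record<string, Totals>;
}

// Percentage-point change per package for one metric; packages absent from `prev` are skipped.
function packageDeltas(
  prev: CoverageSnapshot,
  next: CoverageSnapshot,
  metric: keyof Totals = "lines",
): Record<string, number> {
  const deltas: Record<string, number> = {};
  for (const [name, totals] of Object.entries(next.byPackage)) {
    const before = prev.byPackage[name];
    if (before) {
      deltas[name] = Math.round((totals[metric].pct - before[metric].pct) * 100) / 100;
    }
  }
  return deltas;
}

// Tiny fixture builder so the example stays short.
const totalsOf = (pct: number): Totals => ({
  statements: { pct }, branches: { pct }, functions: { pct }, lines: { pct },
});

const older: CoverageSnapshot = {
  generatedAt: "2026-05-13T00:00:00Z", commit: "aaaaaaa",
  repo: totalsOf(90), byPackage: { "@repo/auth": totalsOf(90) },
};
const newer: CoverageSnapshot = {
  generatedAt: "2026-05-14T00:00:00Z", commit: "bbbbbbb",
  repo: totalsOf(91), byPackage: { "@repo/auth": totalsOf(92.5), "@repo/blog": totalsOf(80) },
};

const exampleDeltas = packageDeltas(older, newer);
```

In this fixture `@repo/auth` gains 2.5 percentage points and `@repo/blog` is skipped because it has no earlier snapshot.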
## Mutation testing (L3)

Stryker mutation testing runs on `entities/` and `application/use-cases/`, the pure-business-logic surface. It is not part of `pnpm test` because it is slow; it runs on demand and nightly via a GitHub Action.

### Running

```bash
pnpm mutate                          # every feature with a stryker.config.json
pnpm mutate -- --filter @repo/auth   # one feature
pnpm mutate -- --since main          # incremental against base ref
pnpm mutate -- --json                # machine-readable summary
```

### Configuration

Each feature has a slim `stryker.config.json` that extends the shared base:

```json
{
  "$schema": "../../node_modules/@stryker-mutator/core/schema/stryker-schema.json",
  "extends": "@repo/core-testing/stryker.base.json"
}
```

The base lives at `packages/core-testing/stryker.base.json` and defines:

- **Test runner**: vitest (uses each feature's `vitest.config.ts`)
- **Scope**: `src/entities/**/*.ts` and `src/application/use-cases/**/*.ts` (excludes tests/factories/contracts)
- **Thresholds**: high 90 / low 80 / break 80 (`break` is the fail threshold)
- **Reporters**: progress, html (`reports/mutation/index.html`), json (`reports/mutation/mutation.json`)
- **Incremental mode**: enabled (subsequent runs skip mutants whose source + tests haven't changed)
- **Concurrency**: 4 workers

To override per feature (rare), add fields to the feature's `stryker.config.json`:

```json
{
  "extends": "@repo/core-testing/stryker.base.json",
  "thresholds": { "high": 95, "low": 85, "break": 85 },
  "mutate": ["src/entities/**/*.ts"]
}
```

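To show how the three threshold values interact, here is a small sketch of the classification as Stryker's documentation describes it: a score at or above `high` is good, a score between `low` and `high` is a warning, a score below `low` is in the danger band, and a score below `break` fails the run. `classifyScore` and the `Verdict` type are hypothetical names for illustration, not part of Stryker's API.

```typescript
type Thresholds = { high: number; low: number; break: number | null };
type Verdict = { band: "high" | "warning" | "danger"; failsBuild: boolean };

// Classify a mutation score (0-100) against the configured thresholds.
function classifyScore(score: number, thresholds: Thresholds): Verdict {
  const band =
    score >= thresholds.high ? "high" : score >= thresholds.low ? "warning" : "danger";
  // A null `break` means the score never fails the run.
  const failsBuild = thresholds.break !== null && score < thresholds.break;
  return { band, failsBuild };
}

const baseThresholds: Thresholds = { high: 90, low: 80, break: 80 };
const warningVerdict = classifyScore(85, baseThresholds);
const failingVerdict = classifyScore(79, baseThresholds);
```

With the base values, a score of 85 is a warning that still passes, while 79 lands in the danger band and fails the run.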
### CI: nightly run + on-demand

`.github/workflows/mutation-nightly.yml` runs Stryker across every feature at 02:30 UTC and on `workflow_dispatch`. The dispatch UI accepts a `filter` input (e.g. `@repo/auth`) for targeted reruns. Reports are uploaded as the `mutation-reports` artifact (30-day retention). On meaningful score drops, the workflow opens a tracking issue labelled `mutation-testing`.

### What you're looking for

Stryker's `mutation.json` reports the **mutation score** (killed mutants / total) per file. A surviving mutant means the mutator changed the source code (e.g., `<` → `<=`, `&&` → `||`, or a removed line), reran the tests, and they STILL passed. That points to a test that exists and executes the code but doesn't actually assert behavior.

Fix: read the surviving mutant's diff in `reports/mutation/index.html`, identify the assertion that should have caught it, and add that assertion.

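A concrete case helps here. The sketch below uses a hypothetical `isExpired` rule and a hand-written mutant (a boundary operator change of the kind Stryker generates) to show why a test that never probes the boundary lets the mutant survive, and how one extra assertion kills it. All names are illustrative.

```typescript
// Hypothetical entity rule: a session is expired once `now` reaches `expiresAt`.
function isExpired(expiresAt: number, now: number): boolean {
  return now >= expiresAt;
}

// Hand-written stand-in for a generated mutant: `>=` became `>`.
function isExpiredMutant(expiresAt: number, now: number): boolean {
  return now > expiresAt;
}

type Impl = (expiresAt: number, now: number) => boolean;

// Weak test: only checks values far from the boundary, so both versions pass.
const weakTest = (impl: Impl): boolean =>
  impl(100, 200) === true && impl(100, 50) === false;

// Stronger test: adds the boundary case, which only the original satisfies.
const boundaryTest = (impl: Impl): boolean =>
  weakTest(impl) && impl(100, 100) === true;

const mutantSurvivesWeakTest = weakTest(isExpiredMutant);
const mutantKilledByBoundaryTest = !boundaryTest(isExpiredMutant);
const originalPassesBoundaryTest = boundaryTest(isExpired);
```

The weak test covers every line of `isExpired`, so line coverage looks perfect, yet only the boundary assertion distinguishes the original from the mutant.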
## Troubleshooting

**"Cannot find module '@vitest/coverage-v8'"**: your feature's `package.json` is missing `@vitest/coverage-v8` as a dev dep. Add it. (This was the issue surfaced for media during the L0 audit.)

**"Coverage for lines (X%) does not meet 'src/...' threshold (Y%)"**: L0 failure, a real test gap. Either write the missing test or adjust the manifest band downward (rare; band relaxation should be justified).

**`pnpm coverage:diff` says "lcov file not found"**: run `pnpm test -- --coverage && pnpm coverage:aggregate` first. The diff script reads the merged root `coverage/lcov.info`.

**`coverage/summary.json` differs every commit**: expected. It includes `generatedAt` (ISO timestamp) and `commit` (SHA). The snapshot workflow only commits it when the underlying numbers change; in local dev, regenerating it shows diff noise.

**Diff coverage flags a file I don't think should be gated**: check the allowlist in `scripts/coverage/diff.mjs`. If the file genuinely shouldn't be gated, extend the allowlist (and the tests in `diff.test.mjs`).

## Related

- [ADR-020](../decisions/adr-020-coverage-architecture.md): full architectural rationale
- [ADR-011](../decisions/adr-011-tdd-foundation.md): original TDD foundation (the thresholds originated here)
- [PRD 2026-05-13-coverage-architecture](../work/prds/2026-05-13-coverage-architecture.prd.md): implementation seed with audit findings
- [docs/glossary.md](../glossary.md): canonical vocabulary
- [docs/guides/conformance-quickref.md](./conformance-quickref.md): sibling reference for the 5-gate conformance system