diff --git a/.sandcastle/implementer.prompt.md b/.sandcastle/implementer.prompt.md
index 1fbefa9..7ae29c6 100644
--- a/.sandcastle/implementer.prompt.md
+++ b/.sandcastle/implementer.prompt.md
@@ -39,13 +39,34 @@ The generator handles step 1 + 2 for you when scaffolding a new feature.
 ```
 pnpm typecheck # TS brand-slot enforcement, 0s
 pnpm lint # ESLint rules incl. conformance/* — <1s
-pnpm test --filter @repo/ # tests for the feature you touched
+pnpm test --filter @repo/ -- --coverage # tests + per-layer thresholds for the feature you touched
 pnpm conformance # cross-feature event closure
 pnpm fallow:audit # whole-codebase analysis: dead exports, dupes, circular deps, complexity
 ```
 
 All five pass before you commit. If any fail, fix or report BLOCKED — do not paper over.
 
+## Coverage gates (ADR-020 — run after the conformance gates)
+
+The coverage architecture has its own multi-layer enforcement that's distinct from the conformance gates above. Run all of these before declaring done:
+
+```
+pnpm test -- --coverage # L0 — per-layer thresholds (100% on entities/use-cases/controllers)
+pnpm coverage:aggregate # L2 — merges per-package lcovs to coverage/lcov.info + coverage/summary.json
+pnpm coverage:diff -- --base # L1 — cover-the-diff: every changed line must be exercised
+```
+
+Treat `pnpm coverage:diff` output as machine-readable:
+
+- Exit 0 → pass; the JSON stdout has `status: "pass"`
+- Exit 1 → fail; the JSON stdout's `uncovered` array lists each `{ file, line, kind }` hit
+- `kind: "uncovered"` → write the missing test
+- `kind: "no-coverage-data"` → entire file isn't in lcov; you shipped untested code (a sibling test file is missing)
+
+Fix every hit before reporting `complete`. If you legitimately can't (e.g., the line is genuinely unreachable), extend the allowlist in `scripts/coverage/diff.mjs` AND add a test in `scripts/coverage/diff.test.mjs` — don't silently bypass.
+
+See `docs/guides/coverage.md` for the full architecture (4 layers) and the troubleshooting section. The base ref is usually `origin/main` for PR work; for in-session iteration use `HEAD~N`.
+
 ## Commit message format
 
 `(): `
diff --git a/.sandcastle/reviewer.prompt.md b/.sandcastle/reviewer.prompt.md
index 5fe2330..ae02e58 100644
--- a/.sandcastle/reviewer.prompt.md
+++ b/.sandcastle/reviewer.prompt.md
@@ -26,12 +26,16 @@ If you suspect the implementer hand-rolled what should have been generator outpu
 
 ## Your checks
 
-1. **AC coverage**: every checkbox in the task's AC list is verifiably satisfied by the diff. Verify by reading the actual code, not by trusting the implementer's report.
+1. **AC coverage** (acceptance criteria, not test coverage): every checkbox in the task's AC list is verifiably satisfied by the diff. Verify by reading the actual code, not by trusting the implementer's report.
 2. **Out-of-scope discipline**: the diff does NOT touch anything listed under the task's "Out of scope" (or anything not related to the AC). Over-engineering / drive-by refactors are rejection causes.
 3. **Manifest-first ordering**: if a new use case landed, the manifest was updated; tests exist; the factory was wrapped at bind time.
 4. **Conformance gates**: the diff's tests + lint + typecheck pass. (You don't run them yourself; sandcastle's CI step does. Trust the CI status, reject if it's red.)
 5. **Generator-first**: see the section above. Hand-rolled code that should have been generated is a rejection.
 6. **Fallow audit**: verify the implementer ran `pnpm fallow:audit` and it passed. If their diff increases dead exports / dupes / circular deps / complexity beyond the baseline, that's a rejection cause unless the implementer's notes explicitly justify it.
+7. **Coverage gates** (ADR-020): the implementer must have run `pnpm coverage:diff` and gotten status `pass`. The CI surfaces this as the "Coverage — diff (L1)" step; if it's red, reject. Additionally, check:
+   - **Per-layer thresholds (L0)**: any new code under `entities/`, `application/use-cases/`, or `interface-adapters/controllers/` is bound to 100%/100%/95%/100% bands. If the test run produced threshold errors, that's a rejection.
+   - **No silent allowlist expansion**: if `scripts/coverage/diff.mjs`'s `ALLOWED_GLOBS` grew, the implementer's notes must explain why (and the matching test fixture must exist in `scripts/coverage/diff.test.mjs`).
+   - **Manifest coverage band drift**: if `feature.manifest.ts` was edited, its `coverage:` section must match `DEFAULT_COVERAGE_BANDS` from `@repo/core-shared/conformance/coverage` (or carry an explicit override the implementer's notes justify).
 
 ## Output format
 
diff --git a/AGENTS.md b/AGENTS.md
index d292fe4..0c3f1e5 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -13,7 +13,12 @@ This template assumes agents (Claude, Codex, etc.) will author most feature work
 - `pnpm work dispatch --execute` — invoke sandcastle (requires `ANTHROPIC_API_KEY`)
 - `.sandcastle/` — 5 prompt templates (PRD eliciter, ADR eliciter, decomposer, implementer, reviewer); all enforce **generator-first** (`pnpm turbo gen ` over hand-rolling)
 
-Every feature has a `src/feature.manifest.ts` declaring its use cases. Every `bindProductionX(ctx)` and `bindDevSeedX(ctx)` self-asserts at its tail via `assertFeatureConformance(...)`. Five conformance gates catch drift at four latency tiers: TypeScript (0s), ESLint (<1s), boot (~3s), CI (`pnpm conformance`, `pnpm fallow`). See `docs/guides/conformance-quickref.md` and `docs/guides/runbook.md` for the full workflow.
+Every feature has a `src/feature.manifest.ts` declaring its use cases AND its coverage bands. Every `bindProductionX(ctx)` and `bindDevSeedX(ctx)` self-asserts at its tail via `assertFeatureConformance(...)`. Quality is enforced by two parallel multi-latency systems:
+
+- **Conformance** (5 gates) — TypeScript brands (0s), ESLint (<1s), boot (~3s), `pnpm conformance` (~120s), `pnpm fallow` (~30–60s). Catches manifest↔code drift. See `docs/guides/conformance-quickref.md`.
+- **Coverage** (4 layers, ADR-020) — L0 vitest thresholds, L1 `pnpm coverage:diff` (cover-the-diff gate), L2 `pnpm coverage:aggregate` → committed `coverage/summary.json`, L3 `pnpm mutate` (nightly). The manifest's `coverage.bands` is the single source of truth. See `docs/guides/coverage.md`.
+
+See `docs/guides/runbook.md` for the full workflow.
 
 ---
 
diff --git a/CLAUDE.md b/CLAUDE.md
index 7e74720..2eb5e71 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -12,6 +12,9 @@ pnpm lint # ESLint (incl. 8 conformance/* rules)
 pnpm conformance # Cross-feature event closure
 pnpm fallow # Whole-codebase: dead exports, dupes, complexity
 pnpm fallow:audit # AI-change audit (run before commits)
+pnpm coverage:aggregate # Merge per-package lcovs -> coverage/lcov.info + summary.json (L2)
+pnpm coverage:diff # Cover-the-diff gate; JSON to stdout (L1, ADR-020)
+pnpm mutate # Stryker mutation testing on entities + use-cases (L3, on-demand)
 pnpm turbo boundaries # Workspace dependency graph
 pnpm work status # docs/work/ epic + story state
 pnpm work next # Next ready story
@@ -72,6 +75,19 @@ The five conformance ESLint rules: `feature-must-have-manifest` (error), `usecas
 
 See `docs/architecture/agent-first-workflow-and-conformance.md` for the full design and `docs/guides/conformance-quickref.md` for the day-to-day reference.
 
+### Sibling architecture: coverage (ADR-020)
+
+Coverage runs in parallel to the 5-gate conformance system above — same multi-latency philosophy, different signal. Each feature's `feature.manifest.ts` declares a `coverage.bands` section that vitest (test-time), `pnpm coverage:diff` (CI/agent-loop), and `pnpm mutate` (nightly) all read from. Four layers:
+
+| Layer | Catches | Surface |
+| ---------------------------------- | ------------------------------------------------------------------- | ------------------------------------------------------------- |
+| **L0** Per-layer vitest thresholds | Drift below declared bands (entities/use-cases/controllers at 100%) | `pnpm test -- --coverage` |
+| **L1** Diff coverage | Changed line not exercised by tests | `pnpm coverage:diff` — CI-gated on PRs + dispatch post-task |
+| **L2** Aggregate trend | Codebase coverage drifted over time | `pnpm coverage:aggregate` → committed `coverage/summary.json` |
+| **L3** Mutation testing | Tests that exist + execute the code but assert nothing | `pnpm mutate` — on-demand + nightly GH Action |
+
+See `docs/guides/coverage.md` for the cookbook and ADR-020 for the full rationale. Agents running in sandcastle: run `pnpm coverage:diff` before reporting `complete` — the implementer and reviewer prompts enforce this.
+
 ## Key Conventions
 
 - **Relative imports in `src/`** — Source files use relative paths (`../repositories/...`), not `@/` alias
diff --git a/docs/decisions/adr-020-coverage-architecture.md b/docs/decisions/adr-020-coverage-architecture.md
index 052f62b..5ec07ef 100644
--- a/docs/decisions/adr-020-coverage-architecture.md
+++ b/docs/decisions/adr-020-coverage-architecture.md
@@ -90,16 +90,22 @@ This eliminates the duplication that caused the `@repo/media` drift and centrali
 
 ## Implementation phasing
 
-In order (each landable independently):
+Shipped as a single epic over 10 commits on 2026-05-13. Per-step state:
 
-1. **L0 unification** — fix `@repo/media` (missing dep + missing config block) and `@repo/navigation` (real test gaps) so every feature passes its declared bands today.
-2. **Manifest schema** — extend `feature.manifest.ts` shape (Zod schema in `core-shared/conformance/`) with the `coverage:` section.
-3. **Vitest auto-derive** — `vitest.config.ts` per feature imports the manifest and emits `coverage.thresholds`. Eliminates duplication.
-4. **L1 diff coverage** — `scripts/coverage/diff.mjs` + `pnpm coverage:diff` script + CI gate.
-5. **L2 aggregate** — `scripts/coverage/aggregate.mjs` + `pnpm coverage:aggregate` + summary.json + merge-to-main workflow.
-6. **L3 mutation testing** — Stryker setup + `pnpm mutate` + nightly GH Action.
-7. **Boot-time `assertFeatureConformance`** — coverage band read against lcov.
-8. **Docs + generator + hook rollout** — this ADR (now landing), `docs/guides/coverage.md`, glossary entries, `pnpm turbo gen feature` template update, `.claude/hooks/prompt-context.sh` keyword group.
+| # | Step | Commit | Status |
+| --- | ----------------------------------------------------------------- | --------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| 1 | Manifest schema + helper + auth proof-of-concept | `f7baa8b` | ✅ Shipped |
+| 2 | Vitest auto-derive (helper + `DEFAULT_COVERAGE_BANDS`) | `f7baa8b` + rollouts | ✅ Shipped (all 5 features wired) |
+| 3 | L1 diff coverage (`scripts/coverage/diff.mjs`) | `412d994` | ✅ Shipped |
+| 4 | L2 aggregate (`scripts/coverage/aggregate.mjs` + `summary.json`) | `bd5a077` | ✅ Shipped |
+| 5 | CI integration (validate gate + snapshot workflow) | `39e33eb` | ✅ Shipped |
+| 6 | Helper rollout to blog + marketing-pages | `15db9c4` | ✅ Shipped |
+| 7 | Docs + generator + hook rollout | `4dce1df` + `f4254aa` | ✅ Shipped |
+| 8 | L3 mutation testing (Stryker + nightly Action) | `6428f10` | ✅ Shipped (auth proof-of-concept; other features can add `stryker.config.json` by `extends: "@repo/core-testing/stryker.base.json"`) |
+| 9 | L0 unification (close test gaps in nav + media + marketing-pages) | `bf0b049` | ✅ Shipped — all 5 features hit declared bands |
+| 10 | Boot-time `assertFeatureConformance` coverage check | — | ⏸ Deferred. Duplicates L0's structural enforcement when both readers derive from the same manifest source of truth; the drift it was supposed to catch is mechanically impossible. Revisit if a concrete need emerges. |
+
+Repo-wide state at shipping (`coverage/summary.json`): statements 95.87% / branches 88.91% / functions 100% / lines 95.87%. All five features pass their declared 100%/100%/95%/100% bands on entities/use-cases/controllers.
 
 ## Related
 
diff --git a/docs/guides/coverage.md b/docs/guides/coverage.md
index 2f2a505..f9b6952 100644
--- a/docs/guides/coverage.md
+++ b/docs/guides/coverage.md
@@ -44,11 +44,12 @@ export const myFeatureManifest = defineFeature({
 } as const);
 ```
 
-Three readers pick this up:
+Two readers pick this up today:
 
 1. **`vitest.config.ts`** — uses `vitestThresholdsFromBands(DEFAULT_COVERAGE_BANDS)` from `@repo/core-shared/conformance/coverage`. Most features import `DEFAULT_COVERAGE_BANDS` directly (the manifest's `coverage` section matches the defaults). For features with custom bands, override at the vitest config too.
-2. **`assertFeatureConformance`** — at app boot, reads the manifest's bands and asserts the produced lcov meets them. _(Boot wiring lands in the next story.)_
-3. **`pnpm coverage:diff`** — uses the bands for per-path expectations against the merged lcov.
+2. **`pnpm coverage:diff`** — uses the bands for per-path expectations against the merged lcov.
+
+(A third reader, a boot-time `assertFeatureConformance` coverage check, was specified in the PRD and explicitly deferred per ADR-020 — when both readers above derive from the same manifest, the drift it was supposed to catch is mechanically impossible. The manifest's `coverage:` field remains the declarative source of truth regardless of how many readers consume it.)
 
 **Edit the manifest. The other readers pick up the change.**
 
diff --git a/docs/work/_state.json b/docs/work/_state.json
index 46f6e94..a6cb001 100644
--- a/docs/work/_state.json
+++ b/docs/work/_state.json
@@ -1,5 +1,5 @@
 {
-  "updated_at": "2026-05-13T11:37:22.767Z",
+  "updated_at": "2026-05-13T14:47:21.408Z",
   "epics": {
     "template-reset-v1": {
       "status": "done",
diff --git a/docs/work/prds/2026-05-13-coverage-architecture.prd.md b/docs/work/prds/2026-05-13-coverage-architecture.prd.md
index 5a27378..ba812bb 100644
--- a/docs/work/prds/2026-05-13-coverage-architecture.prd.md
+++ b/docs/work/prds/2026-05-13-coverage-architecture.prd.md
@@ -2,10 +2,22 @@
 id: 2026-05-13-coverage-architecture
 title: Agent-first coverage architecture (4 layers + manifest-driven thresholds)
 type: prd
-status: draft
+status: shipped
 author: danijel
 elicitation-session: brainstorm-2026-05-13
 created: 2026-05-13
+shipped: 2026-05-13
+shipping-commits:
+  - 7eb783a (PRD)
+  - 4dce1df (ADR-020 + glossary + hook)
+  - f7baa8b (manifest schema + helper + auth)
+  - 412d994 (L1 coverage:diff)
+  - bd5a077 (L2 coverage:aggregate)
+  - 39e33eb (CI integration)
+  - 15db9c4 (helper rollout blog + marketing-pages)
+  - f4254aa (cookbook guide + generator)
+  - 6428f10 (L3 Stryker mutation)
+  - bf0b049 (L0 unification — all 5 features green)
 ---
 
 ## Problem