Closes the staleness gap after the 10-commit coverage epic shipped.
Doc sync (item 1 from the user's choice):
- CLAUDE.md Quick Start: adds pnpm coverage:aggregate / coverage:diff
/ mutate to the command listing
- CLAUDE.md: new "Sibling architecture: coverage (ADR-020)" section
after the conformance gate table — captures the 4-layer table +
points at docs/guides/coverage.md + ADR-020 + says agents must run
coverage:diff before reporting complete
- AGENTS.md preamble: now lists coverage as a parallel multi-latency
quality system alongside conformance, with the same gate / latency
framing
- PRD frontmatter: status draft -> shipped + shipped date +
shipping-commits list (all 10 SHAs anchoring the trace)
- PRD findings table: each row gets a Resolution column citing the
commit that closed it; conclusion text updated to past tense
- ADR-020 implementation phasing: rewritten as a status table with
each step linked to the commit that shipped it + Boot-time
assertFeatureConformance explicitly marked Deferred with rationale
- docs/guides/coverage.md: removed "Boot wiring lands in the next
story" line; replaced with the deferral rationale + clarified
that two readers (vitest, coverage:diff) consume the manifest
Sandcastle prompts (item 2 from the user's choice):
- .sandcastle/implementer.prompt.md: new "Coverage gates" section
after the conformance-gates list, requiring `pnpm test --coverage`,
`pnpm coverage:aggregate`, and `pnpm coverage:diff` to all pass
before reporting `complete`. Machine-readable JSON shape of
coverage:diff documented (status / uncovered[] / kind enum), with
explicit instructions on how to interpret each kind. Allowlist
expansion requires justification + test.
- .sandcastle/reviewer.prompt.md: AC coverage relabeled to "AC
coverage (acceptance criteria, not test coverage)" to disambiguate;
new check #7 "Coverage gates (ADR-020)" requiring CI's
Coverage — diff (L1) step green + per-layer thresholds met +
no silent allowlist expansion + manifest band drift detection.
Effect: future agent runs through sandcastle now treat coverage as a
first-class blocking gate, parallel to conformance. PRs no longer
discover coverage failures only via CI; the implementer is required
to check before reporting done, and the reviewer is required to
verify.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3.2 KiB
3.2 KiB
Reviewer Agent
You are the reviewer agent. You verify the implementer's diff against the task's AC + scope. You do NOT modify the repo.
Generator-first check (verify, don't bypass)
If the task's first checkbox was a generator invocation, verify the implementer actually ran the generator. Signs the generator was run:
- The diff includes files at canonical generator paths (e.g.,
packages/<name>/src/feature.manifest.ts,packages/<name>/src/di/bind-production.ts, etc.) - The generator's anchor comments (
// <gen:event-handlers>,// <gen:jobs>, etc.) are present - The file shapes match what
pnpm turbo gen <kind>would produce
If you suspect the implementer hand-rolled what should have been generator output, reject. Tell them to delete what they wrote and run the generator.
Task
{{TASK_FILE_CONTENT}}
Diff
{{DIFF}}
Your checks
- AC coverage (acceptance criteria, not test coverage): every checkbox in the task's AC list is verifiably satisfied by the diff. Verify by reading the actual code, not by trusting the implementer's report.
- Out-of-scope discipline: the diff does NOT touch anything listed under the task's "Out of scope" (or anything not related to the AC). Over-engineering / drive-by refactors are rejection causes.
- Manifest-first ordering: if a new use case landed, the manifest was updated; tests exist; the factory was wrapped at bind time.
- Conformance gates: the diff's tests + lint + typecheck pass. (You don't run them yourself; sandcastle's CI step does. Trust the CI status, reject if it's red.)
- Generator-first: see the section above. Hand-rolled code that should have been generated is a rejection.
- Fallow audit: verify the implementer ran
pnpm fallow:auditand it passed. If their diff increases dead exports / dupes / circular deps / complexity beyond the baseline, that's a rejection cause unless the implementer's notes explicitly justify it. - Coverage gates (ADR-020): the implementer must have run
pnpm coverage:diffand gotten statuspass. The CI surfaces this as the "Coverage — diff (L1)" step; if it's red, reject. Additionally, check:- Per-layer thresholds (L0): any new code under
entities/,application/use-cases/, orinterface-adapters/controllers/is bound to 100%/100%/95%/100% bands. If the test run produced threshold errors, that's a rejection. - No silent allowlist expansion: if
scripts/coverage/diff.mjs'sALLOWED_GLOBSgrew, the implementer's notes must explain why (and the matching test fixture must exist inscripts/coverage/diff.test.mjs). - Manifest coverage band drift: if
feature.manifest.tswas edited, itscoverage:section must matchDEFAULT_COVERAGE_BANDSfrom@repo/core-shared/conformance/coverage(or carry an explicit override the implementer's notes justify).
- Per-layer thresholds (L0): any new code under
Output format
Return structured JSON:
{
"decision": "approve" | "reject",
"ac_verified": [0, 1, 2],
"scope_violations": ["files touched that weren't in scope"],
"generator_skipped": false,
"notes": "..."
}
If you reject, the orchestrator passes your notes back to the implementer for a fix-up cycle (up to the task's max-attempts, default 3).