Convention shift: epic folders + PRD filenames + frontmatter id
fields are now bare slugs. The created: timestamp (Phase 2) carries
the date; folder names don't repeat it. A future <task-id>-<slug>
shape (e.g. ClickUp) lands cleanly when that integration ships.
Renames (git mv preserves history):
- docs/work/2026-05-13-binder-wrap-helper/
-> docs/work/binder-wrap-helper/
- docs/work/2026-05-14-library-evaluation-policy/
-> docs/work/library-evaluation-policy/
- docs/work/2026-05-14-ci-security-and-supply-chain/
-> docs/work/ci-security-and-supply-chain/
- docs/work/prds/2026-05-13-binder-wrap-helper.prd.md
-> docs/work/prds/binder-wrap-helper.prd.md
- docs/work/prds/2026-05-13-coverage-architecture.prd.md
-> docs/work/prds/coverage-architecture.prd.md
- docs/work/prds/2026-05-14-library-evaluation-policy.prd.md
-> docs/work/prds/library-evaluation-policy.prd.md
- docs/work/prds/2026-05-14-ci-security-and-supply-chain.prd.md
-> docs/work/prds/ci-security-and-supply-chain.prd.md
Frontmatter updates inside the renamed files: epic id, epic prd,
story epic, PRD id, PRD builds-on all drop date prefixes.
System folder + state file move:
- New docs/work/_system/ holds framework-managed state.
- docs/work/_state.json -> docs/work/_system/_state.json.
- state-builder.mjs adds _system to SKIP_FOLDERS.
- cli.mjs + state-sync-guard.mjs + .husky/pre-commit point at the
new path.
template-reset-v1 epic deleted entirely (one-off cleanup epic from
the pre-date-convention era; status was already done).
Generator-template updates (so new artifacts ship in the right
shape):
- .sandcastle/decomposer.prompt.md emits bare-slug folder names +
ISO created: timestamp.
- .claude/skills/to-prd/SKILL.md template uses bare-slug filename +
bare-slug id field + ISO created: timestamp.
Doc reference updates: glossary, runbook, agent-first-workflow-
and-conformance, reviewer prompt, ADR-020, ADR-022, ADR-023 all
point at the new paths/slugs.
Instructs the reviewer agent to inspect Socket critical findings and
CodeQL error-severity findings via gh run view before issuing a verdict.
Composes with the existing library-trace check — all three must pass for
approval.
Adds `--staged-against <base>` CLI flag to `check.mjs` so the reviewer
agent can compare `git diff <base>...HEAD` instead of the git index.
This gives the sandcastle reviewer a CI-compatible code path that works
in its clean sandbox where `git diff --cached` may be empty.
Appends a "Library-trace check" section to `.sandcastle/reviewer.prompt.md`
instructing the reviewer to run the command before issuing a verdict.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sandcastle re-invokes agents up to maxIterations even when the work is
already done — the decomposer was looping 4x re-writing the same epic
on every dispatch. Two halves to the fix:
- Pass completionSignal: "<promise>COMPLETE</promise>" explicitly on
all three run() calls (decompose, implementer, reviewer). Makes the
contract visible alongside maxIterations instead of relying on
sandcastle's default.
- Append a "Signal completion (required)" section to each prompt
telling the agent to emit the literal marker as its final line when
the work is genuinely done, plus a "do NOT emit if..." list to
discourage premature signaling.
The user surfaced that the binder-wrap-helper epic's stories
decomposed into horizontal sub-steps (read 3 files → write helper
→ write test → export → typecheck → coverage), not vertical
slices. Per the glossary's slice = task = PR = commit rule, every
checkbox should land as one green commit.
.sandcastle/decomposer.prompt.md:
- New "The slice rule (non-negotiable)" section near the top
defining the three constraints every task must satisfy: one
green commit; exercises a layer; independently meaningful.
- New "Tasks that are FORBIDDEN" list naming the anti-patterns
the previous output exhibited (read a file as a task; write
test without impl; standalone gate runs; standalone export;
sub-step decomposition of a single slice).
- New "Tasks that are CORRECT" list with examples drawn from
this codebase (gen invocation, full use-case slice, per-feature
binder migration, audit emission, bindAll wiring).
- New paragraph on "Manifest-first ordering INSIDE a task" —
the 4-step ordering (manifest → contracts → red test → green
impl) is what the implementer does within one task, not a
multi-checkbox decomposition.
- Constraints section gains two new bullets:
* Prefer FEWER but FATTER tasks (one per vertical slice)
over MANY thinner sub-steps
* Self-check: imagine the commit each checkbox produces;
do all gates pass on that commit alone?
.sandcastle/reviewer.prompt.md:
- New check #8 "Slice discipline" rejecting:
* Multi-commit diffs where any intermediate commit has red
gates
* Sub-step shape that should have been separate tasks
* Incomplete slices (use case w/o DI binding, manifest
publish w/o publish site, controller w/o router wiring)
.gitignore: adds `.pnpm-store/` so a misconfigured pnpm install
that places the store inside the project doesn't stage thousands
of cache files.
The existing binder-wrap-helper stories were decomposed under the
old (unconstrained) prompt and need re-decomposing under the new
rule. That's a separate action — this commit fixes the prompts;
the existing epic stays as-is until you re-decompose.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two separate sandbox blockers surfaced when the user tried
`pnpm work decompose --execute`:
1. **Container died on exec** — our Dockerfile had:
- WORKDIR /workspace + CMD ["bash"]
- No `agent` user (sandcastle exec's as UID:GID it built with)
- node:22-bookworm-slim (missing some build deps the install
script wants)
Sandcastle expects:
- A non-root `agent` user with home at /home/agent (sandcastle
does `git config --global --add safe.directory /home/agent/workspace`,
which fails if the user doesn't exist or the container exited)
- ENTRYPOINT ["sleep", "infinity"] so the container survives
the gap between sandcastle creating it and exec'ing in
Replaced .sandcastle/Dockerfile with the shape `sandcastle init`
would generate (verified against
node_modules/@ai-hero/sandcastle/dist/InitService.js):
- node:22-bookworm (full, not slim) for build tooling
- apt-get installs git + curl + jq
- corepack-pinned pnpm@9
- ARG AGENT_UID=1000 + AGENT_GID=1000; sandcastle's
build-image passes the host's UID/GID by default
- `groupmod -o -g $AGENT_GID node` + `usermod -o ... node` —
the `-o` (non-unique) flag is required because macOS hosts
have UID:501 GID:20, and GID 20 collides with Debian's
`dialout` group in the base image (without -o, groupmod
fails with "GID '20' already exists")
- USER ${AGENT_UID}:${AGENT_GID}, then install Claude Code CLI
via the official installer
- ENV PATH includes /home/agent/.local/bin
- WORKDIR /home/agent (sandcastle overrides per-run anyway)
- ENTRYPOINT ["sleep", "infinity"] keeps the container alive
2. **"Not logged in · Please run /login"** inside the container —
Claude Code on macOS stores credentials in the Keychain, NOT in
~/.claude/.credentials.json. Sandcastle's bind-mount of ~/.claude
finds nothing usable. Documented the workaround:
- README.md "Sandcastle setup (one-time)" — macOS-specific
block with the `security find-generic-password ... > ~/.claude/.credentials.json`
one-liner + chmod 600 + the security trade-off (plaintext
file vs keychain isolation)
- docs/guides/runbook.md "Using Sandcastle → Prerequisites" —
step 3 (Authentication) gets a "macOS quirk" subsection with
the same extraction one-liner + the API-key fallback as the
alternative path
- scripts/work/{dispatch,decompose}.mjs — when the sandcastle
error matches /Not logged in|Please run \/login/ AND we're on
darwin, the dispatcher prints the keychain-extraction
commands + the API-key fallback inline above the generic
"See runbook" line, so future agents discover the fix at the
failure site
The image rebuilds clean (`pnpm exec sandcastle docker
build-image`) at ~1.95GB and the container survives sandcastle's
exec — confirmed by reaching the "Not logged in" stage (which is
the next-layer issue, not the Dockerfile issue).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the gap the user hit running `pnpm work decompose --execute`:
sandcastle errored with `Image 'sandcastle:template-vertical' not
found locally. Build it first with 'sandcastle docker build-image'`,
but neither the README nor the runbook documented this step.
README.md: new "Sandcastle setup (one-time)" section after Quick
reference. Three commands (docker info, build-image, auth) — the
minimum needed to make dispatch work. Links to the runbook for the
full lifecycle.
docs/guides/runbook.md: Prerequisites in "Using Sandcastle" grow
from 4 to 5 items. New step 2 walks through `sandcastle docker
build-image`, quotes the exact "Image not found locally" error so
agents searching for the string land on the fix, and shows the
remove-image + rebuild flow for Dockerfile edits.
.sandcastle/README.md: new "Build the sandbox image (one-time)"
section parallel to the env section, cross-linking to the runbook.
scripts/work/decompose.mjs + scripts/work/dispatch.mjs: when the
sandcastle error message matches the "Image '.+' not found locally"
pattern, the dispatcher now prints the build-image command inline
above the generic "See runbook" line. The error stack itself remains
unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the PRD-lifecycle gap surfaced by the user: when sandcastle
finishes an epic's last task, the seed PRD should auto-flip from
approved -> shipped. Builds the mechanism, wires it into the work
CLI + state index + reviewer prompt + docs.
scripts/work/prd-ship.mjs (new):
- parseFrontmatter / serializeFrontmatter — minimal YAML-ish parser
sufficient for PRD frontmatter (scalar + list shapes)
- flipPrdStatus — pure function: takes PRD text, returns new text
with status=shipped + shipped=<date> + optional shipping-commits.
Refuses to flip draft, idempotent fail-soft on already-shipped,
rejects unexpected statuses
- deriveShippingCommits — best-effort git log of the linked epic
folder for the --auto-commits flag
- findPrdPath — id -> path lookup under docs/work/prds/
- runCli — wiring for `pnpm work prd-ship <id> [--commits|--auto-commits]`
scripts/work/prd-ship.test.mjs (new, 17 tests):
- Frontmatter parser handles scalars + lists + missing frontmatter
- flipPrdStatus covers all transitions + refusals + body/key preservation
- findPrdPath + serializeFrontmatter coverage
scripts/work/state-builder.mjs:
- Epic entries gain a `prd` field
- New computeNeedsPrdShip surfaces epics done with PRD status not yet
shipped: state.needs_prd_ship[] with action commands
scripts/work/cli.mjs:
- New subcommand `pnpm work prd-ship <id>`
.sandcastle/reviewer.prompt.md:
- "Epic close-out: PRD status flip" section instructing reviewer to
check _state.json.needs_prd_ship and run the suggested action
- JSON output extends with prd_shipped: "<id>" | null
docs/work/README.md:
- "PRD lifecycle" section documenting the 4 statuses + auto-flip
Future PRDs follow the lifecycle automatically: decomposer refuses
draft, human flips to approved, sandcastle ships the epic, reviewer
runs prd-ship on the final task, PRD lands as shipped with its
commit trail.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the staleness gap after the 10-commit coverage epic shipped.
Doc sync (item 1 from the user's choice):
- CLAUDE.md Quick Start: adds pnpm coverage:aggregate / coverage:diff
/ mutate to the command listing
- CLAUDE.md: new "Sibling architecture: coverage (ADR-020)" section
after the conformance gate table — captures the 4-layer table +
points at docs/guides/coverage.md + ADR-020 + says agents must run
coverage:diff before reporting complete
- AGENTS.md preamble: now lists coverage as a parallel multi-latency
quality system alongside conformance, with the same gate / latency
framing
- PRD frontmatter: status draft -> shipped + shipped date +
shipping-commits list (all 10 SHAs anchoring the trace)
- PRD findings table: each row gets a Resolution column citing the
commit that closed it; conclusion text updated to past tense
- ADR-020 implementation phasing: rewritten as a status table with
each step linked to the commit that shipped it + Boot-time
assertFeatureConformance explicitly marked Deferred with rationale
- docs/guides/coverage.md: removed "Boot wiring lands in the next
story" line; replaced with the deferral rationale + clarified
that two readers (vitest, coverage:diff) consume the manifest
Sandcastle prompts (item 2 from the user's choice):
- .sandcastle/implementer.prompt.md: new "Coverage gates" section
after the conformance-gates list, requiring `pnpm test --coverage`,
`pnpm coverage:aggregate`, and `pnpm coverage:diff` to all pass
before reporting `complete`. Machine-readable JSON shape of
coverage:diff documented (status / uncovered[] / kind enum), with
explicit instructions on how to interpret each kind. Allowlist
expansion requires justification + test.
- .sandcastle/reviewer.prompt.md: AC coverage relabeled to "AC
coverage (acceptance criteria, not test coverage)" to disambiguate;
new check #7 "Coverage gates (ADR-020)" requiring CI's
Coverage — diff (L1) step green + per-layer thresholds met +
no silent allowlist expansion + manifest band drift detection.
Effect: future agent runs through sandcastle now treat coverage as a
first-class blocking gate, parallel to conformance. PRs no longer
discover coverage failures only via CI; the implementer is required
to check before reporting done, and the reviewer is required to
verify.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>