The previous layout placed epic folders directly under docs/work/
alongside prds/ and _system/. Tightening: epics now live in their
own docs/work/epics/ subfolder, peer to prds/ and _system/. Same
shape as the existing prds/ bucket.
Final docs/work/ layout:
README.md
prds/<slug>.prd.md
_system/_state.json
epics/<slug>/_epic.md + <story-folder>/_story.md
Renames (git mv preserves history):
- docs/work/binder-wrap-helper/
-> docs/work/epics/binder-wrap-helper/
- docs/work/library-evaluation-policy/
-> docs/work/epics/library-evaluation-policy/
- docs/work/ci-security-and-supply-chain/
-> docs/work/epics/ci-security-and-supply-chain/
Tooling updates:
- state-builder.mjs walks workRoot/epics/ directly; SKIP_FOLDERS
obsoleted (no more sibling folders to filter out).
- dispatch.mjs's findNextTask, tickStoryBulletInEpic, and
flipEpicDoneIfAllStoriesDone all build their paths through the
"epics" segment.
- prd-ship.mjs's deriveShippingCommits walks workRoot/epics/ and
git-logs docs/work/epics/<epic>/.
- decomposer.prompt.md emits epics under docs/work/epics/<epic-id>/.
- handoff + grill-with-docs glossary references updated.
- Glossary entry for Epic updated.
Reserved future shape: when a task-tracker integration (ClickUp,
Linear) ships, the epics/ subfolder hosts <task-id>-<slug>/
folders. Today it just hosts bare slugs.
Convention shift: epic folders + PRD filenames + frontmatter id
fields are now bare slugs. The created: timestamp (Phase 2) carries
the date; folder names don't repeat it. A future <task-id>-<slug>
shape (e.g. ClickUp) lands cleanly when that integration ships.
Renames (git mv preserves history):
- docs/work/2026-05-13-binder-wrap-helper/
-> docs/work/binder-wrap-helper/
- docs/work/2026-05-14-library-evaluation-policy/
-> docs/work/library-evaluation-policy/
- docs/work/2026-05-14-ci-security-and-supply-chain/
-> docs/work/ci-security-and-supply-chain/
- docs/work/prds/2026-05-13-binder-wrap-helper.prd.md
-> docs/work/prds/binder-wrap-helper.prd.md
- docs/work/prds/2026-05-13-coverage-architecture.prd.md
-> docs/work/prds/coverage-architecture.prd.md
- docs/work/prds/2026-05-14-library-evaluation-policy.prd.md
-> docs/work/prds/library-evaluation-policy.prd.md
- docs/work/prds/2026-05-14-ci-security-and-supply-chain.prd.md
-> docs/work/prds/ci-security-and-supply-chain.prd.md
Frontmatter updates inside the renamed files: epic id, epic prd,
story epic, PRD id, PRD builds-on all drop date prefixes.
System folder + state file move:
- New docs/work/_system/ holds framework-managed state.
- docs/work/_state.json -> docs/work/_system/_state.json.
- state-builder.mjs adds _system to SKIP_FOLDERS.
- cli.mjs + state-sync-guard.mjs + .husky/pre-commit point at the
new path.
template-reset-v1 epic deleted entirely (one-off cleanup epic from
the pre-date-convention era; status was already done).
Generator-template updates (so new artifacts ship in the right
shape):
- .sandcastle/decomposer.prompt.md emits bare-slug folder names +
ISO created: timestamp.
- .claude/skills/to-prd/SKILL.md template uses bare-slug filename +
bare-slug id field + ISO created: timestamp.
Doc reference updates: glossary, runbook, agent-first-workflow-
and-conformance, reviewer prompt, ADR-020, ADR-022, ADR-023 all
point at the new paths/slugs.
- New scripts/work/bump-updated-timestamps.mjs stamps the `updated:`
frontmatter field to the current ISO 8601 UTC timestamp on every
staged docs/work/**/*.md file. Idempotent; adds the field after
`created:` if missing.
- .husky/pre-commit invokes the bump script as step 2 (before
rebuild-state) so _state.json sees the fresh timestamp.
- Backfill all existing work docs (4 PRDs + 3 epics + 21 stories):
* created: promoted from `YYYY-MM-DD` -> ISO timestamp using
git log --diff-filter=A on each file (first-commit date for
stories that had no `created:` line, midnight UTC for PRDs
and epics that had date-only created).
* updated: added from `git log -1 --format=%aI` on each file
(last-commit timestamp); will be re-stamped to "now" by the
pre-commit hook on this commit.
Stories that had no `created:` line now get one.
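A minimal sketch of the stamping step described above, assuming the frontmatter is a leading `---` block with one `key: value` per line. `stampUpdated` is a hypothetical name; the real bump-updated-timestamps.mjs may be structured differently.

```javascript
// Hypothetical helper: set `updated:` inside a leading `---` frontmatter block.
// Assumes one `key: value` pair per line; not the script's actual code.
function stampUpdated(text, nowIso) {
  const match = /^---\n([\s\S]*?)\n---/.exec(text);
  if (!match) return text; // no frontmatter: leave the file untouched

  const lines = match[1].split("\n");
  const stamped = `updated: ${nowIso}`;
  const updatedIdx = lines.findIndex((line) => line.startsWith("updated:"));

  if (updatedIdx !== -1) {
    lines[updatedIdx] = stamped; // overwrite the existing stamp
  } else {
    // Add the field right after `created:`; fall back to the end of the block.
    const createdIdx = lines.findIndex((line) => line.startsWith("created:"));
    lines.splice(createdIdx === -1 ? lines.length : createdIdx + 1, 0, stamped);
  }

  // The match is anchored at index 0, so the body is everything after it.
  return `---\n${lines.join("\n")}\n---` + text.slice(match[0].length);
}
```

Running it twice with the same timestamp returns identical text, which is the idempotency property the commit relies on.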
- Add tickStoryBulletInEpic(workRoot, epic, story) helper that finds
the bullet in the parent epic's `## Stories` section linking to the
given story folder and flips `- [ ]` to `- [x]`. Idempotent.
- applyApprovedState now ticks the parent epic bullet whenever a
story flips to status: done (alongside the existing per-task tick
and epic-status flip). The epic file is staged when either the
bullet is ticked or the status flips, not only on a status flip.
- Backfill all 3 existing epics: 21 bullets ticked to match their
already-done story statuses (binder-wrap-helper x3, library-
evaluation-policy x9, ci-security-and-supply-chain x9).
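The bullet flip can be sketched as a pure text transform. This version works on the epic text rather than on (workRoot, epic, story), and it assumes each story bullet mentions its folder as `<story-folder>/`; both are simplifications, and `tickStoryBullet` is a hypothetical name.

```javascript
// Hypothetical pure variant of the tick helper: flips `- [ ]` to `- [x]` for
// the bullet under `## Stories` that references the given story folder.
function tickStoryBullet(epicText, storyFolder) {
  let inStories = false;
  let ticked = false;

  const lines = epicText.split("\n").map((line) => {
    // Any level-2 heading opens or closes the Stories section.
    if (line.startsWith("## ")) inStories = line.trim() === "## Stories";

    if (inStories && line.startsWith("- [ ]") && line.includes(`${storyFolder}/`)) {
      ticked = true;
      return line.replace("- [ ]", "- [x]");
    }
    return line;
  });

  return { text: lines.join("\n"), ticked };
}
```

Returning a `ticked` flag lets the caller decide whether the epic file needs staging; a second call on already-ticked text reports `ticked: false` and leaves the text unchanged.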
Sandcastle rejects `resumeSession` when `maxIterations > 1` with
"Resume applies to iteration 1 only; multi-iteration resume
semantics are not supported." Since a TDD slice needs the full
30-iteration budget, the session-resume path we shipped in d5c0120
is dead infrastructure that breaks dispatch mid-run.
Rip it out cleanly:
- runOneSlice drops the resumeSession param + the
context-exhaustion safety net + sessionId/usage return fields
- executeDispatch drops the currentStory/currentSession bookkeeping
and the token-reset threshold
- helpers totalInputTokens + isContextExhaustedError go (only used
by the resume path)
- SANDCASTLE_SESSION_TOKEN_RESET removed from .env.example
Net: -153 lines. Each slice is again an independent sandcastle
session; token cost per slice goes up (each implementer
re-discovers context) but the multi-iteration TDD shape works.
A different cross-slice context-passing mechanism (e.g. a
story-level context summary injected into each task spec) is left
as future work.
Wires sandcastle's native `resumeSession` into the dispatch loop so
the implementer walks into task N already knowing what task N-1
discovered — repo layout, helper signatures, gate output, prior diff.
No scratchpad / no hand-curated context file; the agent's own Claude
Code conversation log is the carrier.
Three guardrails keep it bounded:
- Story boundary reset. `currentSession` is dropped whenever
findNextTask returns a different story id. New domain ≈ new
context — keeps story 03 from inheriting story 02's residue.
- Token-threshold reset. After each approved slice, sum the
implementer's last-iteration usage (inputTokens +
cacheCreationInputTokens + cacheReadInputTokens — caching saves
dollars but doesn't free window space). If above
SANDCASTLE_SESSION_TOKEN_RESET (default 140000 ≈ 70% of Sonnet
4.6's 200k), drop the session before the next task. Configurable
via env.
- Context-exhausted safety net. If the model rejects with
"prompt is too long" / "context_length_exceeded" / similar, the
retry loop drops the session and re-runs the attempt fresh
exactly once. Doesn't count against SANDCASTLE_MAX_ATTEMPTS
(different failure mode).
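The first two guardrails reduce to one carry-or-reset decision. A rough sketch, assuming the usage object carries the three counters named above; `shouldCarrySession` and its parameter shape are illustrative, not the dispatcher's actual signature.

```javascript
// Default taken from the commit text; normally read from
// SANDCASTLE_SESSION_TOKEN_RESET.
const DEFAULT_TOKEN_RESET = 140000;

// Cached tokens still occupy the context window, so all three are summed.
function totalInputTokens(usage) {
  return (
    (usage.inputTokens ?? 0) +
    (usage.cacheCreationInputTokens ?? 0) +
    (usage.cacheReadInputTokens ?? 0)
  );
}

// Hypothetical decision helper: carry the session only within one story and
// only while the last iteration stayed at or below the threshold.
function shouldCarrySession({ prevStoryId, nextStoryId, usage, threshold = DEFAULT_TOKEN_RESET }) {
  if (prevStoryId !== nextStoryId) return false; // story boundary reset
  return totalInputTokens(usage) <= threshold; // token-threshold reset
}
```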
Reviewer always runs fresh — each approve/reject decision should be
independent of prior tasks to keep the gate honest. Within a single
slice's reject-fixup retries, the implementer also carries forward
across attempts (so attempt 2 sees attempt 1's reasoning + the
reviewer notes), but that's per-slice cumulative, not cross-slice.
runOneSlice now returns { sessionId, usage } so executeDispatch can
make the carry-or-reset decision per slice.
Previously the orchestrator ran exactly one implementer + reviewer pair,
printed "(Automatic state mutation by the orchestrator is v2.)", and
exited — the human had to tick the bullet, flip story status, rebuild
state, and re-invoke for every slice. V2 closes the loop:
- Parses the JSON the implementer + reviewer prompts ask the agents to
emit (`parseAgentJson` — tolerates both ```json fenced and bare
trailing { ... } shapes). The reviewer's `decision` and the
implementer's `status` are the orchestrator's discriminators.
- On approve: ticks the bullet in `_story.md` and writes it back. If
the story now has zero unchecked bullets, flips its frontmatter
`status: in-progress → done`; if all sibling stories are also done,
flips the epic's frontmatter the same way. Commits the mutation on
the host as a separate `chore(work): tick/finish ...` commit so the
implementer's slice commit stays clean. `_state.json` regenerates
via the existing pre-commit `rebuild-state` hook.
- On reject: re-dispatches the implementer with the reviewer's notes
appended to TASK_FILE_CONTENT, bounded by SANDCASTLE_MAX_ATTEMPTS
(default 3). On the (max+1)th reject the loop exits 1 with the last
notes printed.
- After every approved slice, calls findNextTask again and dispatches
the next ready bullet — including across story boundaries (the
state-builder treats any non-done story with satisfied deps as
ready, so flipping story 01 to done unblocks story 02 automatically).
- Flags: `--once` (legacy single-slice behavior) and `--max-tasks N`
bound the loop. Default is unlimited — matches the
continuous-execution preference.
Auth/sandbox setup is now pulled out of the per-iteration path so the
loop reuses one sandbox across slices.
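One way the tolerant parse could work, as a sketch: prefer the last fenced json block, otherwise scan backwards for the widest trailing object that parses. The scanning strategy and the `null` return on failure are assumptions; the shipped `parseAgentJson` may differ.

```javascript
// Build the fence marker programmatically so this sketch can itself live
// inside a fenced block.
const FENCE = "`".repeat(3);
const FENCED_JSON = new RegExp(FENCE + "json\\s*\\n([\\s\\S]*?)" + FENCE, "g");

// Sketch only: returns the parsed object, or null when nothing parses.
function parseAgentJson(output) {
  const blocks = [...output.matchAll(FENCED_JSON)];
  if (blocks.length > 0) {
    try {
      return JSON.parse(blocks[blocks.length - 1][1]);
    } catch {
      // Malformed fenced block: fall through to the bare-object scan.
    }
  }

  // Bare trailing { ... }: fix the end at the last "}" and walk the start
  // leftwards until the slice parses (handles nested objects).
  const end = output.lastIndexOf("}");
  let start = end === -1 ? -1 : output.lastIndexOf("{", end);
  while (start !== -1) {
    try {
      return JSON.parse(output.slice(start, end + 1));
    } catch {
      start = start === 0 ? -1 : output.lastIndexOf("{", start - 1);
    }
  }
  return null;
}
```

The orchestrator would then branch on `result.decision` (reviewer) or `result.status` (implementer), treating `null` as an unparseable response.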
Sandcastle re-invokes agents up to maxIterations even when the work is
already done — the decomposer was looping 4x re-writing the same epic
on every dispatch. Two halves to the fix:
- Pass completionSignal: "<promise>COMPLETE</promise>" explicitly on
all three run() calls (decompose, implementer, reviewer). Makes the
contract visible alongside maxIterations instead of relying on
sandcastle's default.
- Append a "Signal completion (required)" section to each prompt
telling the agent to emit the literal marker as its final line when
the work is genuinely done, plus a "do NOT emit if..." list to
discourage premature signaling.
Sandcastle's default maxIterations: 1 cut every agent off after its
first response, so files written inside the sandbox never made it
into a captured commit. The decomposer wrote 9 epic + story files,
hit the limit, and sandcastle returned 0 commits — the host saw
nothing.
decompose.mjs: maxIterations 10 (small authoring task — read
context, write files, commit). Override via env
SANDCASTLE_DECOMPOSE_ITERATIONS.
dispatch.mjs:
- Implementer: maxIterations 30 (full TDD slice — read context,
red test, green impl, run all five gates, commit). Override via
SANDCASTLE_IMPLEMENTER_ITERATIONS.
- Reviewer: maxIterations 10 (read diff + task spec, decide).
Override via SANDCASTLE_REVIEWER_ITERATIONS.
Each call site documents WHY the value was picked + names the env
override inline so tuning is discoverable from the code.
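A small sketch of the default-plus-env-override pattern described above. `readIterations` is a hypothetical helper; the env variable names and defaults are the ones listed in this commit, and the fallback on invalid values is an assumption.

```javascript
// Hypothetical helper: read a positive integer from the environment, falling
// back to the documented default when the variable is unset or invalid.
function readIterations(env, name, fallback) {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const parsed = Number.parseInt(raw, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Defaults from the commit text, keyed by their env override.
const ITERATION_DEFAULTS = {
  SANDCASTLE_DECOMPOSE_ITERATIONS: 10,
  SANDCASTLE_IMPLEMENTER_ITERATIONS: 30,
  SANDCASTLE_REVIEWER_ITERATIONS: 10,
};

// Resolve every budget in one pass; pass process.env in real use.
function resolveIterations(env) {
  return Object.fromEntries(
    Object.entries(ITERATION_DEFAULTS).map(([name, fallback]) => [
      name,
      readIterations(env, name, fallback),
    ]),
  );
}
```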
Two stale-comment fixes surfaced after the dispatch handoff fix:
cli.mjs: the top-of-file JSDoc listed only 3 of 8 subcommands
(rebuild-state, status, next) and missed ready / blocked /
dispatch / decompose / prd-ship. Rewrote the header to describe
all 8 subcommands + their flags + the explicit-runCli routing
pattern that replaces the older side-effect-on-import approach
(established when the dispatch handoff broke and got fixed in
bb643b8).
prd-ship.mjs: the JSDoc claimed allowed transitions were
"<approved|in-review|draft> -> shipped", but the code refuses
draft (throws "still draft — flip to approved (human review)
before shipping"). Corrected the doc to "<approved|in-review>
-> shipped" + clarified that draft -> approved is the human step
deliberately kept outside the command's scope.
No behaviour change — comments only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cli.mjs's `dispatch` branch called `import("./dispatch.mjs")` and
relied on dispatch.mjs's top-level CLI block running as a side
effect of the import. The earlier guard added to dispatch.mjs (to
stop the CLI firing when sibling work scripts import
`resolveClaudeAuth`) also stopped this legit handoff — so
`pnpm work dispatch` silently exited with no output.
Fix: explicit CLI entry function, called by name. Same pattern
already in use for prd-ship + decompose.
dispatch.mjs:
- Wraps the args parsing + print/execute branch in `export async
function runCli(args)`
- The invokedDirectly guard now wraps `runCli(process.argv.slice(2))`
so direct-invocation (`node scripts/work/dispatch.mjs ...`) still
works
cli.mjs:
- Imports runCli as runDispatch
- The `cmd === "dispatch"` branch calls runDispatch(args) directly
with a .catch attached (instead of import("./dispatch.mjs"))
Verified: `pnpm work dispatch` now correctly prints the dispatch
plan for the first ready task (`binder-wrap-helper /
01-wire-use-case-helper`'s first bullet); decompose tests stay 9/9.
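The explicit-entry pattern, sketched without module syntax so it stands alone. In the real files `runCli` is exported and the guard receives `import.meta.url` and `process.argv[1]`; the body of `runCli` here is a placeholder, and `main` is a hypothetical wrapper.

```javascript
// Same comparison the guard uses; it may need normalising for paths with
// spaces or on Windows (an assumption, not verified here).
function isInvokedDirectly(metaUrl, argv1) {
  return metaUrl === `file://${argv1}`;
}

// Placeholder body: the real runCli parses args and prints or executes.
async function runCli(args, log = console.log) {
  log(`dispatch called with: ${args.join(" ") || "(no args)"}`);
  return 0;
}

// Hypothetical wrapper showing both entry paths:
// - direct invocation runs the CLI,
// - a plain import stays silent, and cli.mjs calls runCli(args) by name.
async function main(metaUrl, argv) {
  if (!isInvokedDirectly(metaUrl, argv[1])) return null;
  return runCli(argv.slice(2));
}
```

Because the handoff is a named call rather than an import side effect, tightening the guard can no longer silence `pnpm work dispatch`.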
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The dispatch.mjs + decompose.mjs error handlers grew an
image-not-found hint in cd0a332, but the macOS keychain hint that
the earlier commit's message also claimed was never actually applied
(the Edit tool required re-reading those files post-commit).
This commit applies the keychain hint to both error handlers: when
the sandcastle error matches /Not logged in|Please run \/login/ AND
process.platform === "darwin", the dispatcher prints the
`security find-generic-password ... > ~/.claude/.credentials.json`
one-liner + chmod 600 + the API-key fallback inline above the
generic "See runbook" line.
Now future agents hitting this on macOS see the fix at the failure
site, not just in docs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the gap the user hit running `pnpm work decompose --execute`:
sandcastle errored with `Image 'sandcastle:template-vertical' not
found locally. Build it first with 'sandcastle docker build-image'`,
but neither the README nor the runbook documented this step.
README.md: new "Sandcastle setup (one-time)" section after Quick
reference. Three commands (docker info, build-image, auth) — the
minimum needed to make dispatch work. Links to the runbook for the
full lifecycle.
docs/guides/runbook.md: Prerequisites in "Using Sandcastle" grow
from 4 to 5 items. New step 2 walks through `sandcastle docker
build-image`, quotes the exact "Image not found locally" error so
agents searching for the string land on the fix, and shows the
remove-image + rebuild flow for Dockerfile edits.
.sandcastle/README.md: new "Build the sandbox image (one-time)"
section parallel to the env section, cross-linking to the runbook.
scripts/work/decompose.mjs + scripts/work/dispatch.mjs: when the
sandcastle error message matches the "Image '.+' not found locally"
pattern, the dispatcher now prints the build-image command inline
above the generic "See runbook" line. The error stack itself remains
unchanged.
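The inline hint amounts to a pattern match on the error message. A minimal sketch, with `hintForSandcastleError` as a hypothetical name and the hint wording invented for illustration; only the pattern and the build command come from this commit.

```javascript
// Pattern quoted in the commit text.
const IMAGE_NOT_FOUND = /Image '.+' not found locally/;

// Hypothetical helper: returns a one-line hint to print above the generic
// "See runbook" line, or null when the error is not recognised.
function hintForSandcastleError(message) {
  if (IMAGE_NOT_FOUND.test(message)) {
    return "Build the sandbox image first: sandcastle docker build-image";
  }
  return null;
}
```

Returning `null` for unrecognised errors keeps the existing output path untouched, which matches the "error stack itself remains unchanged" note.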
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the gap surfaced by the user: `pnpm work` usage referenced
`decompose` (via docs + the to-prd skill) but the subcommand was
never built. Mirrors `pnpm work dispatch`'s shape.
scripts/work/decompose.mjs (new):
- validatePrdForDecompose(prdPath) — refuses draft (must go
through human review first), in-review (review incomplete),
shipped (epic already exists); accepts only approved
- printDecomposePlan(prdId, prdPath, frontmatter) — print-mode
output showing the PRD's eligibility + sandcastle invocation
plan + auth modes
- executeDecompose(prdId, prdPath, prdText) — invokes sandcastle
with .sandcastle/decomposer.prompt.md, passing PRD_FILE_CONTENT
promptArg. The decomposer agent writes the epic + per-story
files to disk on a sandcastle branch the human can review
- runCli(args, { workRoot }) — entry point used by cli.mjs
- Direct invocation also supported (mirrors dispatch.mjs's
invokedDirectly guard, NEW pattern after this commit)
scripts/work/decompose.test.mjs (new, 9 tests, all green):
- validatePrdForDecompose: accepts approved; rejects draft,
in-review, shipped, unknown status, missing file
- runCli: writes error + returns 1 on missing PRD; writes error
+ returns 1 on draft PRD; prints plan + returns 0 on approved
scripts/work/cli.mjs:
- Adds `decompose` subcommand to usage + dispatch
- Usage formatting realigned for the 3-line subcommand block
scripts/work/dispatch.mjs:
- **Fix** the bug surfaced by the user: dispatch.mjs's CLI ran
as a top-level side effect whenever any of its exports was
imported. decompose.mjs imports resolveClaudeAuth from it, so
importing decompose.mjs printed "No ready task to dispatch."
Added an `import.meta.url === \`file://${process.argv[1]}\``
guard so the CLI only runs when invoked directly. This unblocks
cross-import without side effects.
Smoke-tested end-to-end:
- `pnpm work decompose` (no id) prints usage + exits 2
- `pnpm work decompose 2026-05-13-binder-wrap-helper` prints the
decompose plan with status: approved (eligible)
- 9/9 unit tests green
- dispatch.mjs's existing direct-invocation path unchanged
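The status gate inside validatePrdForDecompose can be pictured as a lookup. This sketch takes the status string directly instead of a PRD path, and `checkDecomposeStatus` plus the `{ ok, reason }` return shape are assumptions; the refusal reasons are the ones listed above.

```javascript
// Refusal reasons per status, as described in the commit text.
const REFUSALS = new Map([
  ["draft", "must go through human review first"],
  ["in-review", "review incomplete"],
  ["shipped", "epic already exists"],
]);

// Hypothetical status gate: only `approved` is eligible for decomposition.
function checkDecomposeStatus(status) {
  if (status === "approved") return { ok: true };
  const reason = REFUSALS.get(status) ?? `unknown status: ${status}`;
  return { ok: false, reason };
}
```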
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the PRD-lifecycle gap surfaced by the user: when sandcastle
finishes an epic's last task, the seed PRD should auto-flip from
approved -> shipped. Builds the mechanism, wires it into the work
CLI + state index + reviewer prompt + docs.
scripts/work/prd-ship.mjs (new):
- parseFrontmatter / serializeFrontmatter — minimal YAML-ish parser
sufficient for PRD frontmatter (scalar + list shapes)
- flipPrdStatus — pure function: takes PRD text, returns new text
with status=shipped + shipped=<date> + optional shipping-commits.
Refuses to flip draft, fails soft (idempotently) on already-shipped,
and rejects unexpected statuses
- deriveShippingCommits — best-effort git log of the linked epic
folder for the --auto-commits flag
- findPrdPath — id -> path lookup under docs/work/prds/
- runCli — wiring for `pnpm work prd-ship <id> [--commits|--auto-commits]`
scripts/work/prd-ship.test.mjs (new, 17 tests):
- Frontmatter parser handles scalars + lists + missing frontmatter
- flipPrdStatus covers all transitions + refusals + body/key preservation
- findPrdPath + serializeFrontmatter coverage
scripts/work/state-builder.mjs:
- Epic entries gain a `prd` field
- New computeNeedsPrdShip surfaces epics done with PRD status not yet
shipped: state.needs_prd_ship[] with action commands
scripts/work/cli.mjs:
- New subcommand `pnpm work prd-ship <id>`
.sandcastle/reviewer.prompt.md:
- "Epic close-out: PRD status flip" section instructing reviewer to
check _state.json.needs_prd_ship and run the suggested action
- JSON output extends with prd_shipped: "<id>" | null
docs/work/README.md:
- "PRD lifecycle" section documenting the 4 statuses + auto-flip
Future PRDs follow the lifecycle automatically: decomposer refuses
draft, human flips to approved, sandcastle ships the epic, reviewer
runs prd-ship on the final task, PRD lands as shipped with its
commit trail.
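The status transition at the heart of flipPrdStatus, sketched on an already-parsed frontmatter object rather than on PRD text. `shipFrontmatter` and its return shape are hypothetical; the allowed transitions follow the corrected JSDoc (approved or in-review -> shipped).

```javascript
// Hypothetical pure transition on parsed frontmatter. Never mutates its input.
function shipFrontmatter(frontmatter, shippedDate, commits = []) {
  const { status } = frontmatter;

  // Already shipped: fail soft so repeated calls are harmless.
  if (status === "shipped") return { changed: false, frontmatter };

  if (status === "draft") {
    throw new Error("still draft: flip to approved (human review) before shipping");
  }
  if (status !== "approved" && status !== "in-review") {
    throw new Error(`unexpected status: ${status}`);
  }

  const next = { ...frontmatter, status: "shipped", shipped: shippedDate };
  // shipping-commits is optional and only written when commits were supplied.
  if (commits.length > 0) next["shipping-commits"] = [...commits];
  return { changed: true, frontmatter: next };
}
```

Keeping the transition pure makes the refusals easy to cover in unit tests without touching the filesystem or git.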
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>