agentic-dev-template

Author	SHA1	Message	Date
Danijel Martinek	9e7723f9a5	fix(scripts): remove broken session-resume from dispatch loop Sandcastle rejects `resumeSession` when `maxIterations > 1` with "Resume applies to iteration 1 only; multi-iteration resume semantics are not supported." Since a TDD slice needs the full 30-iteration budget, the session-resume path we shipped in `d5c0120` is dead infrastructure that breaks dispatch mid-run. Rip it out cleanly: - runOneSlice drops the resumeSession param + the context-exhaustion safety net + sessionId/usage return fields - executeDispatch drops the currentStory/currentSession bookkeeping and the token-reset threshold - helpers totalInputTokens + isContextExhaustedError go (only used by the resume path) - SANDCASTLE_SESSION_TOKEN_RESET removed from .env.example Net: -153 lines. Each slice is again an independent sandcastle session; token cost per slice goes up (each implementer re-discovers context) but the multi-iteration TDD shape works. A different cross-slice context-passing mechanism (e.g. a story-level context summary injected into each task spec) is left as future work.	2026-05-14 11:48:32 +02:00
Danijel Martinek	d5c01209ea	feat(work): resume implementer session across same-story slices Wires sandcastle's native `resumeSession` into the dispatch loop so the implementer walks into task N already knowing what task N-1 discovered — repo layout, helper signatures, gate output, prior diff. No scratchpad / no hand-curated context file; the agent's own Claude Code conversation log is the carrier. Three guardrails keep it bounded: - Story boundary reset. `currentSession` is dropped whenever findNextTask returns a different story id. New domain ≈ new context — keeps story 03 from inheriting story 02's residue. - Token-threshold reset. After each approved slice, sum the implementer's last-iteration usage (inputTokens + cacheCreationInputTokens + cacheReadInputTokens — caching saves dollars but doesn't free window space). If above SANDCASTLE_SESSION_TOKEN_RESET (default 140000 ≈ 70% of Sonnet 4.6's 200k), drop the session before the next task. Configurable via env. - Context-exhausted safety net. If the model rejects with "prompt is too long" / "context_length_exceeded" / similar, the retry loop drops the session and re-runs the attempt fresh exactly once. Doesn't count against SANDCASTLE_MAX_ATTEMPTS (different failure mode). Reviewer always runs fresh — each approve/reject decision should be independent of prior tasks to keep the gate honest. Within a single slice's reject-fixup retries, the implementer also carries forward across attempts (so attempt 2 sees attempt 1's reasoning + the reviewer notes), but that's per-slice cumulative, not cross-slice. runOneSlice now returns { sessionId, usage } so executeDispatch can make the carry-or-reset decision per slice.	2026-05-13 20:13:30 +02:00
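The token-threshold reset described in `d5c0120` (and later removed in `9e7723f`) can be sketched roughly as follows. The three usage field names, the env variable, and the 140000 default come from the commit message; the shape of the `usage` object and the `shouldResetSession` helper name are assumptions made for illustration.

```javascript
// Minimal sketch, not the repository's code.
// Assumes `usage` is a flat object carrying the three counters named in the commit.
function totalInputTokens(usage) {
  return (
    (usage.inputTokens ?? 0) +
    (usage.cacheCreationInputTokens ?? 0) +
    (usage.cacheReadInputTokens ?? 0)
  );
}

// Hypothetical helper: true when the next task should start a fresh session.
function shouldResetSession(usage, env = process.env) {
  // Fall back to the documented default when the variable is unset or not numeric.
  const threshold = Number(env.SANDCASTLE_SESSION_TOKEN_RESET) || 140000;
  return totalInputTokens(usage) > threshold;
}
```

Cached tokens are counted on purpose: per the commit message, caching reduces cost but does not free window space.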
Danijel Martinek	edbc6a8fad	feat(work): dispatch loops + auto-ticks state on approve Previously the orchestrator ran exactly one implementer + reviewer pair, printed "(Automatic state mutation by the orchestrator is v2.)", and exited — the human had to tick the bullet, flip story status, rebuild state, and re-invoke for every slice. V2 closes the loop: - Parses the JSON the implementer + reviewer prompts ask the agents to emit (`parseAgentJson` — tolerates both ```json fenced and bare trailing { ... } shapes). The reviewer's `decision` and the implementer's `status` are the orchestrator's discriminators. - On approve: ticks the bullet in `_story.md` and writes it back. If the story now has zero unchecked bullets, flips its frontmatter `status: in-progress → done`; if all sibling stories are also done, flips the epic's frontmatter the same way. Commits the mutation on the host as a separate `chore(work): tick/finish ...` commit so the implementer's slice commit stays clean. `_state.json` regenerates via the existing pre-commit `rebuild-state` hook. - On reject: re-dispatches the implementer with the reviewer's notes appended to TASK_FILE_CONTENT, bounded by SANDCASTLE_MAX_ATTEMPTS (default 3). On the (max+1)th reject the loop exits 1 with the last notes printed. - After every approved slice, calls findNextTask again and dispatches the next ready bullet — including across story boundaries (the state-builder treats any non-done story with satisfied deps as ready, so flipping story 01 to done unblocks story 02 automatically). - Flags: `--once` (legacy single-slice behavior) and `--max-tasks N` bound the loop. Default is unlimited — matches the continuous-execution preference. Auth/sandbox setup is now pulled out of the per-iteration path so the loop reuses one sandbox across slices.	2026-05-13 19:43:11 +02:00
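A tolerant parser of the kind `edbc6a8` describes for `parseAgentJson` might look like the sketch below. Only the function name and the two accepted shapes (a json-fenced block, or a bare trailing object) come from the commit message; the scanning strategy and the `null` return on failure are assumptions.

```javascript
// Minimal sketch, not the repository's implementation.
const FENCE = "`".repeat(3); // built at runtime so this example nests cleanly in a fenced block

function parseAgentJson(output) {
  // Shape 1: a json-fenced block. Prefer the last one in the output.
  const fenceRe = new RegExp(FENCE + "json\\s*([\\s\\S]*?)" + FENCE, "g");
  let lastFenced = null;
  for (const m of output.matchAll(fenceRe)) lastFenced = m[1];
  if (lastFenced !== null) {
    try { return JSON.parse(lastFenced); } catch { /* fall through to shape 2 */ }
  }

  // Shape 2: a bare trailing object. Walk "{" positions right to left
  // until the remaining tail parses as a complete JSON value.
  const tail = output.trimEnd();
  let i = tail.lastIndexOf("{");
  while (i !== -1) {
    try { return JSON.parse(tail.slice(i)); } catch { /* not a full object yet */ }
    i = i === 0 ? -1 : tail.lastIndexOf("{", i - 1);
  }
  return null; // assumed failure signal
}
```

The orchestrator would then branch on the reviewer's `decision` or the implementer's `status` field of the returned object.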
Danijel Martinek	eadbb7ebd9	fix(work): emit completion signal to stop sandcastle agent loops Sandcastle re-invokes agents up to maxIterations even when the work is already done — the decomposer was looping 4x re-writing the same epic on every dispatch. Two halves to the fix: - Pass completionSignal: "<promise>COMPLETE</promise>" explicitly on all three run() calls (decompose, implementer, reviewer). Makes the contract visible alongside maxIterations instead of relying on sandcastle's default. - Append a "Signal completion (required)" section to each prompt telling the agent to emit the literal marker as its final line when the work is genuinely done, plus a "do NOT emit if..." list to discourage premature signaling.	2026-05-13 19:11:44 +02:00
Danijel Martinek	26aa97f0ef	fix(work): bump sandcastle maxIterations so agents finish + commit Sandcastle's default maxIterations: 1 cut every agent off after its first response, so files written inside the sandbox never made it into a captured commit. The decomposer wrote 9 epic + story files, hit the limit, and sandcastle returned 0 commits — the host saw nothing. decompose.mjs: maxIterations 10 (small authoring task — read context, write files, commit). Override via env SANDCASTLE_DECOMPOSE_ITERATIONS. dispatch.mjs: - Implementer: maxIterations 30 (full TDD slice — read context, red test, green impl, run all five gates, commit). Override via SANDCASTLE_IMPLEMENTER_ITERATIONS. - Reviewer: maxIterations 10 (read diff + task spec, decide). Override via SANDCASTLE_REVIEWER_ITERATIONS. Each call site documents WHY the value was picked + names the env override inline so tuning is discoverable from the code.	2026-05-13 18:56:59 +02:00
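The per-agent iteration budgets with env overrides from `26aa97f` could be read as in this sketch. The env variable names and the defaults (10 / 30 / 10) are taken from the commit message; the `iterationsFromEnv` helper is hypothetical.

```javascript
// Minimal sketch; `iterationsFromEnv` is an illustrative helper, not a repository function.
function iterationsFromEnv(name, fallback, env = process.env) {
  const parsed = Number.parseInt(env[name] ?? "", 10);
  // Ignore unset, non-numeric, or non-positive values and keep the documented default.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Defaults as stated in the commit message.
const decomposeIterations = iterationsFromEnv("SANDCASTLE_DECOMPOSE_ITERATIONS", 10);
const implementerIterations = iterationsFromEnv("SANDCASTLE_IMPLEMENTER_ITERATIONS", 30);
const reviewerIterations = iterationsFromEnv("SANDCASTLE_REVIEWER_ITERATIONS", 10);
```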
Danijel Martinek	52b4409d94	docs(work): refresh stale JSDoc headers in cli + prd-ship Two stale-comment fixes surfaced after the dispatch handoff fix: cli.mjs: the top-of-file JSDoc listed only 3 of 8 subcommands (rebuild-state, status, next) and missed ready / blocked / dispatch / decompose / prd-ship. Rewrote the header to describe all 8 subcommands + their flags + the explicit-runCli routing pattern that replaces the older side-effect-on-import approach (established when the dispatch handoff broke and got fixed in `bb643b8`). prd-ship.mjs: the JSDoc claimed allowed transitions were "<approved|in-review|draft> -> shipped", but the code refuses draft (throws "still draft — flip to approved (human review) before shipping"). Corrected the doc to "<approved|in-review> -> shipped" + clarified that draft -> approved is the human step deliberately kept outside the command's scope. No behaviour change — comments only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 18:18:42 +02:00
Danijel Martinek	bb643b8635	fix(work): dispatch CLI handoff broke after import-side-effect guard cli.mjs's `dispatch` branch called `import("./dispatch.mjs")` and relied on dispatch.mjs's top-level CLI block running as a side effect of the import. The earlier guard added to dispatch.mjs (to stop the CLI firing when sibling work scripts import `resolveClaudeAuth`) also stopped this legit handoff — so `pnpm work dispatch` silently exited with no output. Fix: explicit CLI entry function, called by name. Same pattern already in use for prd-ship + decompose. dispatch.mjs: - Wraps the args parsing + print/execute branch in `export async function runCli(args)` - The invokedDirectly guard now wraps `runCli(process.argv.slice(2))` so direct-invocation (`node scripts/work/dispatch.mjs ...`) still works cli.mjs: - Imports runCli as runDispatch - The `cmd === "dispatch"` branch calls runDispatch(args) directly with a .catch attached (instead of import("./dispatch.mjs")) Verified: `pnpm work dispatch` now correctly prints the dispatch plan for the first ready task (`binder-wrap-helper / 01-wire-use-case-helper`'s first bullet); decompose tests stay 9/9. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 18:17:47 +02:00
Danijel Martinek	9d4b801909	fix(work): wire inline macOS keychain hint into dispatch + decompose error paths The dispatch.mjs + decompose.mjs error handlers grew an image-not-found hint in `cd0a332` but the macOS keychain hint that the earlier commit's message claimed wasn't actually applied (the Edit tool required re-reading those files post-commit). This commit applies the keychain hint to both error handlers: when the sandcastle error matches /Not logged in|Please run \/login/ AND process.platform === "darwin", the dispatcher prints the `security find-generic-password ... > ~/.claude/.credentials.json` one-liner + chmod 600 + the API-key fallback inline above the generic "See runbook" line. Now future agents hitting this on macOS see the fix at the failure site, not just in docs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 18:03:13 +02:00
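The two inline hints (image-not-found from `cd0a332`, macOS keychain from `9d4b801`) amount to pattern checks on the error message. A rough sketch follows; the two regular expressions, the `darwin` platform check, and the `sandcastle docker build-image` command come from the commit messages, while the `inlineHints` helper and the hint wording are invented for illustration.

```javascript
// Minimal sketch; `inlineHints` and the hint strings are hypothetical.
function inlineHints(message, platform = process.platform) {
  const hints = [];
  if (/Image '.+' not found locally/.test(message)) {
    hints.push("Build the sandbox image first: sandcastle docker build-image");
  }
  if (/Not logged in|Please run \/login/.test(message) && platform === "darwin") {
    // The real handler prints the full keychain one-liner; it is elided in the log, so it is not reproduced here.
    hints.push("macOS: export keychain credentials to ~/.claude/.credentials.json (chmod 600), or use the API-key fallback");
  }
  return hints;
}
```

An error handler would print these lines above the generic "See runbook" message and leave the error stack untouched.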
Danijel Martinek	cd0a332443	docs: surface sandcastle image-build step (one-time setup) Closes the gap the user hit running `pnpm work decompose --execute`: sandcastle errored with `Image 'sandcastle:template-vertical' not found locally. Build it first with 'sandcastle docker build-image'`, but neither the README nor the runbook documented this step. README.md: new "Sandcastle setup (one-time)" section after Quick reference. Three commands (docker info, build-image, auth) — the minimum needed to make dispatch work. Links to the runbook for the full lifecycle. docs/guides/runbook.md: Prerequisites in "Using Sandcastle" grow from 4 to 5 items. New step 2 walks through `sandcastle docker build-image`, quotes the exact "Image not found locally" error so agents searching for the string land on the fix, and shows the remove-image + rebuild flow for Dockerfile edits. .sandcastle/README.md: new "Build the sandbox image (one-time)" section parallel to the env section, cross-linking to the runbook. scripts/work/decompose.mjs + scripts/work/dispatch.mjs: when the sandcastle error message matches the "Image '.+' not found locally" pattern, the dispatcher now prints the build-image command inline above the generic "See runbook" line. The error stack itself remains unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 17:51:30 +02:00
Danijel Martinek	014578c9a8	feat(work): pnpm work decompose subcommand Closes the gap surfaced by the user: `pnpm work` usage referenced `decompose` (via docs + the to-prd skill) but the subcommand was never built. Mirrors `pnpm work dispatch`'s shape. scripts/work/decompose.mjs (new): - validatePrdForDecompose(prdPath) — refuses draft (must go through human review first), in-review (review incomplete), shipped (epic already exists); accepts only approved - printDecomposePlan(prdId, prdPath, frontmatter) — print-mode output showing the PRD's eligibility + sandcastle invocation plan + auth modes - executeDecompose(prdId, prdPath, prdText) — invokes sandcastle with .sandcastle/decomposer.prompt.md, passing PRD_FILE_CONTENT promptArg. The decomposer agent writes the epic + per-story files to disk on a sandcastle branch the human can review - runCli(args, { workRoot }) — entry point used by cli.mjs - Direct invocation also supported (mirrors dispatch.mjs's invokedDirectly guard, NEW pattern after this commit) scripts/work/decompose.test.mjs (new, 9 tests, all green): - validatePrdForDecompose: accepts approved; rejects draft, in-review, shipped, unknown status, missing file - runCli: writes error + returns 1 on missing PRD; writes error + returns 1 on draft PRD; prints plan + returns 0 on approved scripts/work/cli.mjs: - Adds `decompose` subcommand to usage + dispatch - Usage formatting realigned for the 3-line subcommand block scripts/work/dispatch.mjs: - Fix the bug surfaced by the user: dispatch.mjs's CLI ran as a top-level side effect whenever any of its exports was imported. decompose.mjs imports resolveClaudeAuth from it, so importing decompose.mjs printed "No ready task to dispatch." Added an `import.meta.url === \`file://${process.argv[1]}\`` guard so the CLI only runs when invoked directly. This unblocks cross-import without side effects. 
Smoke-tested end-to-end: - `pnpm work decompose` (no id) prints usage + exits 2 - `pnpm work decompose 2026-05-13-binder-wrap-helper` prints the decompose plan with status: approved (eligible) - 9/9 unit tests green - dispatch.mjs's existing direct-invocation path unchanged Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 17:46:57 +02:00
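The direct-invocation guard added to dispatch.mjs in `014578c` compares `import.meta.url` with a `file://` URL built from `process.argv[1]`. The sketch below pulls that comparison into a pure function so it can be exercised outside an ES module; the `isInvokedDirectly` name is an assumption.

```javascript
// Minimal sketch of the guard's comparison; the function name is hypothetical.
function isInvokedDirectly(moduleUrl, argv1) {
  return moduleUrl === `file://${argv1}`;
}

// In an ES module the call site would look roughly like:
//   if (isInvokedDirectly(import.meta.url, process.argv[1])) {
//     runCli(process.argv.slice(2));
//   }
```

A plain string comparison like this can miss when the script path contains characters that are percent-encoded in the module URL (spaces, for example) or on Windows paths; Node's `pathToFileURL` from `node:url` is the usual way to make the check more robust.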
Danijel Martinek	32d20872e3	feat(work): pnpm work prd-ship + auto-flip integration in sandcastle Closes the PRD-lifecycle gap surfaced by the user: when sandcastle finishes an epic's last task, the seed PRD should auto-flip from approved -> shipped. Builds the mechanism, wires it into the work CLI + state index + reviewer prompt + docs. scripts/work/prd-ship.mjs (new): - parseFrontmatter / serializeFrontmatter — minimal YAML-ish parser sufficient for PRD frontmatter (scalar + list shapes) - flipPrdStatus — pure function: takes PRD text, returns new text with status=shipped + shipped=<date> + optional shipping-commits. Refuses to flip draft, idempotent fail-soft on already-shipped, rejects unexpected statuses - deriveShippingCommits — best-effort git log of the linked epic folder for the --auto-commits flag - findPrdPath — id -> path lookup under docs/work/prds/ - runCli — wiring for `pnpm work prd-ship <id> [--commits|--auto-commits]` scripts/work/prd-ship.test.mjs (new, 17 tests): - Frontmatter parser handles scalars + lists + missing frontmatter - flipPrdStatus covers all transitions + refusals + body/key preservation - findPrdPath + serializeFrontmatter coverage scripts/work/state-builder.mjs: - Epic entries gain a `prd` field - New computeNeedsPrdShip surfaces epics done with PRD status not yet shipped: state.needs_prd_ship[] with action commands scripts/work/cli.mjs: - New subcommand `pnpm work prd-ship <id>` .sandcastle/reviewer.prompt.md: - "Epic close-out: PRD status flip" section instructing reviewer to check _state.json.needs_prd_ship and run the suggested action - JSON output extends with prd_shipped: "<id>" | null docs/work/README.md: - "PRD lifecycle" section documenting the 4 statuses + auto-flip Future PRDs follow the lifecycle automatically: decomposer refuses draft, human flips to approved, sandcastle ships the epic, reviewer runs prd-ship on the final task, PRD lands as shipped with its commit trail.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 16:51:48 +02:00
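The status-flip rules listed for `flipPrdStatus` (approved or in-review become shipped, draft is refused, already-shipped is a no-op, anything else is rejected) can be illustrated with the sketch below. The return shape `{ changed, text }`, the error messages, and the line-based frontmatter handling are assumptions; the real function also supports optional shipping-commits, which this sketch omits.

```javascript
// Minimal sketch under stated assumptions; not the repository's flipPrdStatus.
function flipPrdStatus(text, shippedDate) {
  const match = text.match(/^---\n([\s\S]*?)\n---\n?/);
  if (!match) throw new Error("PRD has no frontmatter");

  const lines = match[1].split("\n");
  const idx = lines.findIndex((line) => /^status:\s*/.test(line));
  if (idx === -1) throw new Error("PRD frontmatter has no status");

  const status = lines[idx].replace(/^status:\s*/, "").trim();
  if (status === "shipped") return { changed: false, text }; // idempotent, fail-soft
  if (status === "draft") throw new Error("still draft: flip to approved before shipping");
  if (status !== "approved" && status !== "in-review") {
    throw new Error(`unexpected status: ${status}`);
  }

  // Rewrite only the status line and add the shipped date; other keys and the body are preserved.
  lines[idx] = "status: shipped";
  lines.splice(idx + 1, 0, `shipped: ${shippedDate}`);
  const body = text.slice(match[0].length);
  return { changed: true, text: `---\n${lines.join("\n")}\n---\n${body}` };
}
```

Keeping the function pure (text in, text out) is what lets the test file cover every transition without touching the filesystem.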
Danijel Martinek	4e1167e390	test(scripts): resolveClaudeAuth — subscription/api-key/missing modes	2026-05-13 09:30:11 +02:00
Danijel Martinek	936611ba62	feat(scripts): dispatch.mjs — subscription-first auth via ~/.claude mount	2026-05-13 09:28:20 +02:00
Danijel Martinek	d1b00f1cf5	feat(scripts): pnpm work dispatch — wire CLI to dispatch.mjs	2026-05-13 08:19:19 +02:00
Danijel Martinek	da811eb461	feat(scripts): dispatch.mjs — planner + execute-mode skeleton	2026-05-13 08:18:58 +02:00
Danijel Martinek	4cf979aaa5	feat(scripts): pnpm work ready + blocked subcommands, DAG-aware next	2026-05-13 08:05:19 +02:00
Danijel Martinek	23fedac1a8	feat(scripts): state-builder reads depends-on + blocks from frontmatter	2026-05-13 08:04:38 +02:00
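The readiness rule behind the DAG-aware `ready` / `blocked` / `next` subcommands is stated in `edbc6a8`: any non-done story whose dependencies are satisfied counts as ready. A small sketch, assuming a hypothetical in-memory story shape `{ id, status, dependsOn }` rather than the real `_state.json` layout:

```javascript
// Minimal sketch; the story shape and function names are illustrative assumptions.
function readyStories(stories) {
  const done = new Set(stories.filter((s) => s.status === "done").map((s) => s.id));
  return stories
    .filter((s) => s.status !== "done")
    .filter((s) => (s.dependsOn ?? []).every((dep) => done.has(dep)))
    .map((s) => s.id);
}

function blockedStories(stories) {
  const ready = new Set(readyStories(stories));
  return stories
    .filter((s) => s.status !== "done" && !ready.has(s.id))
    .map((s) => s.id);
}
```

Under this rule, flipping story 01 to done is enough to move a story that depends only on 01 from blocked to ready on the next state rebuild.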
Danijel Martinek	1ebffa68a6	feat(scripts): state-sync-guard for pre-commit safety net	2026-05-13 07:54:03 +02:00
Danijel Martinek	be8e89baed	feat(scripts): pnpm work CLI — rebuild-state, status, next	2026-05-13 07:46:51 +02:00
Danijel Martinek	6b57d76dc2	feat(scripts): work state-builder — walks docs/work/ tree	2026-05-13 07:46:28 +02:00

20 Commits