agentic-dev

Template

Author	SHA1	Message	Date
Danijel Martinek	9e7723f9a5	fix(scripts): remove broken session-resume from dispatch loop Sandcastle rejects `resumeSession` when `maxIterations > 1` with "Resume applies to iteration 1 only; multi-iteration resume semantics are not supported." Since a TDD slice needs the full 30-iteration budget, the session-resume path we shipped in `d5c0120` is dead infrastructure that breaks dispatch mid-run. Rip it out cleanly: - runOneSlice drops the resumeSession param + the context-exhaustion safety net + sessionId/usage return fields - executeDispatch drops the currentStory/currentSession bookkeeping and the token-reset threshold - helpers totalInputTokens + isContextExhaustedError go (only used by the resume path) - SANDCASTLE_SESSION_TOKEN_RESET removed from .env.example Net: -153 lines. Each slice is again an independent sandcastle session; token cost per slice goes up (each implementer re-discovers context) but the multi-iteration TDD shape works. A different cross-slice context-passing mechanism (e.g. a story-level context summary injected into each task spec) is left as future work.	2026-05-14 11:48:32 +02:00
Danijel Martinek	d5c01209ea	feat(work): resume implementer session across same-story slices Wires sandcastle's native `resumeSession` into the dispatch loop so the implementer walks into task N already knowing what task N-1 discovered — repo layout, helper signatures, gate output, prior diff. No scratchpad / no hand-curated context file; the agent's own Claude Code conversation log is the carrier. Three guardrails keep it bounded: - Story boundary reset. `currentSession` is dropped whenever findNextTask returns a different story id. New domain ≈ new context — keeps story 03 from inheriting story 02's residue. - Token-threshold reset. After each approved slice, sum the implementer's last-iteration usage (inputTokens + cacheCreationInputTokens + cacheReadInputTokens — caching saves dollars but doesn't free window space). If above SANDCASTLE_SESSION_TOKEN_RESET (default 140000 ≈ 70% of Sonnet 4.6's 200k), drop the session before the next task. Configurable via env. - Context-exhausted safety net. If the model rejects with "prompt is too long" / "context_length_exceeded" / similar, the retry loop drops the session and re-runs the attempt fresh exactly once. Doesn't count against SANDCASTLE_MAX_ATTEMPTS (different failure mode). Reviewer always runs fresh — each approve/reject decision should be independent of prior tasks to keep the gate honest. Within a single slice's reject-fixup retries, the implementer also carries forward across attempts (so attempt 2 sees attempt 1's reasoning + the reviewer notes), but that's per-slice cumulative, not cross-slice. runOneSlice now returns { sessionId, usage } so executeDispatch can make the carry-or-reset decision per slice.	2026-05-13 20:13:30 +02:00
Danijel Martinek	edbc6a8fad	feat(work): dispatch loops + auto-ticks state on approve Previously the orchestrator ran exactly one implementer + reviewer pair, printed "(Automatic state mutation by the orchestrator is v2.)", and exited — the human had to tick the bullet, flip story status, rebuild state, and re-invoke for every slice. V2 closes the loop: - Parses the JSON the implementer + reviewer prompts ask the agents to emit (`parseAgentJson` — tolerates both ```json fenced and bare trailing { ... } shapes). The reviewer's `decision` and the implementer's `status` are the orchestrator's discriminators. - On approve: ticks the bullet in `_story.md` and writes it back. If the story now has zero unchecked bullets, flips its frontmatter `status: in-progress → done`; if all sibling stories are also done, flips the epic's frontmatter the same way. Commits the mutation on the host as a separate `chore(work): tick/finish ...` commit so the implementer's slice commit stays clean. `_state.json` regenerates via the existing pre-commit `rebuild-state` hook. - On reject: re-dispatches the implementer with the reviewer's notes appended to TASK_FILE_CONTENT, bounded by SANDCASTLE_MAX_ATTEMPTS (default 3). On the (max+1)th reject the loop exits 1 with the last notes printed. - After every approved slice, calls findNextTask again and dispatches the next ready bullet — including across story boundaries (the state-builder treats any non-done story with satisfied deps as ready, so flipping story 01 to done unblocks story 02 automatically). - Flags: `--once` (legacy single-slice behavior) and `--max-tasks N` bound the loop. Default is unlimited — matches the continuous-execution preference. Auth/sandbox setup is now pulled out of the per-iteration path so the loop reuses one sandbox across slices.	2026-05-13 19:43:11 +02:00
Danijel Martinek	d6bf2f638f	docs(env): document sandcastle iteration env vars in .env.example The three SANDCASTLE_*_ITERATIONS overrides landed inline in decompose.mjs and dispatch.mjs (commit `26aa97f`) but weren't surfaced in .env.example. Adds them with the same tuning guidance the inline comments carry, so users discover the knobs from the canonical env reference instead of having to grep the code.	2026-05-13 18:57:50 +02:00
Danijel Martinek	e734a9e7a1	docs: subscription auth is the primary sandcastle flow, API key is fallback	2026-05-13 09:31:33 +02:00
Danijel Martinek	88edde342e	docs: README + CLAUDE.md + .env.example + quickref reflect latest gates	2026-05-13 09:05:10 +02:00
Danijel Martinek	677a45b52f	fix: use port 5433 for Docker PostgreSQL to avoid conflicts with local instance	2026-04-06 15:37:49 +02:00
Danijel Martinek	6cff55d6d3	feat: scaffold root workspace files (Turborepo + pnpm)	2026-04-06 14:04:41 +02:00
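
Commit `edbc6a8fad` describes `parseAgentJson` as tolerating two output shapes: a json-tagged fenced block and a bare trailing `{ ... }` object. The sketch below is one way such a parser might look. Only the function name and the two shapes come from the commit message; the fallback order, the `null` return on failure, and the `FENCE` helper are assumptions made for illustration, not the repository's actual code.

```javascript
// Hypothetical sketch: the repository's actual parseAgentJson may differ.
const FENCE = "`".repeat(3); // built at runtime so this example nests cleanly

function parseAgentJson(output) {
  // Shape 1: a json-tagged fenced block; prefer the last one in the output.
  const fencedRe = new RegExp(FENCE + "json\\s*([\\s\\S]*?)" + FENCE, "g");
  const blocks = [...output.matchAll(fencedRe)];
  if (blocks.length > 0) {
    try {
      return JSON.parse(blocks[blocks.length - 1][1]);
    } catch {
      // Malformed fenced block: fall through to the bare-object scan.
    }
  }

  // Shape 2: a bare trailing object. Start at the last "{" before the final
  // "}" and widen leftwards until the slice parses (handles nested objects).
  const end = output.lastIndexOf("}");
  let start = end === -1 ? -1 : output.lastIndexOf("{", end);
  while (start !== -1) {
    try {
      return JSON.parse(output.slice(start, end + 1));
    } catch {
      // Not a complete object yet; try the previous "{".
    }
    if (start === 0) break; // avoid re-finding index 0 forever
    start = output.lastIndexOf("{", start - 1);
  }
  return null; // assumption: signal "no JSON found" with null
}
```

Under these assumptions, an orchestrator could read `parseAgentJson(reviewerOutput)?.decision` and `parseAgentJson(implementerOutput)?.status` as its discriminators and treat `null` as a failed attempt.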

8 Commits
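
The token-threshold reset described in commit `d5c01209ea` (and later removed in `9e7723f9a5`) can be sketched as follows. The helper names `totalInputTokens` and `isContextExhaustedError`, the three usage fields, the error strings, and the `SANDCASTLE_SESSION_TOKEN_RESET` default of 140000 come from the commit messages. The function signatures, the `shouldResetSession` name, and the handling of a missing or invalid env value are assumptions for illustration only.

```javascript
// Hypothetical sketch of the carry-or-reset decision; not the repository's code.
const DEFAULT_TOKEN_RESET = 140000; // default stated in the commit message

// Cached tokens still occupy the context window, so all three fields count.
function totalInputTokens(usage) {
  return (
    (usage.inputTokens ?? 0) +
    (usage.cacheCreationInputTokens ?? 0) +
    (usage.cacheReadInputTokens ?? 0)
  );
}

// Matches the two error strings quoted in the commit message; the commit's
// "/ similar" cases are not covered here.
function isContextExhaustedError(message) {
  return /prompt is too long|context_length_exceeded/i.test(String(message));
}

// Assumption: a missing, non-numeric, or non-positive env value falls back
// to the default threshold.
function shouldResetSession(usage, env = process.env) {
  const parsed = Number(env.SANDCASTLE_SESSION_TOKEN_RESET);
  const threshold =
    Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_TOKEN_RESET;
  return totalInputTokens(usage) > threshold;
}
```

In this sketch the session would be dropped before the next task whenever `shouldResetSession(lastIterationUsage)` returns `true`, while `isContextExhaustedError` would gate the single fresh retry that does not count against `SANDCASTLE_MAX_ATTEMPTS`.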