feat(work): resume implementer session across same-story slices

Wires sandcastle's native `resumeSession` into the dispatch loop so
the implementer walks into task N already knowing what task N-1
discovered: repo layout, helper signatures, gate output, prior diff.
No scratchpad and no hand-curated context file; the agent's own Claude
Code conversation log is the carrier.

Three guardrails keep it bounded:
- Story boundary reset. `currentSession` is dropped whenever
  `findNextTask` returns a different story id. A new domain means new
  context, which keeps story 03 from inheriting story 02's residue.
- Token-threshold reset. After each approved slice, sum the
  implementer's last-iteration usage (inputTokens +
  cacheCreationInputTokens + cacheReadInputTokens; caching saves
  dollars but doesn't free window space). If the sum is above
  SANDCASTLE_SESSION_TOKEN_RESET (default 140000 ≈ 70% of Sonnet
  4.6's 200k window), drop the session before the next task.
  Configurable via env.
- Context-exhausted safety net. If the model rejects with
  "prompt is too long" / "context_length_exceeded" / similar, the
  retry loop drops the session and re-runs the attempt fresh
  exactly once. This re-run doesn't count against
  SANDCASTLE_MAX_ATTEMPTS (different failure mode).
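The token-threshold and context-exhausted guardrails above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: `totalInputTokens`, `resolveTokenReset`, `shouldDropSession`, `isContextExhausted`, and `runWithFreshFallback` are hypothetical names, the error patterns cover only the two strings quoted above, the fallback for an invalid env value is an assumption, and the attempt is synchronous only to keep the sketch short.

```typescript
// Hypothetical sketch. Only the usage field names, the env var name, and the
// 140000 default come from the commit message; everything else is assumed.
interface Usage {
  inputTokens: number;
  cacheCreationInputTokens: number;
  cacheReadInputTokens: number;
}

const DEFAULT_TOKEN_RESET = 140_000;

// Cached tokens still occupy the context window, so all three counters are summed.
function totalInputTokens(u: Usage): number {
  return u.inputTokens + u.cacheCreationInputTokens + u.cacheReadInputTokens;
}

// Read SANDCASTLE_SESSION_TOKEN_RESET; fall back to the default when the
// value is unset, non-numeric, or not positive (assumed behaviour).
function resolveTokenReset(env: Record<string, string | undefined>): number {
  const parsed = Number(env["SANDCASTLE_SESSION_TOKEN_RESET"]);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_TOKEN_RESET;
}

// True when the session should be dropped before the next task.
function shouldDropSession(u: Usage, threshold: number): boolean {
  return totalInputTokens(u) > threshold;
}

// Matches the two rejection messages quoted in the commit message.
function isContextExhausted(message: string): boolean {
  return /prompt is too long|context_length_exceeded/i.test(message);
}

// Run one attempt with the resumed session. On a context-exhausted rejection,
// drop the session and re-run exactly once; because the re-run happens inside
// the same attempt, it does not consume an extra attempt from the caller.
function runWithFreshFallback<T>(
  attempt: (sessionId: string | undefined) => T,
  sessionId: string | undefined,
): T {
  try {
    return attempt(sessionId);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    if (sessionId === undefined || !isContextExhausted(message)) throw err;
    return attempt(undefined);
  }
}
```

In a Node process the threshold would presumably be resolved once with `resolveTokenReset(process.env)` and reused for every slice.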

The reviewer always runs fresh: each approve/reject decision should be
independent of prior tasks to keep the gate honest. Within a single
slice's reject-fixup retries, the implementer also carries forward
across attempts (so attempt 2 sees attempt 1's reasoning and the
reviewer notes), but that carry-over is cumulative per slice, not
cross-slice.

`runOneSlice` now returns { sessionId, usage } so `executeDispatch` can
make the carry-or-reset decision per slice.
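The carry-or-reset decision described above might look something like the sketch below. The `Task` shape, the `RunOneSlice` callback signature, the list-based loop, and the name `dispatchSketch` are assumptions for illustration; the real `executeDispatch` and `findNextTask` are not shown on this page.

```typescript
// Hypothetical sketch of the per-slice carry-or-reset decision.
interface Usage {
  inputTokens: number;
  cacheCreationInputTokens: number;
  cacheReadInputTokens: number;
}

interface SliceResult {
  sessionId: string;
  usage: Usage;
}

// Assumed task shape: only the story id matters for the reset decision.
interface Task {
  id: string;
  storyId: string;
}

type RunOneSlice = (task: Task, resumeSession: string | undefined) => SliceResult;

// Returns the session id (or undefined for a fresh start) each task was run with.
function dispatchSketch(
  tasks: Task[],
  runOneSlice: RunOneSlice,
  tokenReset: number,
): Array<string | undefined> {
  const resumedWith: Array<string | undefined> = [];
  let currentSession: string | undefined;
  let currentStory: string | undefined;

  for (const task of tasks) {
    // Story boundary reset: a different story id always starts fresh.
    if (task.storyId !== currentStory) currentSession = undefined;
    currentStory = task.storyId;

    resumedWith.push(currentSession);
    const { sessionId, usage } = runOneSlice(task, currentSession);

    // Token-threshold reset: cached tokens count toward the window too.
    const total =
      usage.inputTokens + usage.cacheCreationInputTokens + usage.cacheReadInputTokens;
    currentSession = total > tokenReset ? undefined : sessionId;
  }
  return resumedWith;
}
```

Keeping the decision in the loop rather than inside `runOneSlice` means the slice runner only reports what happened, and one place owns the session lifetime.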
11
.env.example
@@ -88,3 +88,14 @@ CMS_URL=http://localhost:3001
 # notes printed. Bump for tricky slices; lower for fast-feedback iteration.
 #
 # SANDCASTLE_MAX_ATTEMPTS=3
+
+# Session-resume token threshold. The orchestrator passes the prior
+# implementer's session ID into the next slice's run() via sandcastle's
+# `resumeSession` — the agent walks into task 2 already knowing where
+# helpers live, what the prior diff looked like, which gates passed.
+# When the prior iteration's total input tokens (input + cacheRead +
+# cacheCreation) crosses this threshold the orchestrator drops the
+# session and starts the next task fresh, avoiding mid-slice context
+# exhaustion. Default 140000 ≈ 70% of Sonnet 4.6's 200k window.
+#
+# SANDCASTLE_SESSION_TOKEN_RESET=140000