Two separate sandbox blockers surfaced when the user tried
`pnpm work decompose --execute`:
1. **Container died on exec** — our Dockerfile had:
- WORKDIR /workspace + CMD ["bash"]
- No `agent` user (sandcastle exec's as UID:GID it built with)
- node:22-bookworm-slim (missing some build deps the install
script wants)
Sandcastle expects:
- A non-root `agent` user with home at /home/agent (sandcastle
does `git config --global --add safe.directory /home/agent/workspace`,
which fails if the user doesn't exist or the container exited)
- ENTRYPOINT ["sleep", "infinity"] so the container survives
the gap between sandcastle creating it and exec'ing in
Replaced .sandcastle/Dockerfile with the shape `sandcastle init`
would generate (verified against
node_modules/@ai-hero/sandcastle/dist/InitService.js):
- node:22-bookworm (full, not slim) for build tooling
- apt-get installs git + curl + jq
- corepack-pinned pnpm@9
- ARG AGENT_UID=1000 + AGENT_GID=1000; sandcastle's
build-image passes the host's UID/GID by default
- `groupmod -o -g $AGENT_GID node` + `usermod -o ... node` —
the `-o` (non-unique) flag is required because macOS hosts
have UID:501 GID:20, and GID 20 collides with Debian's
`dialout` group in the base image (without -o, groupmod
fails with "GID '20' already exists")
- USER ${AGENT_UID}:${AGENT_GID}, then install Claude Code CLI
via the official installer
- ENV PATH includes /home/agent/.local/bin
- WORKDIR /home/agent (sandcastle overrides per-run anyway)
- ENTRYPOINT ["sleep", "infinity"] keeps the container alive
2. **"Not logged in · Please run /login"** inside the container —
Claude Code on macOS stores credentials in the Keychain, NOT in
~/.claude/.credentials.json. Sandcastle's bind-mount of ~/.claude
finds nothing usable. Documented the workaround:
- README.md "Sandcastle setup (one-time)" — macOS-specific
block with the `security find-generic-password ... > ~/.claude/.credentials.json`
one-liner + chmod 600 + the security trade-off (plaintext
file vs keychain isolation)
- docs/guides/runbook.md "Using Sandcastle → Prerequisites" —
step 3 (Authentication) gets a "macOS quirk" subsection with
the same extraction one-liner + the API-key fallback as the
alternative path
- scripts/work/{dispatch,decompose}.mjs — when the sandcastle
error matches /Not logged in|Please run \/login/ AND we're on
darwin, the dispatcher prints the keychain-extraction
commands + the API-key fallback inline above the generic
"See runbook" line, so future agents discover the fix at the
failure site
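A minimal sketch of that inline hint, assuming a helper shaped like the one below. The function name `authHintLines` and the exact hint wording are illustrative and not copied from dispatch.mjs or decompose.mjs; only the regex, the darwin check, and the extraction commands come from the notes above.

```javascript
// Hypothetical helper: the real dispatcher may structure this differently.
const LOGIN_ERROR = /Not logged in|Please run \/login/;

function authHintLines(errorMessage, platform) {
  const lines = [];
  // Only show the Keychain hint for the login error AND on macOS hosts.
  if (LOGIN_ERROR.test(errorMessage) && platform === "darwin") {
    lines.push(
      "macOS: Claude Code keeps credentials in the Keychain. Extract them once:",
      '  security find-generic-password -s "Claude Code-credentials" -a "$USER" -w > ~/.claude/.credentials.json',
      "  chmod 600 ~/.claude/.credentials.json",
      "Or set ANTHROPIC_API_KEY to use the API-key fallback."
    );
  }
  lines.push("See runbook"); // generic line, always printed last (wording assumed)
  return lines;
}

console.log(authHintLines("Not logged in · Please run /login", "darwin").join("\n"));
```

In a real script the second argument would be `process.platform`.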
The image rebuilds clean (`pnpm exec sandcastle docker
build-image`) at ~1.95GB and the container survives sandcastle's
exec — confirmed by reaching the "Not logged in" stage (which is
the next-layer issue, not the Dockerfile issue).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Developer Runbook
You just cloned this repo. This is the only doc you need to read end-to-end. Everything else is reference.
Prerequisites
| Tool | Version | Why |
|---|---|---|
| Node.js | 22+ | Runtime for all apps + scripts |
| pnpm | 9+ | Package manager (workspace-aware) |
| Docker | 24+ | Local Postgres + sandcastle sandboxes |
| Git | 2.40+ | Version control + worktrees |
Recommended editor: VS Code or Cursor with the official TypeScript, ESLint, and Prettier extensions.
First-time setup
# 1. Clone + install
git clone <repo-url> template-vertical
cd template-vertical
pnpm install
# 2. Start Postgres (background)
docker compose up -d
# 3. Copy env template and fill in secrets
cp .env.example .env
# Edit .env (see "Environment variables" section below for what each one does)
# 4. Verify the gate stack is green
pnpm typecheck
pnpm test
pnpm lint
pnpm conformance
pnpm fallow
pnpm turbo boundaries
All six should exit 0. If any fails on a fresh clone, file an issue — the main branch is supposed to stay green.
# 5. Start the dev servers
pnpm dev
This runs Next.js (3000), Payload CMS (3001), TanStack Start (3002), and Storybook (6006) in parallel. The bindAll() dispatcher in each app picks the dev-seed binders by default (mock repositories, no Payload connection needed beyond Postgres).
Daily commands
# Development
pnpm dev # all dev servers
pnpm dev --filter @repo/web-next # one app
# Tests
pnpm test # everything
pnpm test --filter @repo/auth # one package
pnpm test:e2e # Playwright e2e
pnpm test:stories # Storybook smoke tests
pnpm test:visual # visual regression (Playwright screenshots)
# Linting + type checking
pnpm typecheck # tsc across all packages
pnpm lint # ESLint across all packages
pnpm format # Prettier write
pnpm format:check # Prettier check (CI mode)
# Conformance gates
pnpm conformance # cross-feature event closure
pnpm fallow # whole-codebase: dead exports, dupes, complexity
pnpm fallow:audit # AI-change audit (run before commits)
# Boundary validation
pnpm turbo boundaries # workspace dependency graph
# Work system
pnpm work status # tree of epics + stories
pnpm work next # next ready story
pnpm work ready # all ready stories
pnpm work blocked # blocked stories + what they wait on
pnpm work rebuild-state # regenerate docs/work/_state.json
pnpm work dispatch # print next dispatch plan
pnpm work dispatch --execute # invoke sandcastle (subscription or API key — see runbook)
Environment variables
Copy .env.example to .env and fill what you need. NOT every variable is required for pnpm dev — defaults are dev-friendly.
Required for pnpm dev
| Var | Example | Why |
|---|---|---|
| DATABASE_URL | postgresql://postgres:postgres@localhost:5433/template | Postgres connection (docker compose default) |
| PAYLOAD_SECRET | your-secret-here | Payload CMS encryption key (any random 32+ char string in dev) |
Optional — app URLs (defaults work in dev)
| Var | Default | Why |
|---|---|---|
| NEXT_PUBLIC_APP_URL | http://localhost:3000 | Public-facing web-next URL |
| CMS_URL | http://localhost:3001 | Payload CMS URL |
| USE_DEV_SEED | true in dev | Force dev-seed binders (mock repos) instead of Payload |
| NODE_ENV | inherited | production flips bind dispatcher to real Payload |
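The interaction between USE_DEV_SEED and NODE_ENV can be pictured roughly as follows. This is a sketch under stated assumptions, not the repo's bindAll() code: the function name `pickBinder`, the returned labels, and the precedence (an explicit USE_DEV_SEED wins over NODE_ENV) are all guesses made for illustration.

```javascript
// Hypothetical selection rule; the real bind dispatcher may differ.
function pickBinder(env) {
  const isProduction = env.NODE_ENV === "production";
  // Assumption: an explicit USE_DEV_SEED wins; otherwise dev-seed is the default outside production.
  const useDevSeed =
    env.USE_DEV_SEED !== undefined ? env.USE_DEV_SEED === "true" : !isProduction;
  return useDevSeed ? "dev-seed" : "payload";
}

console.log(pickBinder({}));                         // dev-seed
console.log(pickBinder({ NODE_ENV: "production" })); // payload
```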
Optional — Sentry observability (no DSN = no-op tracer/logger)
| Var | Why |
|---|---|
| WEB_NEXT_SENTRY_DSN | Server-side OTel + Sentry for web-next |
| NEXT_PUBLIC_WEB_NEXT_SENTRY_DSN | Browser Sentry for web-next |
| CMS_SENTRY_DSN | Server-side for Payload CMS |
| WEB_TANSTACK_SENTRY_DSN | Server-side for TanStack Start |
| VITE_WEB_TANSTACK_SENTRY_DSN | Browser-side for TanStack Start |
| SENTRY_AUTH_TOKEN, SENTRY_ORG, SENTRY_PROJECT_* | Source-map upload at build time |
| SENTRY_TRACES_SAMPLE_RATE | OTel trace sample rate (0.1 recommended in dev) |
| SENTRY_ENVIRONMENT | development / staging / production |
Optional — Git commit SHA for releases
| Var | Why |
|---|---|
| VERCEL_GIT_COMMIT_SHA / VITE_GIT_COMMIT_SHA / NEXT_PUBLIC_VERCEL_GIT_COMMIT_SHA | Surfaces commit SHA in Sentry releases + UI footers |
Optional — core-audit (only when gen core-package audit is scaffolded)
| Var | Why |
|---|---|
| AUDIT_PSEUDONYM_SALT | Salt for the audit log's GDPR-erasure pseudonymisation (production only; must be a stable secret) |
Optional — sandcastle dispatch (only when running pnpm work dispatch --execute)
Auth is resolved automatically. Subscription (via ~/.claude/) is the primary path; API key is the fallback.
| Var | Why |
|---|---|
| ANTHROPIC_API_KEY | Claude API key: fallback when no ~/.claude/ present; not needed for subscribers |
| OPENAI_API_KEY | OpenAI/Codex alternative (fallback) |
| SANDCASTLE_CLAUDE_CREDS_DIR | Override host Claude creds path (default: ~/.claude/) |
| GITHUB_TOKEN | GitHub access for PR creation by the orchestrator |
| SANDCASTLE_PROVIDER | docker (default) / podman / vercel |
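The resolution order described above (subscription first, API key as fallback) could look something like this. It is only a sketch: `resolveAuth`, its return shape, and the `hasClaudeCreds` flag (which a caller would compute by checking the creds directory on disk) are hypothetical names, and the ordering of the two API keys is an assumption.

```javascript
// Hypothetical resolver; the dispatcher's real logic is not shown in this runbook.
function resolveAuth({ env, hasClaudeCreds }) {
  if (hasClaudeCreds) {
    // Subscription is the primary path; the creds dir can be overridden.
    return { mode: "subscription", credsDir: env.SANDCASTLE_CLAUDE_CREDS_DIR || "~/.claude/" };
  }
  if (env.ANTHROPIC_API_KEY) return { mode: "api-key", provider: "anthropic" };
  if (env.OPENAI_API_KEY) return { mode: "api-key", provider: "openai" };
  return { mode: "none" }; // the caller would print the "--execute requires either..." error
}

console.log(resolveAuth({ env: { ANTHROPIC_API_KEY: "sk-ant-example" }, hasClaudeCreds: false }).mode);
```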
The agent-first workflow
This template enforces a manifest-first, generator-driven, gate-protected workflow.
When adding a new feature
pnpm turbo gen feature <name>
This emits:
- packages/<name>/src/feature.manifest.ts: the conformance manifest (use cases, audits, publishes, consumes)
- packages/<name>/src/di/bind-production.ts with assertFeatureConformance(...) at the tail (refuses to boot on drift)
- Mock repository, factory, seed, entity, use-case, controller, tests: the full canonical shape
- packages/<name>/src/index.ts exports
After scaffolding, the four-step ordering for any new use case:
1. Manifest entry: declare the use case in feature.manifest.ts
2. Contracts: export xInputSchema, xOutputSchema, IXUseCase (factory body throws not implemented)
3. Tests (red): write the failing test
4. Implementation (green): fill the factory body
The five conformance gates catch drift at every step. See docs/guides/conformance-quickref.md for the manifest field reference.
When adding cross-feature primitives
pnpm turbo gen event # event contract or handler (needs gen core-package events)
pnpm turbo gen job # background job
pnpm turbo gen realtime # realtime channel or handler (needs gen core-package realtime)
pnpm turbo gen core-package <x> # optional core package (events/realtime/trpc/ui/audit)
pnpm turbo gen core-ui-component <x> # atomic-design component (needs gen core-package ui)
Always prefer generators over hand-rolling. The generators emit the canonical shape; hand-rolled code drifts from generator output and breaks the CI scaffold-drift check.
Tracking work
The repo uses docs/work/ for epic/story/task tracking:
docs/work/
├── README.md
├── _state.json # derived, regenerated by pre-commit hook
├── prds/ # PRDs go here
├── _templates/ # markdown templates
└── <epic-slug>/
├── _epic.md
└── <story-slug>/
└── _story.md # contains the Tasks checklist
Use pnpm work next to see what's ready. Use pnpm work dispatch to plan the next sandcastle dispatch.
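To make "ready" concrete, here is a small sketch of a DAG-aware readiness check. The story shape (`id`, `status`, `dependsOn`), the status values, and the `01-sign-in` / `03-reset-password` slugs are assumptions for illustration; the actual _state.json schema is not described in this runbook.

```javascript
// Hypothetical story shape: { id, status, dependsOn }.
function readyStories(stories) {
  const done = new Set(stories.filter((s) => s.status === "done").map((s) => s.id));
  // A story is ready when it is not finished and every dependency is done.
  return stories.filter(
    (s) => s.status !== "done" && s.dependsOn.every((dep) => done.has(dep))
  );
}

const stories = [
  { id: "01-sign-in", status: "done", dependsOn: [] },
  { id: "02-sign-up", status: "in-progress", dependsOn: ["01-sign-in"] },
  { id: "03-reset-password", status: "todo", dependsOn: ["02-sign-up"] },
];
console.log(readyStories(stories).map((s) => s.id)); // [ '02-sign-up' ]
```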
The five conformance gates
| Gate | Latency | What it catches | Runs when |
|---|---|---|---|
| TypeScript brands | 0s | forgotten withSpan / withCapture / withAudit; manifest ↔ binding-slot type mismatch | on save (IDE) |
| ESLint (8 conformance/* rules) | <1s | manifest ↔ code drift; missing sibling test; missing manifest; atomic-tier import direction | on save / pnpm lint |
| Boot assertion | ~3s | runtime binding without required brand; manifest edited without rebinder | pnpm dev startup |
| pnpm conformance | ~120s | orphan event consumers across features | CI |
| pnpm fallow | ~30–60s | dead exports / unused files; duplicate code; circular deps; complexity hotspots; AI-change audit | CI |
For the full design see docs/architecture/agent-first-workflow-and-conformance.md. For the daily reference see docs/guides/conformance-quickref.md.
Using Sandcastle for agent dispatch
Sandcastle is the substrate that takes a markdown task description, hands it to a Claude / Codex agent running inside an isolated Docker sandbox, captures the agent's commits, and returns them so the orchestrator can route the diff to a reviewer agent. The repo's pnpm work dispatch wraps sandcastle for the manifest-first workflow.
When to use Sandcastle
- Routine, well-specified tasks — adding a behaviour slice to an existing use case, migrating a feature to a new convention, scaffolding new packages. The task description is the contract; sandcastle automates the rest.
- Parallel work — dispatch multiple independent tasks at once; each runs in its own sandbox branch.
- Reviewer-loop verification — the reviewer agent reads the diff against the task spec and either approves or sends feedback for another implementer pass.
When NOT to use Sandcastle
- Exploratory / design work — when the right answer isn't known, write it yourself. Sandcastle thrives when the task is "implement this", not "figure out what to do".
- Cross-cutting refactors: dispatch is per-task, so a change that touches many unrelated files at once is better done in one human-driven session.
- First-time integrations (e.g., adopting a new SDK) — better to walk through it manually, then capture the pattern as a generator for future sandcastle dispatches.
Prerequisites
1. Docker running: sandcastle uses Docker for the sandbox by default. docker info should succeed.

2. Sandcastle image built (one-time): sandcastle dispatches into a tagged Docker image; you build it once per clone:

       pnpm exec sandcastle docker build-image
       # Tags as: sandcastle:template-vertical (derived from root package.json name)

   If you see "Image 'sandcastle:template-vertical' not found locally. Build it first with 'sandcastle docker build-image'" on dispatch, this step was skipped.

   To rebuild after editing .sandcastle/Dockerfile:

       pnpm exec sandcastle docker remove-image
       pnpm exec sandcastle docker build-image

3. Authentication: pick ONE:

   - Recommended: Claude Pro / Max subscription. Run claude login once on the host. Sandcastle's sandbox bind-mounts your ~/.claude/ into the container so the Claude Code CLI inside the sandbox uses your subscription session. Zero per-task token spend for subscribers.

     macOS quirk: Claude Code stores credentials in the macOS Keychain, NOT in ~/.claude/.credentials.json, so the bind-mount finds nothing. If you hit "Not logged in · Please run /login" inside the sandbox, extract the keychain credentials to a file once:

         security find-generic-password -s "Claude Code-credentials" -a "$USER" -w \
           > ~/.claude/.credentials.json
         chmod 600 ~/.claude/.credentials.json

     Trade-off: credentials now live as a plaintext file at that path; the macOS Keychain isolation is replaced by filesystem permissions (chmod 600 + your home dir's mode). When the token expires (~30 days), re-run the same one-liner. Linux + WSL hosts write ~/.claude/.credentials.json directly during claude login, so this step is macOS-only.

   - Alternative: API key. Set ANTHROPIC_API_KEY or OPENAI_API_KEY in your environment. Falls back automatically when ~/.claude/ is absent. Use this if you don't want a plaintext credentials file on disk.

   - Override the creds path via SANDCASTLE_CLAUDE_CREDS_DIR if your Claude Code config lives somewhere non-standard.

4. GitHub token (optional): GITHUB_TOKEN if you want the orchestrator to create PRs.

5. .sandcastle/ config present (already in tree):

   - Dockerfile: node:22 + pnpm + Claude Code CLI; reads creds from ~/.claude/ inside the container
   - prd-eliciter.prompt.md, adr-eliciter.prompt.md, decomposer.prompt.md, implementer.prompt.md, reviewer.prompt.md: the five role prompts
The dispatch flow
pnpm work next → identifies the next ready story (DAG-aware)
pnpm work dispatch → prints what WOULD be dispatched (no Sandcastle call)
pnpm work dispatch --execute
→ invokes sandcastle.run(implementer prompt + task spec)
→ sandcastle returns { branch, commits, stdout, ... }
→ orchestrator computes `git diff main..<branch>`
→ invokes sandcastle.run(reviewer prompt + diff)
→ reviewer returns approve / reject + notes
→ orchestrator prints suggested state mutation
(in v1: human ticks the bullet + commits manually)
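The flow above can be sketched as a single function with its collaborators injected. This is an illustration, not the code in dispatch.mjs: `dispatchOnce`, `diffAgainstMain`, and the fakes are hypothetical, and the sketch assumes the reviewer run resolves directly to the decision object, which the real sandcastle result may not do.

```javascript
// Sketch only: `run` stands in for sandcastle.run and `diffAgainstMain` for the git diff step.
async function dispatchOnce({ run, diffAgainstMain, spec }) {
  const impl = await run({
    promptFile: ".sandcastle/implementer.prompt.md",
    promptArgs: { TASK_FILE_CONTENT: spec },
  });
  const diff = await diffAgainstMain(impl.branch);
  const review = await run({
    promptFile: ".sandcastle/reviewer.prompt.md",
    promptArgs: { TASK_FILE_CONTENT: spec, DIFF: diff },
  });
  return { branch: impl.branch, review };
}

// Demo with in-memory fakes instead of a real sandbox.
const fakeRun = async ({ promptArgs }) =>
  "DIFF" in promptArgs
    ? { decision: "approve", notes: "looks good" }
    : { branch: "task/02-sign-up-hash-password", commits: [] };
const fakeDiff = async (branch) => `diff main..${branch}`;

dispatchOnce({ run: fakeRun, diffAgainstMain: fakeDiff, spec: "task spec" })
  .then((result) => console.log(result.branch, result.review.decision));
```

Injecting `run` keeps the planning logic testable without Docker or agent spend.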
Worked example — dispatch a real task
Suppose pnpm work next reports:
auth-v1 / 02-sign-up — Sign up with email and password
status: in-progress, tasks: 3/7
The story file docs/work/auth-v1/02-sign-up/_story.md has a Tasks list with the next unchecked bullet:
- [ ] Hash password using injected IPasswordHasher before persisting
Step 1 — Plan
pnpm work dispatch
Output:
=== Dispatch plan ===
Epic: auth-v1
Story: 02-sign-up — Sign up with email and password
Bullet: - [ ] Hash password using injected IPasswordHasher before persisting
Prompt: .sandcastle/implementer.prompt.md
To execute this dispatch, run:
ANTHROPIC_API_KEY=... pnpm work dispatch --execute
This is safe to run anywhere — it never invokes Sandcastle.
Step 2 — Execute
# Subscription mode (recommended):
claude login # one-time, host
pnpm work dispatch --execute # uses ~/.claude/
# API-key mode (fallback):
ANTHROPIC_API_KEY=sk-ant-... pnpm work dispatch --execute
The orchestrator:
1. Builds the task spec (story metadata + the current bullet + full story context)
2. Calls sandcastle.run({ promptFile: ".sandcastle/implementer.prompt.md", promptArgs: { TASK_FILE_CONTENT: spec }, ... })
3. Sandcastle starts a container from the locally built image, mounts the repo into /home/agent/workspace, and runs claudeCode with the implementer prompt template populated
4. The implementer agent (inside the sandbox):
   - Reads the task spec
   - Runs pnpm install --frozen-lockfile
   - Locates the use case: packages/auth/src/application/use-cases/sign-up.use-case.ts
   - Writes a red test asserting hasher.hash is called before repo.create
   - Runs pnpm test --filter @repo/auth and sees the red test fail
   - Adds IPasswordHasher to the factory deps; calls hasher.hash(input.password) before repo.create
   - Runs pnpm test --filter @repo/auth again: green
   - Runs pnpm typecheck, pnpm lint, pnpm conformance, pnpm fallow:audit: all five gates green
   - Commits on a sandbox branch (task/02-sign-up-hash-password or similar)
5. Sandcastle returns:
   { branch: "task/02-sign-up-hash-password", commits: [{sha: "..."}], stdout: "...", ... }
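For a sense of what the implementer's change amounts to, here is a stripped-down sketch of the red-then-green step. It is not the repo's sign-up use case: the factory name, the input fields, and the recording fakes are hypothetical, and the real code would use the exported schemas and the IPasswordHasher contract.

```javascript
// Illustrative factory; the real use case lives in packages/auth.
function createSignUpUseCase({ hasher, repo }) {
  return async function signUp(input) {
    // Hash first, so the plaintext password never reaches the repository.
    const passwordHash = await hasher.hash(input.password);
    return repo.create({ email: input.email, passwordHash });
  };
}

// Recording fakes that log call order, as the red test would.
const calls = [];
const hasher = { hash: async (pw) => { calls.push("hash"); return "hashed:" + pw; } };
const repo = { create: async (user) => { calls.push("create"); return user; } };

createSignUpUseCase({ hasher, repo })({ email: "a@example.com", password: "s3cret" })
  .then((user) => console.log(calls, user.passwordHash));
// logs: [ 'hash', 'create' ] hashed:s3cret
```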
Step 3 — Review
The orchestrator immediately runs the reviewer:
1. Computes git diff main..task/02-sign-up-hash-password
2. Calls sandcastle.run({ promptFile: ".sandcastle/reviewer.prompt.md", promptArgs: { TASK_FILE_CONTENT: spec, DIFF: diff }, ... })
3. The reviewer agent reads the diff + task + story; verifies:
   - The AC bullet is satisfied (test was added; impl calls hasher.hash)
   - Nothing in the "Out of scope" section was touched (no drive-by edits)
   - All gates were run
   - The implementer ran pnpm fallow:audit
   - Generator-first was respected (no hand-rolled scaffolding)
4. Returns { decision: "approve", ac_verified: [4], scope_violations: [], notes: "..." }
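One way to think about acting on the reviewer's result is a small routing function. The field names (`decision`, `scope_violations`, `generator_skipped`) come from this runbook; the function name and the returned action labels are hypothetical, and the orchestrator in v1 only prints a suggestion rather than acting on it.

```javascript
// Hypothetical routing of a reviewer result to a follow-up action.
function nextAction(review) {
  if (review.decision === "approve") return "merge-and-tick";
  if (review.generator_skipped) return "run-generator-then-redispatch";
  if (review.scope_violations && review.scope_violations.length > 0) {
    return "redispatch-with-stricter-scope";
  }
  return "redispatch-with-notes";
}

console.log(nextAction({ decision: "approve", ac_verified: [4], scope_violations: [], notes: "" }));
// merge-and-tick
```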
Step 4 — State mutation (v1: manual)
The orchestrator prints:
=== Suggested state mutation ===
Edit docs/work/auth-v1/02-sign-up/_story.md — tick the bullet:
- [x] Hash password using injected IPasswordHasher before persisting
Then: pnpm work rebuild-state && git add -A && git commit -m "..."
(Automatic state mutation by the orchestrator is v2.)
You (the human) then:
1. Merge the sandbox branch: git merge --no-ff task/02-sign-up-hash-password
2. Tick the bullet in the story markdown
3. Commit. The pre-commit hook auto-runs pnpm work rebuild-state and re-stages _state.json
4. Push. CI runs the full gate stack (typecheck + test + lint + conformance + fallow + boundaries + visual regression).
Troubleshooting Sandcastle
✗ --execute requires either: 1. Claude Code logged in on host ... 2. ANTHROPIC_API_KEY ...
— No auth resolved. Run claude login to enable subscription mode (recommended), OR set ANTHROPIC_API_KEY (fallback). Override the host creds path via SANDCASTLE_CLAUDE_CREDS_DIR.
Error: Cannot find module '@ai-hero/sandcastle'
— Run pnpm install. Sandcastle is a dev dependency at the workspace root.
Error: docker: command not found or sandcastle hangs at "starting sandbox"
— Docker isn't running. docker info to confirm. On macOS, start Docker Desktop.
The implementer agent times out
— Default idleTimeoutSeconds is 600 (10 minutes). For complex tasks, increase via dispatch.mjs (look for the run({...}) call and add idleTimeoutSeconds: 1800).
The reviewer rejects with generator_skipped: true
— The implementer hand-rolled what should have been generator output. Either re-dispatch (it gets the reviewer notes), or delete the implementer's diff and run pnpm turbo gen <kind> manually first, then dispatch the customisation as a separate task.
The reviewer rejects with scope_violations: [...]
— The implementer touched files outside the AC. Re-dispatch with stricter scope; the rejection notes are passed back as context.
Cost control — each dispatch typically uses 50K–200K agent tokens depending on task complexity. The orchestrator does NOT cap retries; if you want to limit, set max-attempts: 1 in the task's frontmatter (the orchestrator respects this in v2 — for now, just don't re-run dispatch after a reject).
Sandbox boots but Claude Code inside it says "Not authenticated" / "API key required"
— The host ~/.claude/ mount didn't make it into the sandbox, OR your local Claude Code session expired. On the host, run claude once to confirm your session is live, then re-dispatch. If you're on Linux + SELinux, the mount may have been blocked — check the sandcastle output for SELinux warnings; set selinuxLabel: "z" or false in dispatch.mjs's docker opts if needed.
Cost-aware variant: planning-only loop
If you want sandcastle's structure without the agent spend, use planning mode + manual execution:
pnpm work dispatch # prints the plan
# (you implement the bullet manually in your editor)
# tick the bullet in docs/work/.../...story.md
# commit; pre-commit auto-rebuilds _state.json
pnpm work dispatch # prints the NEXT plan
This gives you the same DAG-aware "what's next?" without invoking any agent. Useful for exploratory work or low-budget contexts.
Troubleshooting
pnpm dev refuses to boot with ConformanceError
— A feature's binding lost a required brand. The error message tells you which use case + which brand. Re-bind through withSpan / withCapture / withAudit as needed.
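As a mental model for this error, the sketch below tags wrapped functions with brand names and checks them at boot. It is not the repo's withSpan / withCapture / withAudit or assertFeatureConformance implementation (those are TypeScript and not shown here); `withBrand`, `assertBrands`, and the symbol-based tagging are assumptions chosen to illustrate the idea.

```javascript
// Sketch only: brands are tracked in a Set stored under a private symbol.
const BRANDS = Symbol("brands");

function withBrand(name, fn) {
  const wrapped = (...args) => fn(...args);
  // Carry forward any brands the inner function already has.
  wrapped[BRANDS] = new Set([...(fn[BRANDS] || []), name]);
  return wrapped;
}

function assertBrands(useCaseName, binding, requiredBrands) {
  const present = binding[BRANDS] || new Set();
  const missing = requiredBrands.filter((b) => !present.has(b));
  if (missing.length > 0) {
    throw new Error(`ConformanceError: ${useCaseName} is missing brand(s): ${missing.join(", ")}`);
  }
}

const signUp = withBrand("audit", withBrand("span", (input) => input));
assertBrands("signUp", signUp, ["span", "audit"]); // passes silently
```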
pnpm lint errors with conformance/feature-must-have-manifest
— You created a feature with use cases but no feature.manifest.ts. Run pnpm turbo gen feature <name> to scaffold the canonical shape, or hand-write the manifest at packages/<feature>/src/feature.manifest.ts.
pnpm conformance says "orphan consumer"
— A feature declares consumes: ["X"] but no feature publishes X. Either add the publish to the producing feature's manifest + factory, or remove the consumer.
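The check behind this message can be sketched in a few lines. The `publishes` and `consumes` fields come from the manifest description earlier in this runbook; the `name` field, the function name, and the feature and event names in the demo are assumptions for illustration, and the real pnpm conformance gate likely does more.

```javascript
// Hypothetical check: report every consumed event that no feature publishes.
function findOrphanConsumers(manifests) {
  const published = new Set(manifests.flatMap((m) => m.publishes));
  return manifests.flatMap((m) =>
    m.consumes
      .filter((event) => !published.has(event))
      .map((event) => ({ feature: m.name, event }))
  );
}

const manifests = [
  { name: "auth", publishes: ["user.signed-up"], consumes: [] },
  { name: "mailer", publishes: [], consumes: ["user.signed-up", "order.placed"] },
];
console.log(findOrphanConsumers(manifests));
// [ { feature: 'mailer', event: 'order.placed' } ]
```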
pnpm fallow reports new dead exports or dupes
— Your change added unused exports or duplicated logic. Either remove the dead code or accept with pnpm fallow:audit --gate all (the audit considers the baseline; only NEW findings fail).
Pre-commit hook refuses to commit with "state-sync-guard"
— You staged docs/work/_state.json but it's not byte-identical to pnpm work rebuild-state output. Run pnpm work rebuild-state && git add docs/work/_state.json and try again.
Tests fail in @repo/turbo-generators with Vitest worker timeouts
— Known flaky on slow machines. Re-run; if persistent, increase the turbo-generators package's vitest testTimeout.
pnpm work dispatch --execute errors with "requires either: 1. Claude Code logged in..."
— No auth source found. Run claude login (subscription mode, recommended), or set ANTHROPIC_API_KEY (fallback). Run pnpm work dispatch (no flag) to just print the plan without auth.
Where to read next
Once you've got pnpm dev running:
- AGENTS.md: package map, boundary rules, per-package conventions
- CLAUDE.md: full convention reference (manifest-first ordering, factory patterns, instrumentation rules)
- docs/guides/conformance-quickref.md: daily manifest + gates reference
- docs/guides/tdd-workflow.md: red-green-refactor with the gate stack
- docs/guides/scaffolding-a-feature.md: pnpm turbo gen feature reference
- docs/guides/adding-a-feature.md: end-to-end walkthrough
- docs/architecture/agent-first-workflow-and-conformance.md: the full design
- docs/architecture/feature-conformance-explainer.html: interactive explainer (open in browser)
For deeper topics:
- docs/guides/events-and-jobs.md: cross-feature events (requires gen core-package events)
- docs/guides/realtime.md: Socket.IO channels (requires gen core-package realtime)
- docs/guides/audit-and-compliance.md: DPA-compliant audit logging (requires gen core-package audit)
- docs/guides/frontend-work-shape.md: atomic design + Storybook conventions
- docs/guides/infrastructure-work-shape.md: ADR-first flow for new infrastructure
Common pitfalls
- Skipping the generator. Always run pnpm turbo gen <kind> before hand-rolling. Generators emit the canonical shape; the CI scaffold-drift check will fail on hand-rolled features.
- Forgetting pnpm work rebuild-state after editing docs/work/ markdown. The pre-commit hook handles this automatically when you stage markdown, so it only matters if you commit with the hook bypassed.
- Bypassing the pre-commit hook with --no-verify. The hook catches drift early. If it's blocking a legitimate change, fix the underlying issue, not the hook.
- Hand-editing _state.json. Don't. The state-sync-guard refuses commits that drift from rebuild output. Edit the markdown; let the rebuild script propagate.
- Committing .env. It's gitignored. Use .env.example for new vars.
For deeper philosophy: this template is built around the assumption that AI agents will author most feature work. The conformance system is designed as an agent feedback loop. Latency-layered gates compound: 0s + <1s + 3s + 120s + 60s. The faster the inner loop, the more iterations agents can make per task.
If you're a human contributor, the same workflow applies — the gates aren't punitive, they're navigational aids.