Danijel Martinek 039079b64a docs(guides): runbook section on using Sandcastle for agent dispatch

Adds a "Using Sandcastle for agent dispatch" section between the gate
table and Troubleshooting. Covers when to use / not use sandcastle,
prerequisites (Docker + agent API key + .sandcastle/ config), the
dispatch flow, a worked end-to-end example (plan → execute → review →
manual state mutation), troubleshooting (env vars, Docker, timeouts,
rejection modes), and a cost-aware planning-only variant.

2026-05-13 09:09:56 +02:00

Developer Runbook

You just cloned this repo. This is the only doc you need to read end-to-end. Everything else is reference.

Prerequisites

Tool	Version	Why
Node.js	22+	Runtime for all apps + scripts
pnpm	9+	Package manager (workspace-aware)
Docker	24+	Local Postgres + sandcastle sandboxes
Git	2.40+	Version control + worktrees

Recommended editor: VS Code or Cursor with the official TypeScript, ESLint, and Prettier extensions.

First-time setup

# 1. Clone + install
git clone <repo-url> template-vertical
cd template-vertical
pnpm install

# 2. Start Postgres (background)
docker compose up -d

# 3. Copy env template and fill in secrets
cp .env.example .env
# Edit .env (see "Environment variables" section below for what each one does)

# 4. Verify the gate stack is green
pnpm typecheck
pnpm test
pnpm lint
pnpm conformance
pnpm fallow
pnpm turbo boundaries

All six should exit 0. If any fails on a fresh clone, file an issue — the main branch is supposed to stay green.

# 5. Start the dev servers
pnpm dev

This runs Next.js (3000), Payload CMS (3001), TanStack Start (3002), and Storybook (6006) in parallel. The bindAll() dispatcher in each app picks the dev-seed binders by default (mock repositories, no Payload connection needed beyond Postgres).

Daily commands

# Development
pnpm dev                          # all dev servers
pnpm dev --filter @repo/web-next  # one app

# Tests
pnpm test                         # everything
pnpm test --filter @repo/auth     # one package
pnpm test:e2e                     # Playwright e2e
pnpm test:stories                 # Storybook smoke tests
pnpm test:visual                  # visual regression (Playwright screenshots)

# Linting + type checking
pnpm typecheck                    # tsc across all packages
pnpm lint                         # ESLint across all packages
pnpm format                       # Prettier write
pnpm format:check                 # Prettier check (CI mode)

# Conformance gates
pnpm conformance                  # cross-feature event closure
pnpm fallow                       # whole-codebase: dead exports, dupes, complexity
pnpm fallow:audit                 # AI-change audit (run before commits)

# Boundary validation
pnpm turbo boundaries             # workspace dependency graph

# Work system
pnpm work status                  # tree of epics + stories
pnpm work next                    # next ready story
pnpm work ready                   # all ready stories
pnpm work blocked                 # blocked stories + what they wait on
pnpm work rebuild-state           # regenerate docs/work/_state.json
pnpm work dispatch                # print next dispatch plan
pnpm work dispatch --execute      # invoke sandcastle (requires ANTHROPIC_API_KEY or OPENAI_API_KEY)

Environment variables

Copy .env.example to .env and fill what you need. NOT every variable is required for pnpm dev — defaults are dev-friendly.

Required for `pnpm dev`

Var	Example	Why
`DATABASE_URL`	`postgresql://postgres:postgres@localhost:5433/template`	Postgres connection (docker compose default)
`PAYLOAD_SECRET`	`your-secret-here`	Payload CMS encryption key (any random 32+ char string in dev)

Optional — app URLs (defaults work in dev)

Var	Default	Why
`NEXT_PUBLIC_APP_URL`	`http://localhost:3000`	Public-facing web-next URL
`CMS_URL`	`http://localhost:3001`	Payload CMS URL
`USE_DEV_SEED`	`true` in dev	Force dev-seed binders (mock repos) instead of Payload
`NODE_ENV`	inherited	`production` flips bind dispatcher to real Payload
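
A minimal sketch of how a bind dispatcher could choose between dev-seed and Payload binders from these two variables. The function name `selectBinder` and the precedence rule (an explicit `USE_DEV_SEED` wins over `NODE_ENV`) are assumptions for illustration, not the repo's actual bindAll() logic.

```typescript
// Hypothetical sketch: not the repo's bindAll() implementation.
type BinderKind = "dev-seed" | "payload";

interface BindEnv {
  USE_DEV_SEED?: string;
  NODE_ENV?: string;
}

// Assumption: an explicit USE_DEV_SEED takes precedence over NODE_ENV.
function selectBinder(env: BindEnv): BinderKind {
  if (env.USE_DEV_SEED === "true") return "dev-seed";
  if (env.USE_DEV_SEED === "false") return "payload";
  // No explicit override: production binds real Payload, anything else uses mocks.
  return env.NODE_ENV === "production" ? "payload" : "dev-seed";
}
```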

Optional — Sentry observability (no DSN = no-op tracer/logger)

Var	Why
`WEB_NEXT_SENTRY_DSN`	Server-side OTel + Sentry for web-next
`NEXT_PUBLIC_WEB_NEXT_SENTRY_DSN`	Browser Sentry for web-next
`CMS_SENTRY_DSN`	Server-side for Payload CMS
`WEB_TANSTACK_SENTRY_DSN`	Server-side for TanStack Start
`VITE_WEB_TANSTACK_SENTRY_DSN`	Browser-side for TanStack Start
`SENTRY_AUTH_TOKEN`, `SENTRY_ORG`, `SENTRY_PROJECT_*`	Source-map upload at build time
`SENTRY_TRACES_SAMPLE_RATE`	OTel trace sample rate (`0.1` recommended in dev)
`SENTRY_ENVIRONMENT`	`development` / `staging` / `production`

Optional — Git commit SHA for releases

Var	Why
`VERCEL_GIT_COMMIT_SHA` / `VITE_GIT_COMMIT_SHA` / `NEXT_PUBLIC_VERCEL_GIT_COMMIT_SHA`	Surfaces commit SHA in Sentry releases + UI footers

Optional — core-audit (only when `gen core-package audit` is scaffolded)

Var	Why
`AUDIT_PSEUDONYM_SALT`	Salt for the audit log's GDPR-erasure pseudonymisation (production only — must be a stable secret)

Optional — sandcastle dispatch (only when running `pnpm work dispatch --execute`)

Var	Why
`ANTHROPIC_API_KEY`	Claude API key (sandcastle's default agent)
`OPENAI_API_KEY`	OpenAI/Codex alternative
`GITHUB_TOKEN`	GitHub access for PR creation by the orchestrator
`SANDCASTLE_PROVIDER`	`docker` (default) / `podman` / `vercel`

The agent-first workflow

This template enforces a manifest-first, generator-driven, gate-protected workflow.

When adding a new feature

pnpm turbo gen feature <name>

This emits:

packages/<name>/src/feature.manifest.ts — the conformance manifest (use cases, audits, publishes, consumes)
packages/<name>/src/di/bind-production.ts with assertFeatureConformance(...) at the tail (refuses to boot on drift)
Mock repository, factory, seed, entity, use-case, controller, tests — full Lazar-conformant shape
packages/<name>/src/index.ts exports

After scaffolding, the four-step ordering for any new use case:

Manifest entry — declare the use case in feature.manifest.ts
Contracts — export xInputSchema, xOutputSchema, IXUseCase (factory body throws not implemented)
Tests (red) — write the failing test
Implementation (green) — fill the factory body

The five conformance gates catch drift at every step. See docs/guides/conformance-quickref.md for the manifest field reference.
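
To make steps 2 and 4 of the ordering concrete, here is a sketch of what the contracts stage might look like for a made-up "invite user" use case. Every name below is hypothetical, `Schema<T>` is a hand-rolled stand-in for whatever validation library the repo uses, and the use case is kept synchronous for brevity.

```typescript
// Hypothetical contracts for an "invite user" use case. All names are
// illustrative; Schema<T> stands in for a real validation library.
interface Schema<T> {
  parse(value: unknown): T;
}

interface InviteUserInput {
  email: string;
}

interface InviteUserOutput {
  invitationId: string;
}

// Contracts step: the schemas and the use-case type exist before any logic.
const inviteUserInputSchema: Schema<InviteUserInput> = {
  parse(value: unknown): InviteUserInput {
    const v = value as { email?: unknown } | null | undefined;
    if (!v || typeof v.email !== "string") {
      throw new Error("invalid invite-user input");
    }
    return { email: v.email };
  },
};

const inviteUserOutputSchema: Schema<InviteUserOutput> = {
  parse(value: unknown): InviteUserOutput {
    const v = value as { invitationId?: unknown } | null | undefined;
    if (!v || typeof v.invitationId !== "string") {
      throw new Error("invalid invite-user output");
    }
    return { invitationId: v.invitationId };
  },
};

type IInviteUserUseCase = (input: InviteUserInput) => InviteUserOutput;

// The factory body throws until the implementation step fills it in,
// so the red test written in between has something to fail against.
function createInviteUserUseCase(): IInviteUserUseCase {
  return function inviteUser(_input: InviteUserInput): InviteUserOutput {
    throw new Error("not implemented");
  };
}
```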

When adding cross-feature primitives

pnpm turbo gen event             # event contract or handler (needs gen core-package events)
pnpm turbo gen job               # background job
pnpm turbo gen realtime          # realtime channel or handler (needs gen core-package realtime)
pnpm turbo gen core-package <x>  # optional core package (events/realtime/trpc/ui/audit)
pnpm turbo gen core-ui-component <x>  # atomic-design component (needs gen core-package ui)

Always prefer generators over hand-rolling. The generators emit the canonical shape; hand-rolled code drifts from generator output and breaks the CI scaffold-drift check.

Tracking work

The repo uses docs/work/ for epic/story/task tracking:

docs/work/
├── README.md
├── _state.json              # derived, regenerated by pre-commit hook
├── prds/                    # PRDs go here
├── _templates/              # markdown templates
└── <epic-slug>/
    ├── _epic.md
    └── <story-slug>/
        └── _story.md        # contains the Tasks checklist

Use pnpm work next to see what's ready. Use pnpm work dispatch to plan the next sandcastle dispatch.
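
As a rough illustration of what "ready" and "blocked" mean in a dependency graph of stories, the sketch below derives both lists from a flat array. The `Story` shape (including the `dependsOn` field) and the story ids other than 02-sign-up are assumptions; the real _state.json schema is not shown in this runbook.

```typescript
// Hypothetical story shape; the real _state.json schema may differ.
interface Story {
  id: string;
  done: boolean;
  dependsOn: string[];
}

// Dependencies that are unfinished (or unknown) keep a story blocked.
function unmetDependencies(story: Story, all: Story[]): string[] {
  return story.dependsOn.filter(depId => {
    const dep = all.filter(s => s.id === depId)[0];
    return !dep || !dep.done;
  });
}

// Ready: not done, and nothing left to wait on.
function readyStories(all: Story[]): Story[] {
  return all.filter(s => !s.done && unmetDependencies(s, all).length === 0);
}

// Blocked: not done, paired with the ids it is still waiting on.
function blockedStories(all: Story[]): { story: Story; waitingOn: string[] }[] {
  return all
    .filter(s => !s.done)
    .map(s => ({ story: s, waitingOn: unmetDependencies(s, all) }))
    .filter(entry => entry.waitingOn.length > 0);
}

const stories: Story[] = [
  { id: "01-sign-in", done: true, dependsOn: [] },
  { id: "02-sign-up", done: false, dependsOn: ["01-sign-in"] },
  { id: "03-reset-password", done: false, dependsOn: ["02-sign-up"] },
];

const ready = readyStories(stories);     // only 02-sign-up
const blocked = blockedStories(stories); // 03-reset-password, waiting on 02-sign-up
```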

The five conformance gates

Gate	Latency	What it catches	Runs when
TypeScript brands	0s	forgotten `withSpan` / `withCapture` / `withAudit`; manifest ↔ binding-slot type mismatch	on save (IDE)
ESLint (8 conformance/* rules)	<1s	manifest ↔ code drift; missing sibling test; missing manifest; atomic-tier import direction	on save / `pnpm lint`
Boot assertion	~3s	runtime binding without required brand; manifest edited without rebinding	`pnpm dev` startup
`pnpm conformance`	~120s	orphan event consumers across features	CI
`pnpm fallow`	~30–60s	dead exports / unused files; duplicate code; circular deps; complexity hotspots; AI-change audit	CI

For the full design see docs/architecture/agent-first-workflow-and-conformance.md. For the daily reference see docs/guides/conformance-quickref.md.
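
The orphan-consumer check in the table can be pictured as a simple set comparison across manifests. This is a sketch only: the `publishes` and `consumes` fields come from the manifest description earlier in this runbook, while the `FeatureManifest` shape, the function name, and the event names are assumptions.

```typescript
// Illustrative manifest shape; only publishes/consumes are named in the runbook.
interface FeatureManifest {
  name: string;
  publishes: string[];
  consumes: string[];
}

interface OrphanConsumer {
  feature: string;
  event: string;
}

// An event is orphaned when some feature consumes it but no feature publishes it.
function findOrphanConsumers(manifests: FeatureManifest[]): OrphanConsumer[] {
  let published: string[] = [];
  for (const m of manifests) {
    published = published.concat(m.publishes);
  }
  const orphans: OrphanConsumer[] = [];
  for (const m of manifests) {
    for (const event of m.consumes) {
      if (published.indexOf(event) === -1) {
        orphans.push({ feature: m.name, event: event });
      }
    }
  }
  return orphans;
}

const orphans = findOrphanConsumers([
  { name: "auth", publishes: ["user.signed-up"], consumes: [] },
  { name: "mailer", publishes: [], consumes: ["user.signed-up", "user.deleted"] },
]);
// orphans holds one entry: mailer consumes "user.deleted", which nobody publishes.
```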

Using Sandcastle for agent dispatch

Sandcastle is the substrate that takes a markdown task description, hands it to a Claude / Codex agent running inside an isolated Docker sandbox, captures the agent's commits, and returns them so the orchestrator can route the diff to a reviewer agent. The repo's pnpm work dispatch wraps sandcastle for the manifest-first workflow.

When to use Sandcastle

Routine, well-specified tasks — adding a behaviour slice to an existing use case, migrating a feature to a new convention, scaffolding new packages. The task description is the contract; sandcastle automates the rest.
Parallel work — dispatch multiple independent tasks at once; each runs in its own sandbox branch.
Reviewer-loop verification — the reviewer agent reads the diff against the task spec and either approves or sends feedback for another implementer pass.

When NOT to use Sandcastle

Exploratory / design work — when the right answer isn't known, write it yourself. Sandcastle thrives when the task is "implement this", not "figure out what to do".
Cross-cutting refactors — dispatch is per-task; a change that touches many unrelated files at once is better done in one human-driven session.
First-time integrations (e.g., adopting a new SDK) — better to walk through it manually, then capture the pattern as a generator for future sandcastle dispatches.

Prerequisites

Docker running — sandcastle uses Docker for the sandbox by default. docker info should succeed.
Agent API key — set ONE of:
- ANTHROPIC_API_KEY (recommended; sandcastle's default agent is claudeCode)
- OPENAI_API_KEY (alternative)
GitHub token (optional) — GITHUB_TOKEN if you want the orchestrator to create PRs.
.sandcastle/ config present — already in tree:
- Dockerfile — node:22 + pnpm sandbox image
- prd-eliciter.prompt.md — interviews humans to draft PRDs
- adr-eliciter.prompt.md — same shape, for infrastructure decisions
- decomposer.prompt.md — PRD → epic + stories with generator-first task lists
- implementer.prompt.md — executes one task; runs all 5 gates before committing
- reviewer.prompt.md — reviews implementer's diff against AC + scope

The dispatch flow

pnpm work next           → identifies the next ready story (DAG-aware)
pnpm work dispatch       → prints what WOULD be dispatched (no Sandcastle call)
pnpm work dispatch --execute
                         → invokes sandcastle.run(implementer prompt + task spec)
                         → sandcastle returns { branch, commits, stdout, ... }
                         → orchestrator computes `git diff main..<branch>`
                         → invokes sandcastle.run(reviewer prompt + diff)
                         → reviewer returns approve / reject + notes
                         → orchestrator prints suggested state mutation
                              (in v1: human ticks the bullet + commits manually)
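
The ordering of the flow above can be sketched as one function with its three external steps injected. This is not the repo's dispatch.mjs: the `DispatchDeps` interface and `dispatchOnce` are hypothetical, the stubs replace real sandcastle and git calls, and everything is synchronous for brevity even though real agent runs would presumably be asynchronous.

```typescript
// Hypothetical sketch of the orchestration order; not the repo's dispatch.mjs.
interface RunResult {
  branch: string;
  stdout: string;
}

interface ReviewResult {
  decision: "approve" | "reject";
  notes: string;
}

interface DispatchDeps {
  runImplementer(spec: string): RunResult;              // stands in for sandcastle.run(implementer prompt + task spec)
  diffAgainstMain(branch: string): string;              // stands in for `git diff main..<branch>`
  runReviewer(spec: string, diff: string): ReviewResult; // stands in for sandcastle.run(reviewer prompt + diff)
}

function dispatchOnce(spec: string, deps: DispatchDeps): string {
  const impl = deps.runImplementer(spec);
  const diff = deps.diffAgainstMain(impl.branch);
  const review = deps.runReviewer(spec, diff);
  if (review.decision === "approve") {
    // v1 only prints a suggestion; the human performs the state mutation.
    return "Suggested state mutation: merge " + impl.branch + ", tick the bullet, commit.";
  }
  return "Rejected: " + review.notes;
}

// Stubbed run that records the call order.
const calls: string[] = [];
const outcome = dispatchOnce("Hash password using injected IPasswordHasher before persisting", {
  runImplementer() {
    calls.push("implementer");
    return { branch: "task/02-sign-up-hash-password", stdout: "" };
  },
  diffAgainstMain(branch) {
    calls.push("diff:" + branch);
    return "diff --git a/... b/...";
  },
  runReviewer() {
    calls.push("reviewer");
    return { decision: "approve", notes: "ok" };
  },
});
```

Injecting the three steps keeps the sequencing testable without Docker or an API key.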

Worked example — dispatch a real task

Suppose pnpm work next reports:

auth-v1 / 02-sign-up — Sign up with email and password
  status: in-progress, tasks: 3/7

The story file docs/work/auth-v1/02-sign-up/_story.md has a Tasks list with the next unchecked bullet:

- [ ] Hash password using injected IPasswordHasher before persisting

Step 1 — Plan

pnpm work dispatch

Output:

=== Dispatch plan ===
  Epic:     auth-v1
  Story:    02-sign-up — Sign up with email and password
  Bullet:   - [ ] Hash password using injected IPasswordHasher before persisting
  Prompt:   .sandcastle/implementer.prompt.md

To execute this dispatch, run:
  ANTHROPIC_API_KEY=... pnpm work dispatch --execute

This is safe to run anywhere — it never invokes Sandcastle.

Step 2 — Execute

ANTHROPIC_API_KEY=sk-ant-... pnpm work dispatch --execute

The orchestrator:

Builds the task spec (story metadata + the current bullet + full story context)
Calls sandcastle.run({ promptFile: ".sandcastle/implementer.prompt.md", promptArgs: { TASK_FILE_CONTENT: spec }, ... })
Sandcastle pulls the Docker image, mounts the repo into /workspace, runs claudeCode with the implementer prompt template populated
The implementer agent (inside the sandbox):
- Reads the task spec
- Runs pnpm install --frozen-lockfile
- Locates the use case: packages/auth/src/application/use-cases/sign-up.use-case.ts
- Writes a red test asserting hasher.hash is called before repo.create
- Runs pnpm test --filter @repo/auth — sees the red test fail
- Adds IPasswordHasher to the factory deps; calls hasher.hash(input.password) before repo.create
- Runs pnpm test --filter @repo/auth — green
- Runs pnpm typecheck, pnpm lint, pnpm conformance, pnpm fallow:audit — all five gates green
- Commits on a sandbox branch (task/02-sign-up-hash-password or similar)
Sandcastle returns: { branch: "task/02-sign-up-hash-password", commits: [{sha: "..."}], stdout: "...", ... }
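
For orientation, the change the implementer makes in this example might look roughly like the sketch below. Only `IPasswordHasher`, `hasher.hash`, and `repo.create` are named in the task; the interface shapes, the factory signature, and the stubbed check are assumptions, and the code is synchronous for brevity.

```typescript
// Hypothetical shapes; the real interfaces live in packages/auth.
interface IPasswordHasher {
  hash(plain: string): string;
}

interface IUserRepository {
  create(user: { email: string; passwordHash: string }): { id: string };
}

interface SignUpInput {
  email: string;
  password: string;
}

function createSignUpUseCase(deps: { hasher: IPasswordHasher; repo: IUserRepository }) {
  return function signUp(input: SignUpInput): { id: string } {
    // The plain-text password is hashed first; only the hash reaches the repository.
    const passwordHash = deps.hasher.hash(input.password);
    return deps.repo.create({ email: input.email, passwordHash: passwordHash });
  };
}

// The kind of ordering check the red test would make, using recording stubs.
const callLog: string[] = [];
const signUp = createSignUpUseCase({
  hasher: {
    hash(plain) {
      callLog.push("hasher.hash");
      return "hashed:" + plain;
    },
  },
  repo: {
    create(user) {
      callLog.push("repo.create:" + user.passwordHash);
      return { id: "user-1" };
    },
  },
});
const created = signUp({ email: "a@example.com", password: "s3cret" });
```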

Step 3 — Review

The orchestrator immediately runs the reviewer:

Computes git diff main..task/02-sign-up-hash-password
Calls sandcastle.run({ promptFile: ".sandcastle/reviewer.prompt.md", promptArgs: { TASK_FILE_CONTENT: spec, DIFF: diff }, ... })
The reviewer agent reads the diff + task + story; verifies:
- The AC bullet is satisfied (test was added; impl calls hasher.hash)
- Nothing in the "Out of scope" section was touched (no drive-by edits)
- All gates were run
- The implementer ran pnpm fallow:audit
- Generator-first was respected (no hand-rolled scaffolding)
Returns { decision: "approve", ac_verified: [4], scope_violations: [], notes: "..." }
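
One way to read the reviewer's result is as a small decision table. The field names below follow the reviewer outputs quoted in this runbook (including `generator_skipped` and `scope_violations` from the troubleshooting notes); the `ReviewResult` type, the `NextAction` labels, and the priority order are assumptions.

```typescript
// Field names follow the runbook; the exact shape is an assumption.
interface ReviewResult {
  decision: "approve" | "reject";
  ac_verified: number[];
  scope_violations: string[];
  generator_skipped?: boolean;
  notes: string;
}

type NextAction =
  | "suggest-state-mutation"
  | "run-generator-first"
  | "redispatch-with-stricter-scope"
  | "redispatch-with-notes";

function nextAction(review: ReviewResult): NextAction {
  if (review.decision === "approve") return "suggest-state-mutation";
  // Hand-rolled scaffolding: run the generator manually (or re-dispatch).
  if (review.generator_skipped === true) return "run-generator-first";
  // Files outside the acceptance criteria were touched.
  if (review.scope_violations.length > 0) return "redispatch-with-stricter-scope";
  return "redispatch-with-notes";
}

const approved = nextAction({
  decision: "approve",
  ac_verified: [4],
  scope_violations: [],
  notes: "AC bullet satisfied",
});
```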

Step 4 — State mutation (v1: manual)

The orchestrator prints:

=== Suggested state mutation ===
  Edit docs/work/auth-v1/02-sign-up/_story.md — tick the bullet:
    - [x] Hash password using injected IPasswordHasher before persisting
  Then: pnpm work rebuild-state && git add -A && git commit -m "..."

(Automatic state mutation by the orchestrator is v2.)

You (the human) then:

Merge the sandbox branch: git merge --no-ff task/02-sign-up-hash-password
Tick the bullet in the story markdown
The pre-commit hook auto-runs pnpm work rebuild-state + re-stages _state.json
Push. CI runs the full gate stack (typecheck + test + lint + conformance + fallow + boundaries + visual regression).
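
If you want to script the tick step rather than edit the file by hand, it reduces to a pure string transformation. The helper below is a sketch and not part of the orchestrator; `tickBullet` and the first bullet in the sample are hypothetical.

```typescript
// Hypothetical helper: flips the first matching "- [ ]" bullet to "- [x]".
function tickBullet(markdown: string, bulletText: string): string {
  const open = "- [ ] " + bulletText;
  const done = "- [x] " + bulletText;
  let ticked = false;
  const lines = markdown.split("\n").map(line => {
    if (ticked || line.trim() !== open) return line;
    ticked = true;
    // Preserve any leading indentation of the original bullet.
    const indent = line.slice(0, line.indexOf(open));
    return indent + done;
  });
  if (!ticked) throw new Error("unchecked bullet not found: " + bulletText);
  return lines.join("\n");
}

const storyMarkdown = [
  "## Tasks",
  "- [x] Validate email format",
  "- [ ] Hash password using injected IPasswordHasher before persisting",
].join("\n");

const updatedMarkdown = tickBullet(
  storyMarkdown,
  "Hash password using injected IPasswordHasher before persisting"
);
```

Throwing when the bullet is missing avoids silently committing a story file that was never changed.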

Troubleshooting Sandcastle

✗ --execute requires ANTHROPIC_API_KEY or OPENAI_API_KEY in env. — Set one. The default agent is Claude.

Error: Cannot find module '@ai-hero/sandcastle' — Run pnpm install. Sandcastle is a dev dependency at the workspace root.

Error: docker: command not found or sandcastle hangs at "starting sandbox" — Docker isn't installed or isn't running. Run docker info to confirm. On macOS, start Docker Desktop.

The implementer agent times out — Default idleTimeoutSeconds is 600 (10 minutes). For complex tasks, increase via dispatch.mjs (look for the run({...}) call and add idleTimeoutSeconds: 1800).

The reviewer rejects with generator_skipped: true — The implementer hand-rolled what should have been generator output. Either re-dispatch (it gets the reviewer notes), or delete the implementer's diff and run pnpm turbo gen <kind> manually first, then dispatch the customisation as a separate task.

The reviewer rejects with scope_violations: [...] — The implementer touched files outside the AC. Re-dispatch with stricter scope; the rejection notes are passed back as context.

Cost control — each dispatch typically uses 50K–200K agent tokens depending on task complexity. The orchestrator does NOT cap retries in v1. Setting max-attempts: 1 in the task's frontmatter only takes effect in v2; for now, limit spend by not re-running dispatch after a reject.
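
A back-of-the-envelope budget follows directly from the 50K–200K range. The sketch below assumes, purely for illustration, that every retry costs about as much as a first pass; the function and constant names are hypothetical.

```typescript
interface TokenRange {
  min: number;
  max: number;
}

// Per-dispatch range quoted in this runbook.
const TOKENS_PER_DISPATCH: TokenRange = { min: 50000, max: 200000 };

// Assumption: each attempt (first pass or retry) costs one full dispatch.
function estimateTokens(tasks: number, attemptsPerTask: number): TokenRange {
  const dispatches = tasks * attemptsPerTask;
  return {
    min: dispatches * TOKENS_PER_DISPATCH.min,
    max: dispatches * TOKENS_PER_DISPATCH.max,
  };
}

// A story at 3/7 has 4 bullets left.
const singlePass = estimateTokens(4, 1);    // 200,000 to 800,000 tokens
const withRetries = estimateTokens(4, 3);   // 600,000 to 2,400,000 tokens
```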

Cost-aware variant: planning-only loop

If you want sandcastle's structure without the agent spend, use planning mode + manual execution:

pnpm work dispatch              # prints the plan
# (you implement the bullet manually in your editor)
# tick the bullet in docs/work/.../...story.md
# commit; pre-commit auto-rebuilds _state.json
pnpm work dispatch              # prints the NEXT plan

This gives you the same DAG-aware "what's next?" without invoking any agent. Useful for exploratory work or low-budget contexts.

Troubleshooting

pnpm dev refuses to boot with ConformanceError — A feature's binding lost a required brand. The error message tells you which use case + which brand. Re-bind through withSpan / withCapture / withAudit as needed.

pnpm lint errors with conformance/feature-must-have-manifest — You created a feature with use cases but no feature.manifest.ts. Run pnpm turbo gen feature <name> to scaffold the canonical shape, or hand-write the manifest at packages/<feature>/src/feature.manifest.ts.

pnpm conformance says "orphan consumer" — A feature declares consumes: ["X"] but no feature publishes X. Either add the publish to the producing feature's manifest + factory, or remove the consumer.

pnpm fallow reports new dead exports or dupes — Your change added unused exports or duplicated logic. Either remove the dead code or accept with pnpm fallow:audit --gate all (the audit considers the baseline; only NEW findings fail).

Pre-commit hook refuses to commit with "state-sync-guard" — You staged docs/work/_state.json but it's not byte-identical to pnpm work rebuild-state output. Run pnpm work rebuild-state && git add docs/work/_state.json and try again.

Tests fail in @repo/turbo-generators with Vitest worker timeouts — Known flaky on slow machines. Re-run; if persistent, increase the turbo-generators package's vitest testTimeout.

pnpm work dispatch --execute errors with "ANTHROPIC_API_KEY required" — You're trying to run the sandcastle orchestrator without an agent API key. Set the env var, or run pnpm work dispatch (no flag) to just print the plan.
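
For the ConformanceError entry at the top of this section, it may help to picture the boot assertion as a comparison between what the manifest requires and what the bindings carry. The sketch below is a runtime approximation under assumed data shapes; it is not the repo's assertFeatureConformance, and the use-case names are made up.

```typescript
// Brand names come from the runbook; the data shapes are assumptions.
type Brand = "withSpan" | "withCapture" | "withAudit";

interface ManifestUseCase {
  name: string;
  requiredBrands: Brand[];
}

interface Binding {
  useCase: string;
  brands: Brand[];
}

function assertConformance(manifest: ManifestUseCase[], bindings: Binding[]): void {
  const problems: string[] = [];
  for (const useCase of manifest) {
    const binding = bindings.filter(b => b.useCase === useCase.name)[0];
    if (!binding) {
      problems.push(useCase.name + ": no binding");
      continue;
    }
    for (const brand of useCase.requiredBrands) {
      if (binding.brands.indexOf(brand) === -1) {
        problems.push(useCase.name + ": missing " + brand);
      }
    }
  }
  // Refuse to boot, naming each use case and the brand it lost.
  if (problems.length > 0) {
    throw new Error("ConformanceError: " + problems.join("; "));
  }
}

let bootError = "";
try {
  assertConformance(
    [{ name: "signUp", requiredBrands: ["withSpan", "withAudit"] }],
    [{ useCase: "signUp", brands: ["withSpan"] }]
  );
} catch (e) {
  bootError = (e as Error).message; // "ConformanceError: signUp: missing withAudit"
}
```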

Where to read next

Once you've got pnpm dev running:

AGENTS.md — package map, boundary rules, per-package conventions
CLAUDE.md — full convention reference (manifest-first ordering, factory patterns, instrumentation rules)
docs/guides/conformance-quickref.md — daily manifest + gates reference
docs/guides/tdd-workflow.md — red-green-refactor with the gate stack
docs/guides/scaffolding-a-feature.md — pnpm turbo gen feature reference
docs/guides/adding-a-feature.md — end-to-end walkthrough
docs/architecture/agent-first-workflow-and-conformance.md — the full design
docs/architecture/feature-conformance-explainer.html — interactive explainer (open in browser)

For deeper topics:

docs/guides/events-and-jobs.md — cross-feature events (requires gen core-package events)
docs/guides/realtime.md — Socket.IO channels (requires gen core-package realtime)
docs/guides/audit-and-compliance.md — DPA-compliant audit logging (requires gen core-package audit)
docs/guides/frontend-work-shape.md — atomic design + Storybook conventions
docs/guides/infrastructure-work-shape.md — ADR-first flow for new infrastructure

Common pitfalls

Skipping the generator. Always run pnpm turbo gen <kind> before hand-rolling. Generators emit the canonical shape; the CI scaffold-drift check will fail on hand-rolled features.
Forgetting pnpm work rebuild-state after editing docs/work/ markdown. The pre-commit hook handles this automatically when you stage markdown, so it only matters when the hook does not run (for example, a commit made with --no-verify).
Bypassing the pre-commit hook with --no-verify on commits. The pre-commit hook catches drift early. If it's blocking a legitimate change, fix the underlying issue, not the hook.
Hand-editing _state.json. Don't. The state-sync-guard refuses commits that drift from rebuild output. Edit the markdown; let the rebuild script propagate.
Committing .env. It's gitignored. Use .env.example for new vars.

For deeper philosophy: this template is built around the assumption that AI agents will author most feature work. The conformance system is designed as an agent feedback loop. Latency-layered gates compound: 0s + <1s + 3s + 120s + 60s. The faster the inner loop, the more iterations agents can make per task.

If you're a human contributor, the same workflow applies — the gates aren't punitive, they're navigational aids.
