Two separate sandbox blockers surfaced when the user tried
`pnpm work decompose --execute`:
1. **Container died on exec** — our Dockerfile had:
- WORKDIR /workspace + CMD ["bash"]
- No `agent` user (sandcastle execs as the UID:GID it built the image with)
- node:22-bookworm-slim (missing some build deps the install
script wants)
Sandcastle expects:
- A non-root `agent` user with home at /home/agent (sandcastle
does `git config --global --add safe.directory /home/agent/workspace`,
which fails if the user doesn't exist or the container exited)
- ENTRYPOINT ["sleep", "infinity"] so the container survives
the gap between sandcastle creating it and exec'ing in
Replaced .sandcastle/Dockerfile with the shape `sandcastle init`
would generate (verified against
node_modules/@ai-hero/sandcastle/dist/InitService.js):
- node:22-bookworm (full, not slim) for build tooling
- apt-get installs git + curl + jq
- corepack-pinned pnpm@9
- ARG AGENT_UID=1000 + AGENT_GID=1000; sandcastle's
build-image passes the host's UID/GID by default
- `groupmod -o -g $AGENT_GID node` + `usermod -o ... node` —
the `-o` (non-unique) flag is required because macOS hosts
have UID:501 GID:20, and GID 20 collides with Debian's
`dialout` group in the base image (without -o, groupmod
fails with "GID '20' already exists")
- USER ${AGENT_UID}:${AGENT_GID}, then install Claude Code CLI
via the official installer
- ENV PATH includes /home/agent/.local/bin
- WORKDIR /home/agent (sandcastle overrides per-run anyway)
- ENTRYPOINT ["sleep", "infinity"] keeps the container alive
2. **"Not logged in · Please run /login"** inside the container —
Claude Code on macOS stores credentials in the Keychain, NOT in
~/.claude/.credentials.json. Sandcastle's bind-mount of ~/.claude
finds nothing usable. Documented the workaround:
- README.md "Sandcastle setup (one-time)" — macOS-specific
block with the `security find-generic-password ... > ~/.claude/.credentials.json`
one-liner + chmod 600 + the security trade-off (plaintext
file vs keychain isolation)
- docs/guides/runbook.md "Using Sandcastle → Prerequisites" —
step 3 (Authentication) gets a "macOS quirk" subsection with
the same extraction one-liner + the API-key fallback as the
alternative path
- scripts/work/{dispatch,decompose}.mjs — when the sandcastle
error matches /Not logged in|Please run \/login/ AND we're on
darwin, the dispatcher prints the keychain-extraction
commands + the API-key fallback inline above the generic
"See runbook" line, so future agents discover the fix at the
failure site
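The darwin-specific hint described above could look roughly like the sketch below. It is an illustration under assumptions, not the actual contents of `scripts/work/dispatch.mjs`: the names `authHintFor` and `reportSandcastleError` and the hint wording are hypothetical, and the keychain command stays elided exactly as it is in this message.

```javascript
// Hypothetical sketch of the login-failure hint; names and wording are illustrative only.
const LOGIN_ERROR = /Not logged in|Please run \/login/;

// Returns extra hint lines for a sandcastle error, or [] when no hint applies.
// `platform` defaults to the current Node.js platform string.
function authHintFor(errorText, platform = process.platform) {
  const lines = [];
  if (LOGIN_ERROR.test(errorText) && platform === "darwin") {
    lines.push(
      "macOS keeps Claude Code credentials in the Keychain, not in ~/.claude/.credentials.json.",
      "Extract them with the `security find-generic-password ...` one-liner from the README, then chmod 600 the file.",
      "Alternatively, use the API-key fallback described in the runbook."
    );
  }
  return lines;
}

// Prints the platform-specific hint above the generic pointer and returns what was printed.
function reportSandcastleError(errorText, platform = process.platform) {
  const output = [...authHintFor(errorText, platform), "See runbook"];
  for (const line of output) console.error(line);
  return output;
}
```

Passing `platform` as a parameter keeps the check testable on non-macOS machines; a real dispatcher would presumably rely on `process.platform` directly.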
The image rebuilds clean (`pnpm exec sandcastle docker
build-image`) at ~1.95GB and the container survives sandcastle's
exec — confirmed by reaching the "Not logged in" stage (which is
the next-layer issue, not the Dockerfile issue).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs/
Documentation hub for the template-vertical monorepo. Start here if you landed at docs/ directly.
For the project entry point, see ../CLAUDE.md (humans + agents) and ../AGENTS.md (package map + boundary rules).
Resolving terminology
glossary.md — canonical vocabulary for every cross-cutting term used in this repo (feature, use case, manifest, conformance, slice, coverage band, dispatch, etc.). When in doubt about what a term means here, check the glossary first.
How the documentation is organised
docs/
├── glossary.md # canonical vocabulary
├── architecture/ # design + invariants (rarely changes)
│ ├── overview.md # high-level architecture summary
│ ├── vertical-feature-spec.md # canonical feature design spec
│ ├── dependency-flow.md # workspace + import-graph reference
│ ├── template-tiers.md # must-have vs optional core packages
│ ├── agent-first-workflow-and-conformance.md # workflow + 5-gate design
│ ├── data-flow-explainer.html # interactive (single-file)
│ ├── di-explainer.html # interactive (single-file)
│ ├── feature-conformance-explainer.html # interactive (single-file)
│ └── audit-and-compliance-explainer.html # interactive (single-file)
├── decisions/ # ADRs (durable design decisions)
│ └── adr-001..adr-020.md # 20 ADRs, numbered in order accepted
├── guides/ # day-to-day how-to docs
│ ├── runbook.md # ← first time? read this end-to-end
│ ├── conformance-quickref.md
│ ├── coverage.md # 4-layer coverage cookbook (ADR-020)
│ ├── releasing.md # release-please workflow + CHANGELOG cookbook (ADR-021)
│ ├── tdd-workflow.md
│ ├── testing-strategy.md
│ ├── adding-a-feature.md # manual path
│ ├── scaffolding-a-feature.md # generator path (preferred)
│ ├── scaffolding-core-package.md
│ ├── scaffolding-core-ui-component.md
│ ├── events-and-jobs.md # requires `gen core-package events`
│ ├── realtime.md # requires `gen core-package realtime`
│ ├── audit-and-compliance.md # requires `gen core-package audit`
│ ├── frontend-work-shape.md
│ └── infrastructure-work-shape.md
└── work/ # local task system (PRD → Epic → Story → Task)
    ├── README.md # work-folder layout + PRD lifecycle
    ├── _state.json # derived index (orchestrator-managed)
    ├── prds/<date>-<slug>.prd.md # PRDs
    └── <epic-slug>/... # one folder per epic
Document types
| Type | Where | Purpose | Lifetime |
|---|---|---|---|
| Glossary | glossary.md | One sentence per term; flagged ambiguities | Long-lived |
| Architecture | architecture/ | Design invariants, system shape | Long-lived |
| ADR | decisions/adr-NNN-*.md | Single decision: context → choice → consequences | Long-lived |
| Guide | guides/ | How-to, reference, troubleshooting | Updated as features evolve |
| PRD | work/prds/*.prd.md | Implementation seed for one epic | draft → in-review → approved → shipped |
| Epic / Story / Task | work/<epic>/... | Workflow artifacts | Created → in-progress → done |
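The PRD lifecycle in the table can be read as a simple linear state machine. The sketch below is hypothetical: the status order comes from the table, but the function names and the rule that only `approved` PRDs may be decomposed are assumptions (this page only states that the decomposer refuses `draft`).

```javascript
// Status order taken from the table; helper names are hypothetical.
const PRD_STATUSES = ["draft", "in-review", "approved", "shipped"];

// Returns the status that follows `status`, or null once the PRD has shipped.
function nextPrdStatus(status) {
  const index = PRD_STATUSES.indexOf(status);
  if (index === -1) throw new Error(`Unknown PRD status: ${status}`);
  return index === PRD_STATUSES.length - 1 ? null : PRD_STATUSES[index + 1];
}

// Assumption: only an approved PRD may be decomposed.
// The docs only confirm that `draft` is rejected.
function canDecompose(status) {
  if (!PRD_STATUSES.includes(status)) throw new Error(`Unknown PRD status: ${status}`);
  return status === "approved";
}
```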
When to put what where
- A new architectural decision (events, audit, instrumentation choices, etc.) → ADR. Number is max(existing) + 1, zero-padded to three digits.
- A new how-to (cookbook, troubleshooting, "how do I do X") → guide.
- A new vocabulary term that's cross-cutting → glossary, alphabetically grouped.
- A new initiative (multi-task feature work) → PRD seed. The decomposer refuses to run on `draft`; flip to `approved` after human review. `pnpm work prd-ship <id>` auto-flips to `shipped` on epic completion.
- A new interactive diagram → drop a single-file HTML at `architecture/<name>-explainer.html`; cross-link from the other explainers + from `architecture/overview.md`'s Interactive explainers section.
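The ADR numbering rule (one more than the highest existing number) can be sketched as follows. This is a minimal illustration, not tooling that exists in the repo: `nextAdrId` is a hypothetical helper, and it assumes file names start with `adr-NNN` as in the `decisions/adr-NNN-*.md` pattern.

```javascript
// Hypothetical helper: derive the next ADR id from the file names in decisions/.
function nextAdrId(fileNames) {
  const numbers = fileNames
    .map((name) => /^adr-(\d{3})\b/.exec(name)) // only names starting with adr-NNN
    .filter((match) => match !== null)
    .map((match) => Number.parseInt(match[1], 10));
  // With no existing ADRs the sequence starts at 001.
  const next = (numbers.length === 0 ? 0 : Math.max(...numbers)) + 1;
  return `adr-${String(next).padStart(3, "0")}`;
}
```

Files that do not match the pattern (a README, for instance) are ignored rather than treated as errors.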
See also
- ../CLAUDE.md: entry point for agents
- ../AGENTS.md: package map + boundary rules + agent-driven development overview
- ../README.md: project-level README
Conventions
- Markdown over HTML unless the doc is genuinely interactive (the 4 architecture HTMLs).
- Reference ADRs by ID (`ADR-NNN`) rather than path so renames don't break links.
- Cross-reference paths sparingly; they rot. Prefer naming the doc (e.g. "see the coverage guide").
- Don't duplicate — if two docs describe the same thing, consolidate or pick a single source of truth and link from the other.
- Date PRDs + ADRs in the file name (`YYYY-MM-DD` prefix for PRDs) and in their frontmatter.
- Each ADR cites at least one related ADR when extending or building on prior decisions.
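As an illustration of the PRD naming convention (`<date>-<slug>.prd.md` with a `YYYY-MM-DD` prefix), a file name could be built as below. The helper `prdFileName` is hypothetical, as are the slug rules (lowercase, non-alphanumerics collapsed to hyphens) and the use of the UTC date.

```javascript
// Hypothetical helper: build a PRD file name of the form YYYY-MM-DD-<slug>.prd.md.
function prdFileName(date, title) {
  // toISOString() is always UTC, so the prefix is the UTC calendar date (an assumption).
  const datePrefix = date.toISOString().slice(0, 10);
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
  if (slug === "") throw new Error("Title must contain at least one letter or digit");
  return `${datePrefix}-${slug}.prd.md`;
}
```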