Commit Graph

19 Commits

Author SHA1 Message Date
bae4b66fa4 refactor(work): drop date prefixes + move _state.json into _system/
Convention shift: epic folders + PRD filenames + frontmatter id
fields are now bare slugs. The created: timestamp (Phase 2) carries
the date; folder names don't repeat it. A future <task-id>-<slug>
shape (e.g. ClickUp) lands cleanly when that integration ships.

Renames (git mv preserves history):
- docs/work/2026-05-13-binder-wrap-helper/
    -> docs/work/binder-wrap-helper/
- docs/work/2026-05-14-library-evaluation-policy/
    -> docs/work/library-evaluation-policy/
- docs/work/2026-05-14-ci-security-and-supply-chain/
    -> docs/work/ci-security-and-supply-chain/
- docs/work/prds/2026-05-13-binder-wrap-helper.prd.md
    -> docs/work/prds/binder-wrap-helper.prd.md
- docs/work/prds/2026-05-13-coverage-architecture.prd.md
    -> docs/work/prds/coverage-architecture.prd.md
- docs/work/prds/2026-05-14-library-evaluation-policy.prd.md
    -> docs/work/prds/library-evaluation-policy.prd.md
- docs/work/prds/2026-05-14-ci-security-and-supply-chain.prd.md
    -> docs/work/prds/ci-security-and-supply-chain.prd.md

Frontmatter updates inside the renamed files: epic id, epic prd,
story epic, PRD id, PRD builds-on all drop date prefixes.

System folder + state file move:
- New docs/work/_system/ holds framework-managed state.
- docs/work/_state.json -> docs/work/_system/_state.json.
- state-builder.mjs adds _system to SKIP_FOLDERS.
- cli.mjs + state-sync-guard.mjs + .husky/pre-commit point at the
  new path.

template-reset-v1 epic deleted entirely (one-off cleanup epic from
the pre-date-convention era; status was already done).

Generator-template updates (so new artifacts ship in the right
shape):
- .sandcastle/decomposer.prompt.md emits bare-slug folder names +
  ISO created: timestamp.
- .claude/skills/to-prd/SKILL.md template uses bare-slug filename +
  bare-slug id field + ISO created: timestamp.

Doc reference updates: glossary, runbook, agent-first-workflow-
and-conformance, reviewer prompt, ADR-020, ADR-022, ADR-023 all
point at the new paths/slugs.
2026-05-14 21:16:51 +02:00
83b6119ca6 docs(sandcastle): add CI security checks section to reviewer prompt
Instructs the reviewer agent to inspect Socket critical findings and
CodeQL error-severity findings via gh run view before issuing a verdict.
Composes with the existing library-trace check — all three must pass for
approval.
2026-05-14 18:03:36 +00:00
26bcbb7a91 feat(scripts): add --staged-against flag to library-decisions check
Adds `--staged-against <base>` CLI flag to `check.mjs` so the reviewer
agent can compare `git diff <base>...HEAD` instead of the git index.
This gives the sandcastle reviewer a CI-compatible code path that works
in its clean sandbox where `git diff --cached` may be empty.

Appends a "Library-trace check" section to `.sandcastle/reviewer.prompt.md`
instructing the reviewer to run the command before issuing a verdict.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 05:57:10 +00:00
eadbb7ebd9 fix(work): emit completion signal to stop sandcastle agent loops
Sandcastle re-invokes agents up to maxIterations even when the work is
already done — the decomposer was looping 4x re-writing the same epic
on every dispatch. Two halves to the fix:

- Pass completionSignal: "<promise>COMPLETE</promise>" explicitly on
  all three run() calls (decompose, implementer, reviewer). Makes the
  contract visible alongside maxIterations instead of relying on
  sandcastle's default.
- Append a "Signal completion (required)" section to each prompt
  telling the agent to emit the literal marker as its final line when
  the work is genuinely done, plus a "do NOT emit if..." list to
  discourage premature signaling.
2026-05-13 19:11:44 +02:00
fd8265cd29 docs(sandcastle): decomposer + reviewer enforce vertical-slice tasks
The user surfaced that the binder-wrap-helper epic's stories
decomposed into horizontal sub-steps (read 3 files → write helper
→ write test → export → typecheck → coverage), not vertical
slices. Per the glossary's slice = task = PR = commit rule, every
checkbox should land as one green commit.

.sandcastle/decomposer.prompt.md:
  - New "The slice rule (non-negotiable)" section near the top
    defining the three constraints every task must satisfy: one
    green commit; exercises a layer; independently meaningful.
  - New "Tasks that are FORBIDDEN" list naming the anti-patterns
    the previous output exhibited (read a file as a task; write
    test without impl; standalone gate runs; standalone export;
    sub-step decomposition of a single slice).
  - New "Tasks that are CORRECT" list with examples drawn from
    this codebase (gen invocation, full use-case slice, per-feature
    binder migration, audit emission, bindAll wiring).
  - New paragraph on "Manifest-first ordering INSIDE a task" —
    the 4-step ordering (manifest → contracts → red test → green
    impl) is what the implementer does within one task, not a
    multi-checkbox decomposition.
  - Constraints section gains two new bullets:
      * Prefer FEWER but FATTER tasks (one per vertical slice)
        over MANY thinner sub-steps
      * Self-check: imagine the commit each checkbox produces;
        do all gates pass on that commit alone?

.sandcastle/reviewer.prompt.md:
  - New check #8 "Slice discipline" rejecting:
      * Multi-commit diffs where any intermediate commit has red
        gates
      * Sub-step shape that should have been separate tasks
      * Incomplete slices (use case w/o DI binding, manifest
        publish w/o publish site, controller w/o router wiring)

.gitignore: adds `.pnpm-store/` so a misconfigured pnpm install
that places the store inside the project doesn't stage thousands
of cache files.

The existing binder-wrap-helper stories were decomposed under the
old (unconstrained) prompt and need re-decomposing under the new
rule. That's a separate action — this commit fixes the prompts;
the existing epic stays as-is until you re-decompose.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 18:33:34 +02:00
7737358509 fix(sandcastle): make Dockerfile match sandcastle's expected shape + document macOS keychain quirk
Two separate sandbox blockers surfaced when the user tried
`pnpm work decompose --execute`:

1. **Container died on exec** — our Dockerfile had:
     - WORKDIR /workspace + CMD ["bash"]
     - No `agent` user (sandcastle exec's as UID:GID it built with)
     - node:22-bookworm-slim (missing some build deps the install
       script wants)
   Sandcastle expects:
     - A non-root `agent` user with home at /home/agent (sandcastle
       does `git config --global --add safe.directory /home/agent/workspace`,
       which fails if the user doesn't exist or the container exited)
     - ENTRYPOINT ["sleep", "infinity"] so the container survives
       the gap between sandcastle creating it and exec'ing in
   Replaced .sandcastle/Dockerfile with the shape `sandcastle init`
   would generate (verified against
   node_modules/@ai-hero/sandcastle/dist/InitService.js):
     - node:22-bookworm (full, not slim) for build tooling
     - apt-get installs git + curl + jq
     - corepack-pinned pnpm@9
     - ARG AGENT_UID=1000 + AGENT_GID=1000; sandcastle's
       build-image passes the host's UID/GID by default
     - `groupmod -o -g $AGENT_GID node` + `usermod -o ... node` —
       the `-o` (non-unique) flag is required because macOS hosts
       have UID:501 GID:20, and GID 20 collides with Debian's
       `dialout` group in the base image (without -o, groupmod
       fails with "GID '20' already exists")
     - USER ${AGENT_UID}:${AGENT_GID}, then install Claude Code CLI
       via the official installer
     - ENV PATH includes /home/agent/.local/bin
     - WORKDIR /home/agent (sandcastle overrides per-run anyway)
     - ENTRYPOINT ["sleep", "infinity"] keeps the container alive

2. **"Not logged in · Please run /login"** inside the container —
   Claude Code on macOS stores credentials in the Keychain, NOT in
   ~/.claude/.credentials.json. Sandcastle's bind-mount of ~/.claude
   finds nothing usable. Documented the workaround:
     - README.md "Sandcastle setup (one-time)" — macOS-specific
       block with the `security find-generic-password ... > ~/.claude/.credentials.json`
       one-liner + chmod 600 + the security trade-off (plaintext
       file vs keychain isolation)
     - docs/guides/runbook.md "Using Sandcastle → Prerequisites" —
       step 3 (Authentication) gets a "macOS quirk" subsection with
       the same extraction one-liner + the API-key fallback as the
       alternative path
     - scripts/work/{dispatch,decompose}.mjs — when the sandcastle
       error matches /Not logged in|Please run \/login/ AND we're on
       darwin, the dispatcher prints the keychain-extraction
       commands + the API-key fallback inline above the generic
       "See runbook" line, so future agents discover the fix at the
       failure site

The image rebuilds clean (`pnpm exec sandcastle docker
build-image`) at ~1.95GB and the container survives sandcastle's
exec — confirmed by reaching the "Not logged in" stage (which is
the next-layer issue, not the Dockerfile issue).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 18:02:34 +02:00
cd0a332443 docs: surface sandcastle image-build step (one-time setup)
Closes the gap the user hit running `pnpm work decompose --execute`:
sandcastle errored with `Image 'sandcastle:template-vertical' not
found locally. Build it first with 'sandcastle docker build-image'`,
but neither the README nor the runbook documented this step.

README.md: new "Sandcastle setup (one-time)" section after Quick
reference. Three commands (docker info, build-image, auth) — the
minimum needed to make dispatch work. Links to the runbook for the
full lifecycle.

docs/guides/runbook.md: Prerequisites in "Using Sandcastle" grow
from 4 to 5 items. New step 2 walks through `sandcastle docker
build-image`, quotes the exact "Image not found locally" error so
agents searching for the string land on the fix, and shows the
remove-image + rebuild flow for Dockerfile edits.

.sandcastle/README.md: new "Build the sandbox image (one-time)"
section parallel to the env section, cross-linking to the runbook.

scripts/work/decompose.mjs + scripts/work/dispatch.mjs: when the
sandcastle error message matches the "Image '.+' not found locally"
pattern, the dispatcher now prints the build-image command inline
above the generic "See runbook" line. The error stack itself remains
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:51:30 +02:00
32d20872e3 feat(work): pnpm work prd-ship + auto-flip integration in sandcastle
Closes the PRD-lifecycle gap surfaced by the user: when sandcastle
finishes an epic's last task, the seed PRD should auto-flip from
approved -> shipped. Builds the mechanism, wires it into the work
CLI + state index + reviewer prompt + docs.

scripts/work/prd-ship.mjs (new):
  - parseFrontmatter / serializeFrontmatter — minimal YAML-ish parser
    sufficient for PRD frontmatter (scalar + list shapes)
  - flipPrdStatus — pure function: takes PRD text, returns new text
    with status=shipped + shipped=<date> + optional shipping-commits.
    Refuses to flip draft, idempotent fail-soft on already-shipped,
    rejects unexpected statuses
  - deriveShippingCommits — best-effort git log of the linked epic
    folder for the --auto-commits flag
  - findPrdPath — id -> path lookup under docs/work/prds/
  - runCli — wiring for `pnpm work prd-ship <id> [--commits|--auto-commits]`

scripts/work/prd-ship.test.mjs (new, 17 tests):
  - Frontmatter parser handles scalars + lists + missing frontmatter
  - flipPrdStatus covers all transitions + refusals + body/key preservation
  - findPrdPath + serializeFrontmatter coverage

scripts/work/state-builder.mjs:
  - Epic entries gain a `prd` field
  - New computeNeedsPrdShip surfaces epics done with PRD status not yet
    shipped: state.needs_prd_ship[] with action commands

scripts/work/cli.mjs:
  - New subcommand `pnpm work prd-ship <id>`

.sandcastle/reviewer.prompt.md:
  - "Epic close-out: PRD status flip" section instructing reviewer to
    check _state.json.needs_prd_ship and run the suggested action
  - JSON output extends with prd_shipped: "<id>" | null

docs/work/README.md:
  - "PRD lifecycle" section documenting the 4 statuses + auto-flip

Future PRDs follow the lifecycle automatically: decomposer refuses
draft, human flips to approved, sandcastle ships the epic, reviewer
runs prd-ship on the final task, PRD lands as shipped with its
commit trail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:51:48 +02:00
fc27eef6eb docs(coverage): sync docs to shipped state + wire sandcastle prompts
Closes the staleness gap after the 10-commit coverage epic shipped.

Doc sync (item 1 from the user's choice):
  - CLAUDE.md Quick Start: adds pnpm coverage:aggregate / coverage:diff
    / mutate to the command listing
  - CLAUDE.md: new "Sibling architecture: coverage (ADR-020)" section
    after the conformance gate table — captures the 4-layer table +
    points at docs/guides/coverage.md + ADR-020 + says agents must run
    coverage:diff before reporting complete
  - AGENTS.md preamble: now lists coverage as a parallel multi-latency
    quality system alongside conformance, with the same gate / latency
    framing
  - PRD frontmatter: status draft -> shipped + shipped date +
    shipping-commits list (all 10 SHAs anchoring the trace)
  - PRD findings table: each row gets a Resolution column citing the
    commit that closed it; conclusion text updated to past tense
  - ADR-020 implementation phasing: rewritten as a status table with
    each step linked to the commit that shipped it + Boot-time
    assertFeatureConformance explicitly marked Deferred with rationale
  - docs/guides/coverage.md: removed "Boot wiring lands in the next
    story" line; replaced with the deferral rationale + clarified
    that two readers (vitest, coverage:diff) consume the manifest

Sandcastle prompts (item 2 from the user's choice):
  - .sandcastle/implementer.prompt.md: new "Coverage gates" section
    after the conformance-gates list, requiring `pnpm test --coverage`,
    `pnpm coverage:aggregate`, and `pnpm coverage:diff` to all pass
    before reporting `complete`. Machine-readable JSON shape of
    coverage:diff documented (status / uncovered[] / kind enum), with
    explicit instructions on how to interpret each kind. Allowlist
    expansion requires justification + test.
  - .sandcastle/reviewer.prompt.md: AC coverage relabeled to "AC
    coverage (acceptance criteria, not test coverage)" to disambiguate;
    new check #7 "Coverage gates (ADR-020)" requiring CI's
    Coverage — diff (L1) step green + per-layer thresholds met +
    no silent allowlist expansion + manifest band drift detection.

Effect: future agent runs through sandcastle now treat coverage as a
first-class blocking gate, parallel to conformance. PRs no longer
discover coverage failures only via CI; the implementer is required
to check before reporting done, and the reviewer is required to
verify.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:47:16 +02:00
e734a9e7a1 docs: subscription auth is the primary sandcastle flow, API key is fallback 2026-05-13 09:31:33 +02:00
793772a34d feat(sandcastle): Dockerfile installs Claude Code CLI for subscription auth 2026-05-13 09:27:22 +02:00
7ecb365e63 feat(sandcastle): implementer + reviewer prompts include fallow audit 2026-05-13 08:53:00 +02:00
1e7bd68b17 feat: install @ai-hero/sandcastle + minimal Dockerfile 2026-05-13 08:17:32 +02:00
5bf636e0b3 feat(sandcastle): reviewer prompt template (verifies generator usage) 2026-05-13 08:11:41 +02:00
e441d0f477 feat(sandcastle): implementer prompt template (manifest-first + generators) 2026-05-13 08:11:24 +02:00
4ec804107b feat(sandcastle): decomposer prompt template (generator-first task lists) 2026-05-13 08:11:02 +02:00
988667fc47 feat(sandcastle): ADR elicitation prompt template 2026-05-13 08:10:40 +02:00
b28d7a6f71 feat(sandcastle): PRD elicitation prompt template 2026-05-13 08:10:25 +02:00
7fc4c23036 feat(sandcastle): scaffold .sandcastle/ + README + env example 2026-05-13 08:10:04 +02:00