agentic-dev-template/docs/architecture/agent-first-workflow-and-conformance.md
Danijel Martinek bae4b66fa4 refactor(work): drop date prefixes + move _state.json into _system/
Convention shift: epic folders + PRD filenames + frontmatter id
fields are now bare slugs. The created: timestamp (Phase 2) carries
the date; folder names don't repeat it. A future <task-id>-<slug>
shape (e.g. ClickUp) lands cleanly when that integration ships.

Renames (git mv preserves history):
- docs/work/2026-05-13-binder-wrap-helper/
    -> docs/work/binder-wrap-helper/
- docs/work/2026-05-14-library-evaluation-policy/
    -> docs/work/library-evaluation-policy/
- docs/work/2026-05-14-ci-security-and-supply-chain/
    -> docs/work/ci-security-and-supply-chain/
- docs/work/prds/2026-05-13-binder-wrap-helper.prd.md
    -> docs/work/prds/binder-wrap-helper.prd.md
- docs/work/prds/2026-05-13-coverage-architecture.prd.md
    -> docs/work/prds/coverage-architecture.prd.md
- docs/work/prds/2026-05-14-library-evaluation-policy.prd.md
    -> docs/work/prds/library-evaluation-policy.prd.md
- docs/work/prds/2026-05-14-ci-security-and-supply-chain.prd.md
    -> docs/work/prds/ci-security-and-supply-chain.prd.md

Frontmatter updates inside the renamed files: epic id, epic prd,
story epic, PRD id, PRD builds-on all drop date prefixes.

System folder + state file move:
- New docs/work/_system/ holds framework-managed state.
- docs/work/_state.json -> docs/work/_system/_state.json.
- state-builder.mjs adds _system to SKIP_FOLDERS.
- cli.mjs + state-sync-guard.mjs + .husky/pre-commit point at the
  new path.

template-reset-v1 epic deleted entirely (one-off cleanup epic from
the pre-date-convention era; status was already done).

Generator-template updates (so new artifacts ship in the right
shape):
- .sandcastle/decomposer.prompt.md emits bare-slug folder names +
  ISO created: timestamp.
- .claude/skills/to-prd/SKILL.md template uses bare-slug filename +
  bare-slug id field + ISO created: timestamp.

Doc reference updates: glossary, runbook, agent-first-workflow-
and-conformance, reviewer prompt, ADR-020, ADR-022, ADR-023 all
point at the new paths/slugs.
2026-05-14 21:16:51 +02:00


---
title: Agent-first development workflow + feature conformance
status: design
created: 2026-05-12
authors: [danijel, claude]
related:
  - docs/architecture/feature-conformance-explainer.html
  - docs/architecture/vertical-feature-spec.md
  - docs/guides/tdd-workflow.md
  - docs/guides/scaffolding-a-feature.md
  - CLAUDE.md
---

Agent-first development workflow + feature conformance

Why this document exists

template-vertical is being shaped around the assumption that AI coding agents will author most feature work. Humans set direction, write PRDs (with agent help), review diffs, and intervene when escalation is needed; agents do the bulk of the coding. This document defines:

  1. The feature-conformance enforcement system that gives agents tight, layered, machine-readable feedback on architectural drift.
  2. The agent-first workflow — manifest → contracts → tests → code — that the conformance system enforces.
  3. The local task system at docs/work/ that holds PRDs, epics, stories, and tasks as markdown, parseable by both humans and agents.
  4. The sandcastle orchestrator that dispatches implementer and reviewer agents per task, respecting a dependency DAG.

These four pillars are co-designed. The conformance system is the enforcement substrate; the workflow is the shape of work; the task system is the address space; sandcastle is the dispatch loop.

The conformance design is illustrated separately at docs/architecture/feature-conformance-explainer.html. This document complements that with the surrounding workflow + tooling.

Mental model

| Pillar | What it is | Primary artifact |
| --- | --- | --- |
| Conformance engine | manifest + TS brands + ESLint + boot-time assertion + CI gate | feature.manifest.ts per feature; _state.json snapshot of compliance |
| Agent workflow | PRD → Epic → Story → Task; manifest-first ordering; TDD-per-slice | docs/work/**/*.md |
| Local task system | filesystem markdown, single state file, dispatchable | docs/work/ |
| Sandcastle orchestrator | implementer + reviewer agents per task, DAG-respecting, retry-capped | .sandcastle/ config + scripts/work-*.ts |

The same artifacts are read by humans, AI implementers, AI reviewers, the orchestrator, and the pre-commit hooks. There is no separate "process layer."

Hierarchy

PRD                                    (initiative — one .prd.md)
└── Epic                               (large body of work — one folder + _epic.md)
    └── Story                          (one use case OR one technical capability)
        └── Task                       (one vertical slice = one PR = one commit)
            └── Subtask                (rare; only for unexpectedly complex slices)

Feature is metadata, not a hierarchy level. The use-case identifier (auth.signUp) already encodes the feature. ClickUp/Linear/etc. tags can carry it; in this system it sits in frontmatter.

Story type is metadata: user-story or technical-story. Both occupy the same hierarchy slot but use a different body template:

  • User story: As a <role>, I want <action>, so that <outcome>
  • Technical story: Goal / Why / Done when

File system

docs/work/
├── README.md                                # how this folder is used
├── _system/
│   └── _state.json                          # derived, committed, orchestrator-managed
├── _templates/
│   ├── prd.template.md
│   ├── epic.template.md
│   ├── user-story.template.md
│   ├── technical-story.template.md
│   └── task.template.md
├── prds/
│   ├── 2026-05-12-conformance-system.prd.md
│   └── ...
├── conformance-system-v1/                   # one folder per epic
│   ├── _epic.md
│   ├── 01-define-feature-helper/            # one folder per story
│   │   ├── _story.md
│   │   ├── 01-define-feature-helper-exists.task.md
│   │   ├── 02-instrumented-brand-attached.task.md
│   │   └── ...
│   └── ...
└── work-system-v1/
    └── ...

.sandcastle/
├── Dockerfile                               # extends existing CI image
├── implementer.prompt.md
├── reviewer.prompt.md
├── decomposer.prompt.md
├── prd-eliciter.prompt.md
└── .env.example

scripts/
├── work-prd-new.ts                          # invokes PRD elicitation skill
├── work-decompose.ts                        # PRD → epic + stories
├── work-decompose-tasks.ts                  # story → tasks
├── work-dispatch.ts                         # orchestrator loop
├── work-status.ts                           # human-readable status tree
└── work-rebuild-state.ts                    # regen _state.json from markdown

Naming conventions

  • Underscored system files (_epic.md, _story.md, _state.json, _templates/) — orchestrator-managed indexes or templates
  • Numeric prefix on filenames (01-, 02-) — execution order; doubles as sort key
  • <slug>.task.md — individual tasks
  • <date>-<slug>.prd.md — PRDs date-prefixed for chronological sort

File formats

PRD — docs/work/prds/<date>-<slug>.prd.md

---
id: 2026-05-12-conformance-system
title: Feature Conformance System
type: prd
status: draft | in-review | approved | superseded
author: danijel
elicitation-session: <agent-session-id>
created: 2026-05-12
---

## Problem

What's broken or missing today? Who hurts because of it?

## Goal

What state are we trying to reach?

## In scope

- ...

## Out of scope

- ...

## Constraints

- ...

## Success criteria

- ...

## Requirements

- R1: ...
- R2: ...

## Open questions

- Q1: ...

Authoring flow:

  1. Human runs pnpm work prd-new "<one-line idea>"
  2. Agent invokes the PRD elicitation skill — runs a question-driven interview with the human (similar in shape to superpowers:brainstorming) until it has enough context across Problem / Goal / Scope / Constraints / Success / Requirements
  3. Agent drafts PRD with status: draft
  4. Human reviews, edits, flips to status: approved
  5. Decomposer refuses to run on draft PRDs.
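
The guard in step 5 can be sketched as follows. This is a minimal sketch, assuming the frontmatter is flat `key: value` lines between `---` markers; `parseFrontmatter` and `assertDecomposable` are hypothetical names, not the repo's actual API.

```typescript
// Hypothetical helpers; the real decomposer may parse frontmatter differently.
type Frontmatter = Record<string, string>;

// Parse flat `key: value` pairs between the opening and closing `---` lines.
function parseFrontmatter(markdown: string): Frontmatter {
  const lines = markdown.split("\n");
  if (lines[0]?.trim() !== "---") throw new Error("missing frontmatter");
  const end = lines.indexOf("---", 1);
  if (end === -1) throw new Error("unterminated frontmatter");
  const out: Frontmatter = {};
  for (const line of lines.slice(1, end)) {
    const idx = line.indexOf(":");
    if (idx > 0) out[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return out;
}

// Refuse anything that is not an approved PRD.
function assertDecomposable(markdown: string): Frontmatter {
  const fm = parseFrontmatter(markdown);
  if (fm["type"] !== "prd") throw new Error(`expected type prd, got ${fm["type"]}`);
  if (fm["status"] !== "approved") {
    throw new Error(`PRD ${fm["id"]} has status "${fm["status"]}"; approve it before decomposing`);
  }
  return fm;
}

const draft =
  "---\nid: 2026-05-12-conformance-system\ntype: prd\nstatus: draft\n---\n\n## Problem\n";
const approved = draft.replace("status: draft", "status: approved");
```

Calling `assertDecomposable(draft)` throws, while `assertDecomposable(approved)` returns the parsed frontmatter, so the human's status flip is the only thing that unlocks decomposition.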

Epic — <epic-slug>/_epic.md

---
id: conformance-system-v1
prd: 2026-05-12-conformance-system
title: Conformance system v1
type: epic
status: todo | in-progress | done | cancelled
features: [cross-cutting]
created: 2026-05-12
target: 2026-Q3
---

## Goal

Build the feature-conformance enforcement system so AI agents get
layered, sub-second feedback on drift between manifest and code.

## Why

(brief — link to PRD for detail)

## In scope

- ...

## Out of scope

- ...

## Stories

- [ ] [01 — defineFeature helper + Instrumented brand](01-define-feature-helper/_story.md)
- [ ] [02 — Boot assertions](02-boot-assertions/_story.md)
- ...

Story (technical) — <epic>/<story>/_story.md

---
id: 01-define-feature-helper
epic: conformance-system-v1
title: defineFeature helper + Instrumented brand
type: technical-story
status: todo | in-progress | done
feature: core-shared
depends-on: []
blocks: [02-boot-assertions, 05-generator-updates]
---

## Goal

Manifest helper + brand types enable type-level enforcement that every
use-case binding is wrapped with `withSpan` + `withCapture`
(and `withAudit` when mutating).

## Why

Compile-time feedback is the cheapest layer and the foundation every other
milestone reads.

## Done when

Compile-time TS2322 fires at the IDE when an unwrapped factory is bound
through `ProductionUseCase<...>`.

## In scope

- `defineFeature` helper signature + tests
- Brand types: `Instrumented<F>`, `Captured<F>`, `Audited<F>`
- Wiring brands into existing wrappers (no API changes)
- `auth` as the reference feature using the new pattern

## Out of scope

- Migration of other features (each is its own story)
- Boot-time `assertConformance` (story 02)
- ESLint rules consuming the brands (story 03)

## Tasks

- [ ] [01 — defineFeature helper exists](01-define-feature-helper-exists.task.md)
- [ ] [02 — Instrumented brand attached via withSpan](02-instrumented-brand-attached.task.md)
- ...

Story (user) — same skeleton, body uses As a / I want / So that

---
id: 01-sign-up
epic: auth-v1
title: Sign up with email and password
type: user-story
status: todo
feature: auth
depends-on: []
---

## As a / I want / So that

**As a** visitor
**I want** to create an account with email and password
**So that** I can access member-only content

## In scope

- Email/password sign-up flow
- Password hashing
- Audit + event emission on success
- tRPC procedure exposure

## Out of scope

- OAuth sign-up (separate story)
- Email verification (separate story)

## Tasks

- [ ] [01 — reject invalid email format + scaffold](01-reject-invalid-email-format.task.md)
- [ ] [02 — reject duplicate email](02-reject-duplicate-email.task.md)
- ...

Task — <epic>/<story>/<slug>.task.md

---
id: 02-instrumented-brand-attached
story: 01-define-feature-helper
epic: conformance-system-v1
title: Attach Instrumented<F> brand via withSpan
type: task
status: todo | ready | in-progress | done | escalated
depends-on: [01-define-feature-helper-exists]
blocks: [06-signin-rebound-via-branded-slot]
sandbox: default
max-attempts: 3 # default; override per task
attempts: { implementer: 0, reviewer: 0 }
---

## Goal

Attach the `Instrumented<F>` brand to functions returned by `withSpan`.

## Why this matters

The brand is the type-level seam the binding signature checks. Without it,
the compiler can't tell a wrapped factory from an unwrapped one.

## Acceptance criteria

- [ ] `Instrumented<F>` = `F & { readonly __instrumented: true }`
- [ ] `withSpan` return type is `Instrumented<typeof fn>`
- [ ] Brand re-exported from `@repo/core-shared/conformance`
- [ ] Test asserts wrapped function carries brand at the type level
- [ ] All existing tests still pass

## Out of scope

- Updating `with-capture` and `with-audit` (separate tasks)
- Refactoring `withSpan`'s existing signature beyond adding the brand
- Adding runtime brand markers (type-only)
- Renaming existing types or symbols

## Files likely touched

- `packages/core-shared/src/instrumentation/with-span.ts`
- `packages/core-shared/src/instrumentation/with-span.test.ts`
- `packages/core-shared/src/conformance/index.ts`

## Reviewer notes

Reject if brand is implemented with runtime tag rather than pure type.

State file — docs/work/_system/_state.json

A derived, committed, orchestrator-written index. Markdown is source of truth; _state.json is a fast-to-query mirror.

{
  "updated_at": "2026-05-12T16:42:00Z",
  "ready": ["02-instrumented-brand-attached"],
  "in_progress": [],
  "blocked": [],
  "escalated": [],
  "epics": {
    "conformance-system-v1": {
      "status": "in-progress",
      "ac_total": 47,
      "ac_completed": 8,
      "stories": {
        "01-define-feature-helper": {
          "status": "in-progress",
          "ac_total": 9,
          "ac_completed": 4,
          "tasks": {
            "01-define-feature-helper-exists": {
              "status": "done",
              "depends_on": [],
              "blocks": ["02-instrumented-brand-attached"],
              "ac_total": 4,
              "ac_completed": 4,
              "attempts": { "implementer": 1, "reviewer": 1 },
              "branch": "task/01-define-feature-helper-exists",
              "completed_at": "2026-05-12T14:23:00Z"
            }
          }
        }
      }
    }
  }
}

Rules for _state.json

  1. Committed to git. Audit trail visible in PRs.
  2. Single writer: orchestrator + pre-commit hook only. No agent writes it directly.
  3. Derived from markdown. Regenerable any time via pnpm work rebuild-state.
  4. Canonical formatting. Sorted keys, stable indentation, no trailing whitespace. Pre-commit normalizes via Prettier.
  5. Merges serialized. Orchestrator merges PRs one at a time. Parallel implementation in sandboxes is fine; only merge step is sequential.
  6. Pre-commit regen. When any .task.md / _story.md / _epic.md is staged, the hook regenerates _state.json from the markdown, re-stages it, and lets the commit proceed. The hook only blocks the commit if regeneration itself fails (e.g. malformed frontmatter, broken depends-on reference). This makes the markdown the unambiguous source of truth: if humans edit checkboxes directly, the JSON quietly catches up.
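
Rules 3 and 4 can be illustrated with a small sketch: deriving the `ready` list from task statuses and dependencies, then serializing with sorted keys so the regenerated file is byte-stable. The function names are hypothetical, and plain `JSON.stringify` stands in for the Prettier normalization the hook actually uses.

```typescript
// Hypothetical shapes; field names follow the _state.json example above.
type TaskStatus = "todo" | "ready" | "in-progress" | "done" | "escalated";
interface TaskNode { status: TaskStatus; depends_on: string[] }

// A task is ready when it has not started and every dependency is done.
// An unknown dependency id simply keeps the task out of the ready list here.
function computeReady(tasks: Record<string, TaskNode>): string[] {
  return Object.keys(tasks)
    .filter((id) => {
      const task = tasks[id]!;
      const open = task.status === "todo" || task.status === "ready";
      return open && task.depends_on.every((dep) => tasks[dep]?.status === "done");
    })
    .sort();
}

// Recursively sort object keys so the same state always serializes identically.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const src = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(src).sort()) out[key] = canonicalize(src[key]);
    return out;
  }
  return value;
}

function serializeState(state: unknown): string {
  return JSON.stringify(canonicalize(state), null, 2) + "\n";
}

const tasks: Record<string, TaskNode> = {
  "02-instrumented-brand-attached": { status: "todo", depends_on: ["01-define-feature-helper-exists"] },
  "01-define-feature-helper-exists": { status: "done", depends_on: [] },
  "06-signin-rebound-via-branded-slot": { status: "todo", depends_on: ["02-instrumented-brand-attached"] },
};
const ready = computeReady(tasks);
```

With the sample tasks, only `02-instrumented-brand-attached` is ready, which matches the `ready` array in the example state file.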

Scope guards

| Level | In scope | Out of scope |
| --- | --- | --- |
| PRD | required | required |
| Story | required | required |
| Task | implicit (= AC list) | optional but encouraged |

The reviewer agent explicitly checks the task's Out of scope section against the diff and rejects the change if the diff touches anything declared out of scope. This is the cheapest possible enforcement of "don't over-engineer": a pure text match, with no AST needed.
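
One way such a text match could work is sketched below. It assumes out-of-scope bullets name files or modules in backticks and that a changed file whose path contains one of those terms is a violation; both the heuristic and the function names are assumptions, not the reviewer's actual logic.

```typescript
// Hypothetical reviewer helper: naive text match only, no AST involved.
function outOfScopeTerms(taskMarkdown: string): string[] {
  const lines = taskMarkdown.split("\n");
  const start = lines.findIndex((line) => line.trim() === "## Out of scope");
  if (start === -1) return [];
  const terms: string[] = [];
  for (const line of lines.slice(start + 1)) {
    if (line.startsWith("## ")) break; // the next heading ends the section
    const re = /`([^`]+)`/g;
    let match: RegExpExecArray | null;
    while ((match = re.exec(line)) !== null) terms.push(match[1]!);
  }
  return terms;
}

// Return the changed files whose path mentions an out-of-scope term.
function scopeViolations(taskMarkdown: string, changedFiles: string[]): string[] {
  const terms = outOfScopeTerms(taskMarkdown);
  return changedFiles.filter((file) => terms.some((term) => file.includes(term)));
}

const task = [
  "## Out of scope",
  "- Updating `with-capture` and `with-audit` (separate tasks)",
  "## Files likely touched",
  "- `packages/core-shared/src/instrumentation/with-span.ts`",
].join("\n");

const violations = scopeViolations(task, [
  "packages/core-shared/src/instrumentation/with-span.ts",
  "packages/core-shared/src/instrumentation/with-audit.ts",
]);
```

Here only the `with-audit.ts` path is flagged; its value would populate `scope_violations` in the reviewer's structured output.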

Conformance system integration

The four enforcement layers (detailed in feature-conformance-explainer.html):

| Layer | Latency | Catches |
| --- | --- | --- |
| TypeScript brands | 0s | forgotten withSpan / withAudit; manifest ↔ binding-slot type mismatch |
| AST-aware ESLint | <1s | manifest ↔ code drift; undeclared bus.publish / auditLogger.log; required cores not installed |
| Boot assertion (assertConformance) | ~3s | binding type-casts that hid unwrapped factories; manifests edited without rebinding |
| CI drift gate (pnpm conformance) | ~120s | orphan event consumers; scaffold drift from generator; required-cores ↔ workspace mismatch |

How conformance interacts with tasks

When a task adds an audit emission (e.g. audits: ["user.created"]):

  1. Agent edits feature.manifest.ts
  2. The binding's branded slot type now demands Audited<F> — TS2322 if the wrapper is missing
  3. Agent adds withAudit(...) in bind-production.ts → TS goes quiet
  4. Agent adds auditLogger.log(...) in the use-case factory → ESLint goes quiet
  5. Pre-commit pnpm conformance confirms all four layers pass
  6. PR submitted

The type-check and lint steps (2 to 4) give sub-second feedback, so the agent's iteration loop is dominated by think + write, not by waiting for feedback. Only the final pnpm conformance run in step 5 takes longer.
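
Steps 2 and 3 rest on type-only brands. The sketch below shows the idea with simplified stand-ins: the real `withSpan` / `withAudit` signatures are not shown in this document, and `MutatingSlot` is a hypothetical name for a binding slot that demands both brands.

```typescript
// Simplified stand-ins; the real wrappers also add spans and audit behaviour.
type Instrumented<F> = F & { readonly __instrumented: true };
type Audited<F> = F & { readonly __audited: true };

// Type-only brands: the cast adds no runtime marker to the function.
function withSpan<F extends (...args: any[]) => any>(_name: string, fn: F): Instrumented<F> {
  return fn as Instrumented<F>;
}
function withAudit<F extends (...args: any[]) => any>(_event: string, fn: F): Audited<F> {
  return fn as Audited<F>;
}

// Hypothetical binding slot for a mutating use case: both brands are required.
type MutatingSlot<F> = Instrumented<F> & Audited<F>;

type SignUp = (input: { email: string }) => { userId: string };
const signUp: SignUp = (input) => ({ userId: `user:${input.email}` });

const spanned = withSpan("auth.signUp", signUp);

// Rejected at compile time, because the Audited brand is missing:
// const unbound: MutatingSlot<SignUp> = spanned;

// Adding the wrapper satisfies the slot, and the compiler goes quiet.
const bound: MutatingSlot<SignUp> = withAudit("user.created", spanned);
const result = bound({ email: "a@example.com" });
```

Because the brand is a phantom property, the wrapped function behaves exactly like the original at runtime; only the type checker can tell the two apart.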

Workflow ordering (per task)

For any new use case or new behavior:

  1. Manifest: declare the use case (or update audits/publishes/consumes if the task adds them). Pure declaration.
  2. Contracts: xInputSchema, xOutputSchema, and the IXUseCase type alias in the use-case file. The factory body throws "not implemented" if not yet written.
  3. Tests (red): import contracts; write failing assertions that match the AC bullet.
  4. Implementation (green): fill factory body, repository, binding, until tests pass.

For incremental work on an existing use case, step 1 is often a no-op (manifest already declared). For the first slice of a new use case, all four steps happen in one commit.
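
Steps 2 to 4 can be sketched for a single AC bullet ("reject invalid email format"). All names here are hypothetical, a plain parse function stands in for whatever schema library the repo uses, and the use case is synchronous for brevity.

```typescript
interface SignUpInput { email: string; password: string }
interface SignUpOutput { userId: string }

// Step 2 (contracts): input schema, output shape, use-case type alias.
const signUpInputSchema = {
  parse(raw: unknown): SignUpInput {
    const r = (raw ?? {}) as Partial<SignUpInput>;
    if (typeof r.email !== "string" || !/^[^@\s]+@[^@\s]+$/.test(r.email)) {
      throw new Error("invalid email format");
    }
    if (typeof r.password !== "string") throw new Error("invalid password");
    return { email: r.email, password: r.password };
  },
};
type ISignUpUseCase = (input: unknown) => SignUpOutput;

// Step 2 (stub): the factory exists so tests can import it, but does nothing yet.
const createSignUpStub = (): ISignUpUseCase => () => {
  throw new Error("not implemented");
};

// Step 3 (red): an assertion written against the contract, not the implementation.
function rejectsInvalidEmail(useCase: ISignUpUseCase): boolean {
  try {
    useCase({ email: "not-an-email", password: "secret" });
    return false;
  } catch (err) {
    return err instanceof Error && err.message === "invalid email format";
  }
}

// Step 4 (green): fill the factory body until the assertion holds.
const createSignUp = (): ISignUpUseCase => (input) => {
  const { email } = signUpInputSchema.parse(input);
  return { userId: `user:${email}` };
};

const red = rejectsInvalidEmail(createSignUpStub()); // false: test still failing
const green = rejectsInvalidEmail(createSignUp()); // true: test passes
```

The test never changes between red and green; only the factory body does, which is what lets the reviewer map each AC bullet to one assertion.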

Work shapes

The Epic / Story / Task hierarchy holds for everything; the inner workflow shape varies with the kind of work. Three shapes are recognised:

| Shape | Default home | Manifest involvement | Test gates |
| --- | --- | --- | --- |
| Backend | feature packages | full (use cases, audits, publishes, consumes, jobs, realtime) | type-check + lint + conformance + unit/integration |
| Frontend | @repo/core-ui and features/<feature>/src/ui/ | partial: pages consume use cases via controllers | type-check + lint + component tests + Playwright screenshot (CI) |
| Infrastructure | core packages, apps/*/server/, docker-compose.yml, .github/, ADRs | declarative: requiredCores, bind context | type-check + lint + conformance + ADR review |

The default shape is backend — what the rest of this doc describes. The two adapted shapes are summarised below; operational detail lives in their guides.

Frontend (see docs/guides/frontend-work-shape.md)

  • Atomic design tiers — atoms / molecules / organisms / templates / pages, generated via pnpm turbo gen core-ui-component. Tier-direction enforced by a new ESLint rule (atomic-tier-import-direction).
  • Storybook is the spec. Each AC bullet on a UI task maps to a story variant or a Storybook play function. Story files become the shared visual contract between human, implementer, and reviewer.
  • Two test layers:
    • Component tests (Vitest + Testing Library, or play on stories): pre-commit gate, <5s
    • Visual regression via Playwright screenshot tests: CI gate, 30–120s, blocks PR merge on unapproved visual diffs
  • Adapted four-step ordering for a pure UI slice:
    1. Story file (the visual spec — analogous to the manifest entry for backend work)
    2. Contracts — props interface, variant types
    3. Tests (red) — component test + a default story
    4. Implementation (green) — make the component render and pass tests
  • Reviewer agent uses the existing Storybook MCP at http://localhost:6006/mcp to read existing components, list variants, and verify story coverage against the task's AC. Visual diff verdicts come from the Playwright screenshot CI step.
  • Story split for page-level features:
    Epic: auth-v1
    ├── Story (user):      auth.signUp use case           ← backend slices
    ├── Story (technical): SignUpForm component(s)        ← depends-on: signUp use case
    └── Story (technical): /sign-up page composition + E2E ← depends-on: SignUpForm
    
    Each depends-on edge enforces sequential dispatch by the orchestrator.
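
The tier-direction check behind `atomic-tier-import-direction` is not specified here. A minimal sketch of the core comparison, assuming each component lives in a folder named after its tier and that a component may never import from a higher tier:

```typescript
// Assumed tier order and folder layout; the real ESLint rule may differ.
const TIERS = ["atoms", "molecules", "organisms", "templates", "pages"] as const;
type Tier = (typeof TIERS)[number];

// Find the tier from the first path segment that names one.
function tierOf(path: string): Tier | undefined {
  const segments = path.split("/");
  return TIERS.find((tier) => segments.indexOf(tier) !== -1);
}

// True when the importing file reaches upward, e.g. an atom importing a molecule.
function violatesTierDirection(importerPath: string, importedPath: string): boolean {
  const from = tierOf(importerPath);
  const to = tierOf(importedPath);
  if (from === undefined || to === undefined) return false; // outside the tier tree
  return TIERS.indexOf(to) > TIERS.indexOf(from);
}

const upward = violatesTierDirection("src/atoms/Button.tsx", "src/molecules/FormField.tsx");
const downward = violatesTierDirection("src/molecules/FormField.tsx", "src/atoms/Button.tsx");
```

An ESLint rule would run this comparison on each import declaration and report when it returns `true`.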

Infrastructure (see docs/guides/infrastructure-work-shape.md)

  • ADRs precede infrastructure work the same way PRDs precede feature work — decision first, code second. ADRs live at docs/adr/NNN-<slug>.md.
  • Two categories:
    • New optional core package: pnpm turbo gen core-package <name>. Generator + conformance already accommodate it (via requiredCores in manifests). No conformance extensions required.
    • New infrastructure layer (Redis, CDN, alternative CMS, additional message bus, …) — ADR + integration PRD + multiple stories.
  • ADR authoring flow:
    1. Human runs pnpm work adr-new "<one-line proposal>"
    2. Dedicated ADR elicitation skill interviews the human on Context / Drivers / Considered options / Trade-offs / Decision / Consequences (similar shape to the PRD elicitation skill but distinct template + heuristics)
    3. Agent drafts ADR at docs/adr/NNN-<slug>.md with status: proposed
    4. Human reviews, flips to status: accepted (or rejected / superseded)
    5. Accepted ADR(s) trigger integration PRD(s); the PRD flow proceeds normally
  • Conformance extensions for infra:
    • core-package-shape-conforms-to-generator — extends milestone iv's scaffold-drift check to core packages
    • required-cores-in-workspace — manifest declarations must match pnpm-workspace.yaml (already in milestone iv)

Two elicitation skills now sit at the funnel mouth — one for PRDs, one for ADRs. Same interview-style intake; different templates and decision frameworks.

Pre-commit gates

When a commit lands in the sandbox or locally:

  1. Type-check: brand satisfaction, manifest typing
  2. Lint: manifest ↔ code rules, in-file shape rules, pattern restrictions
  3. Conformance script: pnpm conformance (boot-style assertion at static scope)
  4. Tests for changed feature: pnpm test --filter @repo/<feature> passes
  5. _state.json ↔ markdown sync: pre-commit regen verifies consistency

Tests for all features are NOT required to pass at pre-commit (that's CI's job). The gate enforces local soundness without blocking work in unaffected areas.
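
Gate 4 needs to know which packages a commit touched. A sketch of that mapping, assuming workspace packages live under `features/<name>/` or `packages/<name>/` and are published as `@repo/<name>` (the helper name and the layout are assumptions):

```typescript
// Hypothetical helper: turn staged paths into per-package test commands.
function testCommandsFor(stagedFiles: string[]): string[] {
  const names = new Set<string>();
  for (const file of stagedFiles) {
    const match = /^(?:features|packages)\/([^/]+)\//.exec(file);
    if (match) names.add(match[1]!);
  }
  // One command per touched package; unaffected packages are never tested here.
  return Array.from(names)
    .sort()
    .map((name) => `pnpm test --filter @repo/${name}`);
}

const commands = testCommandsFor([
  "packages/core-shared/src/instrumentation/with-span.ts",
  "packages/core-shared/src/conformance/index.ts",
  "docs/work/README.md",
]);
```

For the sample above, `commands` contains a single entry for `@repo/core-shared`; the docs-only path contributes nothing.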

Agent roles

PRD eliciter agent

  • Skill: dedicated PRD elicitation (interview-style, similar shape to superpowers:brainstorming)
  • Inputs: short brief from human (pnpm work prd-new "<idea>")
  • Behavior: asks questions one at a time, builds shared understanding across Problem / Goal / Scope / Constraints / Success / Requirements
  • Output: <date>-<slug>.prd.md with status: draft
  • Hand-off: human reviews, flips to status: approved

ADR eliciter agent

  • Skill: dedicated ADR elicitation (interview-style; distinct from PRD elicitation)
  • Inputs: short brief from human (pnpm work adr-new "<proposal>")
  • Behavior: drives the conversation across Context / Drivers / Considered options / Trade-offs / Decision / Consequences. Pushes the human to articulate alternatives explicitly before settling on a decision.
  • Output: docs/adr/NNN-<slug>.md with status: proposed
  • Hand-off: human reviews, flips to status: accepted (or rejected / superseded)
  • Accepted ADRs are the trigger for downstream integration PRDs

Decomposer agent

  • Skill: structured PRD-to-epic-and-stories decomposition
  • Inputs: a PRD file with status: approved
  • Behavior: produces _epic.md and one _story.md per requirement, with story-level AC bullets that hint at task decomposition
  • Default scope: stories only. Task-level decomposition is a second pass: pnpm work decompose-tasks <story>
  • Does NOT write _state.json directly (orchestrator does that on next dispatch tick)

Implementer agent

  • Sandcastle dispatch with implementer.prompt.md
  • Inputs: a single task markdown file (full context)
  • Behavior: writes code + tests to satisfy the AC; runs pnpm test --filter and pnpm conformance locally; commits; pushes the sandbox branch
  • Read-only on task markdown. Returns structured output via sandcastle:
    {
      "status": "complete" | "blocked" | "needs-clarification",
      "ac_satisfied": [0, 1, 2, 3],
      "files_changed": ["packages/.../with-span.ts", "..."],
      "commit_sha": "abc123",
      "notes": "..."
    }
    
  • Does NOT edit .task.md, _story.md, _epic.md, or _state.json. The orchestrator translates the structured output into markdown checkbox flips and JSON state updates in a single post-merge commit.
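
A sketch of how the orchestrator might validate the implementer's output and translate `ac_satisfied` into checkbox flips. The field names follow the structured output above; the helpers are hypothetical, and the sketch assumes the acceptance criteria are the only checkboxes in a task file.

```typescript
interface ImplementerResult {
  status: "complete" | "blocked" | "needs-clarification";
  ac_satisfied: number[];
  files_changed: string[];
  commit_sha: string;
  notes: string;
}

// Validate untrusted agent output before acting on it.
function isImplementerResult(value: unknown): value is ImplementerResult {
  if (typeof value !== "object" || value === null) return false;
  const { status, ac_satisfied, files_changed, commit_sha, notes } = value as Record<string, unknown>;
  return (
    (status === "complete" || status === "blocked" || status === "needs-clarification") &&
    Array.isArray(ac_satisfied) && ac_satisfied.every((i) => Number.isInteger(i)) &&
    Array.isArray(files_changed) && files_changed.every((f) => typeof f === "string") &&
    typeof commit_sha === "string" &&
    typeof notes === "string"
  );
}

// Tick the Nth checkbox (0-based) for every index in acIndices.
function flipCheckboxes(taskMarkdown: string, acIndices: number[]): string {
  let index = -1;
  return taskMarkdown
    .split("\n")
    .map((line) => {
      if (!line.startsWith("- [ ] ") && !line.startsWith("- [x] ")) return line;
      index += 1;
      return acIndices.indexOf(index) !== -1 ? line.replace("- [ ] ", "- [x] ") : line;
    })
    .join("\n");
}

const raw: unknown = JSON.parse(
  '{"status":"complete","ac_satisfied":[0,2],"files_changed":["with-span.ts"],"commit_sha":"abc123","notes":""}',
);
const updated = isImplementerResult(raw)
  ? flipCheckboxes("- [ ] brand defined\n- [ ] return type updated\n- [ ] brand re-exported", raw.ac_satisfied)
  : null;
```

Keeping the flip logic in the orchestrator rather than the agent is what preserves the single-writer rule for the markdown and `_state.json`.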

Reviewer agent

  • Sandcastle dispatch with reviewer.prompt.md
  • Inputs: task markdown + diff from implementer
  • Behavior: verifies each AC bullet against the diff; checks Out of scope is respected; verifies tests cover AC bullets; runs pnpm conformance and pnpm test --filter
  • Returns structured output:
    {
      "decision": "approve" | "reject",
      "ac_verified": [0, 1, 2, 3],
      "scope_violations": [],
      "notes": "..."
    }
    
  • Does NOT edit anything in the repo.
  • For frontend tasks, the reviewer additionally:
    • Queries the Storybook MCP (http://localhost:6006/mcp) to verify story coverage and inspect rendered output
    • Treats the Playwright screenshot CI step's verdict as a required input — unapproved visual diffs trigger reject

Orchestrator

  • Plain TypeScript script (scripts/work-dispatch.ts)
  • Reads _state.json to find ready tasks (all deps done)
  • For each ready task:
    1. Marks task status: in-progress, regenerates _state.json, commits the marker
    2. Dispatches sandcastle implementer
    3. On implementer return: dispatches sandcastle reviewer with task + diff
    4. On reviewer approve: serial-merges to main, in one commit flips task checkbox, increments parent story checkbox if all tasks done, increments parent epic checkbox if all stories done, regenerates _state.json
    5. On reviewer reject: appends reviewer notes to the task's "Reviewer notes" section, increments attempts.implementer, re-dispatches (subject to max-attempts)
    6. On attempts.implementer >= max-attempts: marks status: escalated, posts a summary, stops dispatching
  • Continues until no ready tasks remain
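
Steps 4 to 6 amount to a small state transition on each reviewer verdict. A minimal sketch, with hypothetical type and function names and no git or dispatch side effects:

```typescript
interface Attempts { implementer: number; reviewer: number }
interface TaskState {
  status: "in-progress" | "done" | "escalated";
  maxAttempts: number;
  attempts: Attempts;
}
interface ReviewerResult { decision: "approve" | "reject"; notes: string }
type NextAction = "merge" | "redispatch" | "escalate";

// Pure transition: returns the new task state and what the orchestrator does next.
function applyReview(task: TaskState, review: ReviewerResult): { task: TaskState; action: NextAction } {
  if (review.decision === "approve") {
    return { task: { ...task, status: "done" }, action: "merge" };
  }
  // A rejection counts against the implementer's attempt budget.
  const attempts: Attempts = { ...task.attempts, implementer: task.attempts.implementer + 1 };
  if (attempts.implementer >= task.maxAttempts) {
    return { task: { ...task, attempts, status: "escalated" }, action: "escalate" };
  }
  return { task: { ...task, attempts }, action: "redispatch" };
}

// Three consecutive rejections with the default max-attempts of 3.
let current: TaskState = {
  status: "in-progress",
  maxAttempts: 3,
  attempts: { implementer: 0, reviewer: 0 },
};
const actions: NextAction[] = [];
for (let i = 0; i < 3; i++) {
  const step = applyReview(current, { decision: "reject", notes: "scope violation" });
  current = step.task;
  actions.push(step.action);
}
```

Under these assumptions the third rejection escalates instead of re-dispatching, so a task gets at most three implementer runs before a human is pulled in.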

Sandcastle config

.sandcastle/Dockerfile extends the existing CI image, which is to be identified during the work-system-v1 epic. It must include:

  • Node + pnpm at repo's pinned versions
  • pnpm install --frozen-lockfile baked in
  • Access to pnpm conformance, pnpm test, pnpm lint, pnpm typecheck
  • Git config for agent commits

Prompt templates use sandcastle's {{VAR}} substitution + !`cmd` injection:

  • implementer.prompt.md uses {{TASK_FILE_CONTENT}} and !`git log -1 --oneline` for context
  • reviewer.prompt.md uses {{TASK_FILE_CONTENT}} + {{DIFF}}
  • decomposer.prompt.md uses {{PRD_FILE_CONTENT}}
  • prd-eliciter.prompt.md uses {{INITIAL_BRIEF}} and runs the interview loop
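
To make the `{{VAR}}` step concrete, here is an illustrative renderer. Sandcastle performs this substitution itself; this sketch is not its implementation, it ignores the command-injection syntax, and `renderPrompt` is a hypothetical name.

```typescript
// Replace every {{NAME}} placeholder with the matching value; fail on unknown names.
function renderPrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (_match, name: string) => {
    const value = vars[name];
    if (value === undefined) throw new Error(`missing prompt variable ${name}`);
    return value;
  });
}

const reviewerPrompt = renderPrompt("Task:\n{{TASK_FILE_CONTENT}}\n\nDiff:\n{{DIFF}}", {
  TASK_FILE_CONTENT: "## Goal\nAttach the brand.",
  DIFF: "+ export type Instrumented<F> = ...",
});
```

Failing loudly on a missing variable keeps a half-rendered prompt from ever reaching an agent.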

Branch strategy: per-task feature branch (task/<task-id>), merged sequentially to main by the orchestrator.

Bootstrap order — conformance first

Tier 1 — Conformance system (human-driven)

Build the conformance system manually. docs/work/conformance-system-v1/ markdown files capture the work (practising the task format on real work) but no _state.json, no sandcastle dispatch, no orchestrator.

Stories in order (each itself a vertical slice — system code + generator update + doc update + applied to one feature):

  1. defineFeature helper + Instrumented brand — applied to auth.signIn
  2. Captured + Audited brands — wrappers updated
  3. assertConformance + boot wiring — rolled to all three apps
  4. AST-aware ESLint rules (5–6 new type-aware rules)
  5. CI drift gate (pnpm conformance)
  6. Generator emits manifest + contracts + test stubs
  7. Documentation rewrite — agent-workflow.md, CLAUDE.md, AGENTS.md, tdd-workflow.md
  8. Migrate auth feature to the new pattern (reference)

Tier 2 — Work system (human-driven, bootstraps automation)

Once conformance is in place, build the dispatch substrate:

  1. docs/work/ skeleton, README, templates
  2. _state.json schema + pnpm work rebuild-state
  3. Orchestrator script + DAG + retry-cap logic
  4. .sandcastle/ config (extends existing CI image)
  5. PRD elicitation skill
  6. ADR elicitation skill (separate skill, similar interview shape)
  7. Decomposer agent + prompts
  8. Implementer + reviewer prompts (with Storybook MCP wiring for frontend reviewer)
  9. pnpm work CLI surface (including work adr-new)
  10. Pre-commit hooks (state regen, conformance gate)
  11. Playwright screenshot test infrastructure (CI gate for frontend tasks)

Tier 3 — Migration + future work (dispatch-driven)

With both systems in place, remaining feature migrations and all future work flows through sandcastle:

  • Migrate blog, media, navigation, marketing-pages (one story each)
  • All new features authored via PRD → decompose → dispatch loop

Deferred decisions

These are explicitly deferred until the tier that needs them:

  1. Existing task-create skill (ClickUp mirror) — coexist or retire. Decide once the work system is in place. Frontmatter is open-ended so clickup-id can be added later if needed.
  2. Existing CI Docker image identity — identify and document during the .sandcastle/Dockerfile story.
  3. Per-epic state files vs. one global — start with one global _state.json. Move to per-epic only if serialized merges become a throughput bottleneck.
  4. Custom git merge driver for _state.json — start without; serialized merges should suffice. Add if needed.

Open questions (to revisit during implementation)

  • Q1: Single manifest registry per app, or per-feature? (From the explainer §10.) Lean: per-feature, with a tiny app-side aggregator.
  • Q2: How much of the manifest is generated vs hand-written? Lean: humans edit; ESLint flags mismatches without auto-fixing.
  • Q3: Inline symbol declaration in manifest, or registry mapping? Lean: registry holds the mapping, manifest stays content-only.
  • Q4: What happens when an optional core is absent? Lean: typed surface gates the field — audits: readonly never[] when core-audit is unbound.
  • Q5: Escape hatch for legitimate exceptions? Lean: // @conformance-skip: <rule> — <reason> comment honoured by ESLint + boot assertion, with allowlist growth gated in CI.
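
The Q4 lean can be sketched with a conditional type. The manifest shape below is hypothetical and reduced to two fields; it only illustrates how `readonly never[]` gates the `audits` field when core-audit is unbound.

```typescript
// Hypothetical manifest typing for the Q4 lean; not the repo's actual types.
type AuditsField<HasCoreAudit extends boolean> =
  HasCoreAudit extends true ? readonly string[] : readonly never[];

interface FeatureManifest<HasCoreAudit extends boolean> {
  readonly feature: string;
  readonly audits: AuditsField<HasCoreAudit>;
}

// With core-audit bound, audit events can be declared.
const withCoreAudit: FeatureManifest<true> = { feature: "auth", audits: ["user.created"] };

// Without it, the only value that type-checks is an empty list.
const withoutCoreAudit: FeatureManifest<false> = { feature: "blog", audits: [] };

// Does not compile, because 'string' is not assignable to 'never':
// const rejected: FeatureManifest<false> = { feature: "blog", audits: ["post.published"] };
```

The absent core then surfaces as a compile error at the manifest, which is the cheapest of the four enforcement layers.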

Acceptance criteria (for this whole design)

This design is "done" when:

  • docs/work/ exists with templates, _state.json schema, README
  • Conformance system v1 is implemented through all four enforcement layers
  • All five feature packages have manifests
  • All three apps run assertConformance at boot
  • pnpm conformance is a CI gate
  • turbo gen feature emits manifest + contracts + test stubs
  • .sandcastle/ config exists with implementer/reviewer/decomposer/eliciter prompts
  • PRD elicitation skill exists and is invocable
  • ADR elicitation skill exists and is invocable
  • Orchestrator (pnpm work dispatch) runs end-to-end on a real task
  • Frontend ESLint rules cover atomic-tier direction and story/test sibling presence
  • Playwright screenshot tests run as a CI gate for frontend work
  • Documentation reflects the manifest-first workflow (CLAUDE.md, AGENTS.md, docs/guides/)
  • Frontend + infrastructure work-shape guides exist at docs/guides/

References