Adapts mattpocock/skills/engineering/improve-codebase-architecture to
this repo. Four files at .claude/skills/improve-codebase-architecture/:
SKILL.md (104 lines):
- Explore -> Present candidates -> Grilling loop process
- "Hard constraints (do not propose violations)" section
enumerating ADRs 006/008/010/012/013/014/015/017/020/021 that
bound the design space
- Repointed at docs/glossary.md (not CONTEXT.md) and
docs/decisions/ (not docs/adr/)
- Exploration shortcuts specific to this repo: pnpm fallow,
pnpm coverage:diff, feature.manifest.ts, pnpm turbo boundaries
- Grilling loop side-effects target the right glossary section
and the next available ADR number (currently 022)
DEEPENING.md (93 lines):
- 4 dependency categories mapped to this repo's reality:
Cat 1 (in-process) -> entities/use-cases/presenters
Cat 2 (local-substitutable) -> our existing real + mock
adapter pattern (every port has both; mocks ARE stand-ins)
Cat 3 (remote but owned) -> cross-feature events via
IEventBus (E0/E1 rules)
Cat 4 (true external) -> Payload, Sentry/OTel, socket.io
(each constrained to its vendor-isolation seam by ADR)
- Seam discipline section recognises DI symbols + manifest entries
as concrete seams alongside .interface.ts files
- Testing strategy: replace not layer (matches ADR-020 L0 + L1)
- Conformance check command list at the end (typecheck, lint,
test --coverage, conformance, fallow:audit, coverage:diff)
INTERFACE-DESIGN.md (66 lines):
- Parallel sub-agent "Design It Twice" pattern preserved
- Every sub-agent brief MUST include glossary terms + ADR
constraints + manifest awareness
- Output items extended with "Manifest + binder impact" and
"ADR conflicts (if any)"
- Comparison axes include conformance impact + coverage delta
- Cross-feature moves flag release-please version-bump
implications (per ADR-021 commit-path targeting)
LANGUAGE.md (79 lines):
- Matt's 7 abstract terms preserved (module, interface,
implementation, depth, seam, adapter, leverage, locality)
- New "Mapping to this repo's identifiers" table — abstract
term -> concrete file shape (e.g. seam -> *.interface.ts +
DI symbol + manifest entry + <gen:*> anchor)
- Rejected framings extended with our reserved meanings
("boundary" stays the ESLint workspace-tag term; "service"
stays the DI port term)
Per user follow-up: vocabulary anchored so that "module" defaults
to "feature" in this repo (since features are our primary unit of
organisation). Abstract refactor sense survives only when the cross-
scale abstraction is the point. Glossary.md updated:
- "Feature" entry adds the "module = feature in refactor sense"
cross-link
- New "Architecture refactor vocabulary" section with 9 terms
(Module, Interface (refactor sense), Implementation, Depth,
Seam, Adapter, Leverage, Locality, Deletion test, Deepening)
— all framed so feature is the primary instance
- Flagged ambiguities entry for "module" rewritten to capture the
three coexisting senses (workspace package / Node ESM / refactor
vocabulary defaulting to feature); new entries for "seam" and
"adapter" to prevent drift with the existing "boundary" / "service"
/ "scope" reservations
Hooks updated:
- session-start.sh skills line lists the new skill
- prompt-context.sh adds a 10th keyword group firing on
refactor / deepening / shallow / architecture / seam / adapter /
interface design / design it twice — inject points at SKILL.md
+ summarises the vocabulary and hard constraints
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
9.5 KiB
name, description
| name | description |
|---|---|
| improve-codebase-architecture | Find deepening opportunities in this repo, informed by docs/glossary.md and the ADRs in docs/decisions/. Use when the user wants to improve architecture, find refactoring opportunities, consolidate tightly-coupled modules, or make the codebase more testable and AI-navigable. Respects the conformance system, boundary rules, and the 21 ADRs that govern shape. |
Improve Codebase Architecture
Surface architectural friction and propose deepening opportunities — refactors that turn shallow modules into deep ones. The aim is testability and AI-navigability, scoped to what this template's existing rules permit.
Glossary
Use these terms exactly in every suggestion. Consistent language is the point — don't drift into "component," "service" (we use that for DI ports specifically), or "boundary" (we use that for ESLint workspace-tag rules). Full definitions in LANGUAGE.md.
- Module — in this repo defaults to "feature" (
packages/<name>/). The abstract definition (anything with interface + implementation) still applies at narrower scales — a use case, controller, repository/service port, binder, or core package can also be a module — but say "feature" whenever that's the scope. Reach for "module" only when comparing depth/leverage across scales. - Interface — everything a caller must know to use the module: types, invariants, error modes, ordering, schemas, DI shape. Not just the TypeScript type signature.
- Implementation — the code inside.
- Depth — leverage at the interface: a lot of behaviour behind a small interface. Deep = high leverage. Shallow = interface nearly as complex as the implementation.
- Seam — where an interface lives; a place behaviour can be altered without editing in place. In this repo seams take a concrete shape:
*.interface.tsfiles, DI symbols, manifest declarations, and<gen:*>anchors. - Adapter — a concrete thing satisfying an interface at a seam. In this repo:
<x>.repository.ts(Payload real impl) vs<x>.repository.mock.tsvsRecording*test doubles. - Leverage — what callers get from depth.
- Locality — what maintainers get from depth: change, bugs, knowledge concentrated in one place.
Key principles (see LANGUAGE.md for the full list):
- Deletion test: imagine deleting the module. If complexity vanishes, it was a pass-through. If complexity reappears across N callers, it was earning its keep.
- The interface is the test surface.
- One adapter = hypothetical seam. Two adapters = real seam.
This skill is informed by the project's domain model and architecture decisions. Read docs/glossary.md for project vocabulary and the relevant ADR(s) in docs/decisions/ before proposing anything in their territory.
Hard constraints (do not propose violations)
These are settled decisions — propose deepening WITHIN them, never against them:
- Factory-function use cases & controllers (ADR-012, ADR-013) — every use case is
(deps) => async (input) => output; every controller is one verb-noun pair per file with a co-locatedpresenter. - Schemas in the use-case file (ADR-013) —
xInputSchema/xOutputSchemacolocate with the factory; don't propose moving them to a separate module. - Per-feature DI containers (ADR-008) — don't propose a single global container.
- Five boundary tags + the dependency-direction matrix (ADR-006, ADR-010) — features may depend only on
core+tooling; cross-feature reactions go throughIEventBus(ADR-015). - Manifest-first ordering (ADR-012, ADR-020) — new use cases land manifest → contracts → tests → impl; don't propose collapsing the steps.
- Brand-based conformance —
Instrumented/Captured/Auditedare attached at DI bind time viawithSpan/withCapture/withAudit; don't propose moving the wrapping elsewhere. - Generator-first —
pnpm turbo gen <kind>is the entry point for new features/events/jobs/realtime channels. Don't propose hand-rolled scaffolding. - Conventional Commits (CLAUDE.md Key Conventions) — any refactor lands as conventional-commit messages.
- Hybrid versioning (ADR-021) — refactors that move code between feature packages have version + CHANGELOG implications.
If a proposed deepening would violate an ADR, surface it explicitly with the ADR number and a "worth reopening because…" justification — but only if the friction is real enough. Most should be silently scoped out.
Process
1. Explore
Read docs/glossary.md and any ADRs in the area you're touching first. Then walk the codebase noting friction. The primary unit of attention is the feature (packages/<name>/); narrower units (use cases, controllers, repositories) get attention when the friction lives at that scale.
- Where does understanding one concept require bouncing between many small files across
entities/,application/,infrastructure/,interface-adapters/inside a single feature? - Where is a feature shallow — its public surface (the
.+./ui+./apiexports) nearly as complex as its internal implementation? Or where inside a feature is a smaller unit shallow:- A
service.interface.tswith one method that wraps a single repository call. - A
presenterthat's just(x) => x. - A controller body that's just
useCase(parsed.data)with no transformation. - A repository wrapping another repository.
- A use case wrapping another use case.
- A
- Where have pure functions been extracted just for testability, but the real bugs hide in how they're called (no locality)?
- Where do tightly-coupled features leak across their seams? (e.g. a feature reaches into another feature's internals via deep import — though ESLint should catch this.)
- Which parts are untested, or hard to test through their current interface?
pnpm coverage:diffandpnpm fallowsurface candidates.
Useful exploration shortcuts in this repo:
pnpm fallow— dead exports, dupes, complexity hotspots, circular depspnpm fallow:audit— the AI-change audit; surfaces drift across recent editsgit log --oneline --follow -- <path>— change frequency is a depth signalcat packages/<feature>/src/feature.manifest.ts— declared surface of a featurepnpm turbo boundaries— workspace dependency graph
Apply the deletion test to anything you suspect is shallow: would deleting it concentrate complexity, or just move it? A "yes, concentrates" is the signal you want.
2. Present candidates
Present a numbered list of deepening opportunities. For each candidate:
- Files — which files / features / smaller units are involved (give exact paths)
- Problem — why the current architecture is causing friction
- Solution — plain English description of what would change
- Benefits — explained in terms of locality and leverage, plus how tests would improve
- ADR impact — any ADR this touches. If the proposed change conflicts with a current ADR, mark it explicitly: "contradicts ADR-NNN — worth reopening because…" (only when the friction warrants it).
- Manifest impact — if the change moves use cases / events / jobs / channels across features, the
feature.manifest.tsof each feature involved will need an update; flag this so the user knows the conformance gates will require manifest edits before code edits (manifest-first ordering).
Use docs/glossary.md vocabulary for the domain (use case, manifest, slice, feature, etc.) and LANGUAGE.md vocabulary for the architecture (module, seam, adapter, depth, leverage, locality).
Do NOT propose interfaces yet. Ask the user: "Which of these would you like to explore?"
3. Grilling loop
Once the user picks a candidate, drop into a grilling conversation. Walk the design tree with them — constraints, dependencies, the shape of the deepened module, what sits behind the seam, what tests survive.
Side effects happen inline as decisions crystallize:
- Naming a deepened module after a concept not in
docs/glossary.md? Add the term to the glossary right there — same discipline as thegrill-with-docsskill. Pick the appropriate section (Packages / Architecture layers / Feature building blocks / Conformance / Cross-feature / Instrumentation / Workflow / Releasing). - Sharpening a fuzzy term during the conversation? Update the glossary inline.
- User rejects the candidate with a load-bearing reason? Offer an ADR, framed as: "Want me to record this as ADR-NNN so future architecture reviews don't re-suggest it?" Only offer when the reason would actually be needed by a future agent to avoid re-suggesting the same thing. The next ADR number is
001 + max(existing)(currentlyADR-022). Our ADR shape:Context → Decision → Alternatives considered → Consequences → Related. - Refactor will move code between feature packages? Flag the release-please impact: both affected packages will bump versions on the next release PR.
- Want to explore alternative interfaces for the deepened module? See INTERFACE-DESIGN.md.
Related skills
grill-with-docs(.claude/skills/grill-with-docs/) — stress-tests plans against ADRs + glossary + manifests; share the same glossary-update discipline.to-prd— if the deepening is large enough to merit a multi-task epic, materialize it as a PRD.