agentic-dev-template/docs/decisions/adr-023-ci-security-and-supply-chain.md
Danijel Martinek 08bc19293a ci(release): attach CycloneDX SBOM to every GitHub release
Amends release-please.yml with conditional steps that run only when
release-please cuts a release:
- checkout + pnpm install to give @cyclonedx/cyclonedx-npm the full
  resolved workspace graph
- pnpm dlx @cyclonedx/cyclonedx-npm generates a CycloneDX 1.6 JSON SBOM
  named sbom-<tag>.cdx.json; --ignore-npm-errors is required because
  npm ls exits non-zero for dev-deps-of-dev-deps pnpm correctly elides
- softprops/action-gh-release@<SHA> (v3.0.0, Renovate-managed) attaches
  the file to the GitHub release as a downloadable asset

Adds ADR-023 §9 amendment documenting the step shape, rationale for
pnpm dlx (avoids lockfile per ADR-022), --ignore-npm-errors behaviour,
SHA pinning per ADR-023 §1, and the extended failure-mode table.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 11:31:08 +00:00


ADR-023 — CI security + supply-chain enforcement stack

Status: Accepted
Date: 2026-05-14
Builds on: ADR-022 (library evaluation policy)
Related: ADR-006 (vertical feature packages), ADR-010 (turbo boundaries), ADR-019 (sandcastle agent orchestration), ADR-021 (release-please versioning)
Companion guide: docs/guides/ci-security.md (to be written; human reading-room)

Context

ADR-022 codified the library-evaluation policy: at adoption time, every new runtime dependency in a feature- or core-tier package is gated by 8 hard filters + 3 prompts and produces a permanent trace at docs/library-decisions/. That closes the decision gate. It does not close the drift gate. Once a library is in the lockfile, ADR-022 has nothing to say about:

  1. CVE disclosures against the current pinned version. A library that passes pnpm audit --audit-level=moderate clean at adoption can have a critical CVE published against it the next day. The trace's verification-commands snapshot goes stale silently.
  2. Supply-chain behavior compromise. The disclosed-CVE world only catches vulnerabilities that someone has filed. Packages with malicious behavior — event-stream (2018), ua-parser-js (2021), tj-actions/changed-files (2025), xz-utils (2024) — shipped malware that no CVE database had seen at the moment of compromise. CVE scanning is a lagging indicator.
  3. Maintainer-account compromise. A trusted upstream maintainer's npm account gets phished. The next 1.2.4 patch publishes a malicious post-install script. Every consumer pulling ^1.2.0 inherits it. Renovate or Dependabot will happily open a bump PR.
  4. GitHub Actions supply chain. This repo's 5 existing workflows pin their actions to major-version tags (actions/checkout@v4, pnpm/action-setup@v4, googleapis/release-please-action@v4). The tj-actions/changed-files incident demonstrated that anyone who gains write access to an action's repository can point an existing tag at malicious code, and everyone pinned to @v4 silently inherits it. Major-tag pinning is a documented insecure practice.
  5. License drift. Upstream packages occasionally relicense (Sentry went BSL on a major; Elasticsearch went SSPL). A 1.x → 2.x Renovate PR might silently move a previously MIT-licensed dep to a copyleft or source-available license that violates ADR-022's filter #1.
  6. EU-residency drift. A vendor (Sentry, PostHog, etc.) announces US-only changes mid-flight. The trace's eu-residency: ok snapshot becomes false; ADR-022's filter #6 has no way to detect this.

The repo's current security posture, audited 2026-05-14: zero security tooling. No Dependabot config, no Renovate, no CodeQL, no Snyk, no Trivy, no OSV-Scanner, no Socket, no gitleaks, no pnpm audit signatures step. GitHub Actions are pinned to major-version tags. The 5 existing workflows (ci.yml, coverage-snapshot.yml, mutation-nightly.yml, release-please.yml, sentry-pii-guard.yml) cover functional CI, but nothing surfaces a post-adoption supply-chain signal.

For a GDPR-bound EU-resident template that ships as agent-friendly infrastructure, this is the load-bearing gap that ADR-022 cannot close alone. The decision below extends ADR-022 with a continuous-validation counterpart and adds five orthogonal layers that catch the threat surface ADR-022's adoption-time gate doesn't see.

Decision

Adopt a four-pillar CI security and supply-chain enforcement stack: (1) Renovate-managed bumps + Action SHA pinning, (2) Socket-based supply-chain-behavior detection, (3) continuous trace revalidation extending ADR-022, (4) baseline GitHub-native gates (CodeQL + secret scanning + sigstore provenance). Each pillar composes with the existing 5-gate conformance pattern from ADR-012 — layered enforcement at declining latencies.

1. Renovate adoption (bumps + Action SHA pinning)

.github/renovate.json configures Renovate to manage all runtime + dev dependency bumps and to SHA-pin every GitHub Action invocation:

  • npm bumps: per-workspace package.json updates honored. Minor + patch bumps grouped by ecosystem cluster (e.g. one weekly PR for all @sentry/*, one for all @opentelemetry/*). Auto-merge enabled for green minor + patch PRs. Major bumps require human (or agent) review; see §3.
  • Dockerfile bumps: .sandcastle/Dockerfile's node:22-bookworm-slim base image gets the same treatment as npm.
  • Action SHA pinning: pinGitHubActionDigests rewrites every uses: <owner>/<repo>@<tag> to uses: <owner>/<repo>@<40-char-sha> # <tag> on first run. Subsequent Action releases produce bump PRs that update the SHA + the trailing comment in one diff.
  • Vulnerability alerts stay on the GitHub-native server-side surface (no Dependabot bump PRs). Server-side alerts compose with Renovate bumps: an alert may trigger a manual Renovate :rebase to accelerate a particular bump.
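
The SHA-pinning rewrite described above can be illustrated with a small sketch. This is not Renovate's implementation; it only mirrors the before/after shape of a uses: line. The DigestLookup map and both function names are hypothetical, and resolving a tag to its commit SHA is assumed to happen elsewhere.

```typescript
// Hypothetical lookup: "<owner>/<repo>@<tag>" -> 40-char commit SHA.
// How the SHA is resolved (e.g. via the GitHub API) is out of scope here.
type DigestLookup = Record<string, string>;

// Matches `uses: owner/repo@ref` with optional list dash and trailing comment.
const USES_LINE = /^(\s*(?:-\s*)?uses:\s*)([\w.-]+\/[\w.\/-]+)@([\w.-]+)\s*(?:#.*)?$/;
const FULL_SHA = /^[0-9a-f]{40}$/;

// Rewrites one workflow line to `uses: owner/repo@<sha> # <tag>`.
// Lines that are not `uses:` lines, are already pinned, or reference a tag
// missing from the lookup are returned unchanged.
function pinUsesLine(line: string, digests: DigestLookup): string {
  const match = USES_LINE.exec(line);
  if (match === null) return line;
  const prefix = match[1] ?? "";
  const action = match[2] ?? "";
  const ref = match[3] ?? "";
  if (FULL_SHA.test(ref)) return line; // already pinned to a full SHA
  const sha = digests[action + "@" + ref];
  if (sha === undefined || !FULL_SHA.test(sha)) return line;
  return prefix + action + "@" + sha + " # " + ref;
}

// Applies the rewrite to every line of a workflow file.
function pinWorkflow(source: string, digests: DigestLookup): string {
  return source
    .split("\n")
    .map((line) => pinUsesLine(line, digests))
    .join("\n");
}
```

Keeping the tag in the trailing comment is what lets a later bump PR update the SHA and the human-readable version in one diff.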

Renovate over Dependabot for this repo specifically because:

  • pnpm-workspace support is mature and per-workspace updates work from one config file (Dependabot requires verbose per-workspace blocks).
  • pinGitHubActionDigests is native (Dependabot SHA-pinning requires manual config).
  • PR grouping rules are more granular — one PR per ecosystem cluster instead of per-package noise.
  • Major/minor split + automerge is one-liner config (Dependabot requires a separate GitHub Action for automerge).

2. Socket.dev integration (supply-chain behavior detection)

Layered free-tier integration; no paid plan required:

  • Socket GitHub App installed on the repo. Posts risk-score comments on every PR that touches package.json / pnpm-lock.yaml. Free for OSS use.
  • socket-cli CI step in ci.yml's validate job. Runs socket-cli scan against the lockfile and fails the job on configurable severity. Configuration in .socket.json:
    { "issueRules": { "critical": "error", "high": "warn", "medium": "ignore" } }
    
    Default: critical → block the PR; lower severities → comment only.
  • Sandcastle reviewer prompt reads Socket CI output via the GitHub CLI and rejects the agent slice when a critical finding is present. Adds machine-readable enforcement to the agent dispatch loop.
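
As a rough illustration of how the issueRules mapping above could translate into a pass/fail decision, here is a minimal sketch. The Finding shape, the function name, and the choice to treat unlisted severities as "ignore" are assumptions made for this example; they are not Socket's API or output format.

```typescript
type RuleAction = "error" | "warn" | "ignore";
type IssueRules = Record<string, RuleAction>;

// Hypothetical shape of one scan finding; the real CLI output may differ.
interface Finding {
  packageName: string;
  severity: string;
  summary: string;
}

interface ScanVerdict {
  failJob: boolean;
  blocking: Finding[];
  advisory: Finding[];
}

// Mirrors the .socket.json defaults quoted in the ADR.
const DEFAULT_RULES: IssueRules = { critical: "error", high: "warn", medium: "ignore" };

// "error" findings fail the job, "warn" findings are reported only.
// Assumption: a severity with no rule is treated as "ignore".
function evaluateFindings(findings: Finding[], rules: IssueRules = DEFAULT_RULES): ScanVerdict {
  const blocking: Finding[] = [];
  const advisory: Finding[] = [];
  for (const finding of findings) {
    const action: RuleAction = rules[finding.severity] ?? "ignore";
    if (action === "error") blocking.push(finding);
    else if (action === "warn") advisory.push(finding);
  }
  return { failJob: blocking.length > 0, blocking, advisory };
}
```

The same verdict object could feed both the CI exit code and the reviewer prompt, so the two surfaces never disagree about what counts as blocking.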

Socket adds a 9th hard filter to evaluate-library (ADR-022's filter set). New trace frontmatter field:

filter-results:
  socket-risk: clean | flagged | "<finding-summary>"

At adoption time the skill runs socket-cli scan <package> and records the result. The continuous monitor surface is §3 (trace revalidation), which re-runs the same command on schedule.

3. Trace revalidation cron (ADR-022 continuous-validation counterpart)

New workflow at .github/workflows/trace-revalidation-weekly.yml. Runs weekly via cron + on-demand via workflow_dispatch. Mirrors the cadence shape of mutation-nightly.yml.

Scope: every approved + pre-shipped trace under docs/library-decisions/. Rejection traces skipped (no signal value in re-validating an already-rejected library).

Action — for each in-scope trace:

  1. Read the trace's verification-commands: block.
  2. Re-run each command, capture stdout/stderr.
  3. Compare against the trace's filter-results: snapshot.
  4. Classify divergence:
    • Soft — CVE count changed without crossing severity threshold; maintenance signal still active but downgraded one level; transitive dep count changed.
    • Hard — license changed; named-consumer no longer present; critical CVE disclosed; EU residency flipped to fail; Socket flag escalated to critical.
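
The soft/hard split above could be sketched as a pure comparison between the stored snapshot and a fresh run. The TraceSnapshot fields are hypothetical stand-ins for whatever filter-results actually records, and treating any drop in the maintenance level as soft is an assumption, since the ADR only names a one-level downgrade.

```typescript
// Hypothetical, simplified view of a trace's filter-results snapshot.
interface TraceSnapshot {
  license: string;
  namedConsumerPresent: boolean;
  criticalCveCount: number;
  totalCveCount: number;
  euResidency: "ok" | "fail";
  socketSeverity: "clean" | "flagged" | "critical";
  maintenanceLevel: number; // hypothetical ordinal, higher means healthier
  transitiveDepCount: number;
}

interface Divergence {
  kind: "none" | "soft" | "hard";
  reasons: string[];
}

function classifyDivergence(snapshot: TraceSnapshot, current: TraceSnapshot): Divergence {
  const hard: string[] = [];
  const soft: string[] = [];

  // Hard signals named in the ADR.
  if (current.license !== snapshot.license) hard.push("license changed");
  if (snapshot.namedConsumerPresent && !current.namedConsumerPresent) hard.push("named consumer no longer present");
  if (current.criticalCveCount > snapshot.criticalCveCount) hard.push("critical CVE disclosed");
  if (snapshot.euResidency === "ok" && current.euResidency === "fail") hard.push("EU residency flipped to fail");
  if (snapshot.socketSeverity !== "critical" && current.socketSeverity === "critical") hard.push("Socket flag escalated to critical");

  // Soft signals named in the ADR.
  if (current.totalCveCount !== snapshot.totalCveCount) soft.push("CVE count changed");
  if (current.maintenanceLevel < snapshot.maintenanceLevel) soft.push("maintenance signal downgraded");
  if (current.transitiveDepCount !== snapshot.transitiveDepCount) soft.push("transitive dep count changed");

  // A hard reason always wins; soft reasons are reported only when no hard reason exists.
  if (hard.length > 0) return { kind: "hard", reasons: hard };
  if (soft.length > 0) return { kind: "soft", reasons: soft };
  return { kind: "none", reasons: [] };
}
```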

Issue management:

  • Soft divergence appends to a single rolling "library-trace dashboard" GitHub issue kept open continuously. One issue total, updated each run with the latest comparison diff. Labeled library-policy/dashboard.
  • Hard divergence opens a fresh per-dep GitHub issue labeled library-policy/re-evaluation. Title format: [trace-revalidation] <package> — <reason>. The issue body cites the trace path + the verification-command output + the diff.

Auto-edit policy: trace revalidation NEVER edits a trace file. The re-walk needs the evaluate-library skill (8 filters + 3 prompts, with agent judgement). CI catches divergence; the dispatch loop fixes it.

Auto-dispatch policy: library-policy/re-evaluation issues are not auto-picked up by the dispatch loop. Human triage required. Auto-dispatch on CI-opened issues would create a feedback loop where the agent loop spends nights re-evaluating libraries based on rotating CVE data. Issues are a queue; humans drain them via pnpm work dispatch.

Main-CI gating policy: hard divergence does NOT fail CI on main. Main can keep deploying while the trace gets re-walked. Gating on main would block release-please PRs every time a CVE drops upstream.

4. Baseline GitHub-native gates

  • CodeQL at .github/workflows/codeql.yml. Language config javascript-typescript covers everything this repo is. Runs on push to main + PRs + weekly schedule. Free for public repos and on Pro/Team/Enterprise plans for private repos; consumers without a CodeQL-eligible plan get a no-op + a clear error message from GitHub.
  • pnpm audit signatures --audit-level=high added as one step in ci.yml's existing validate job. Verifies npm sigstore attestations. ~40% of the registry is signed today and climbing.
  • Secret scanning — two layers:
    • GitHub-native push protection (server-side, free, blocks pushes containing known token patterns at the GitHub edge). Consumer toggles in repo settings. Documented in docs/guides/ci-security.md.
    • gitleaks pre-commit hook wired into .husky/pre-commit as a step alongside the existing state-sync guard. Catches custom token patterns the GitHub allowlist doesn't know about. Local; free.

5. Failure-mode hierarchy

Two principles govern what blocks vs. what comments:

  • Boolean checks (compiles, schema valid, signature verifies, secret present, trace file present) hard-block. They have a definite right answer.
  • Judgment checks (Socket risk score, CodeQL semantic finding) are advisory unless severity reaches critical / error. They can have false positives.

Concrete table (the source of truth referenced by reviewer-prompt and documentation):

| Gate | Layer | Hard block? |
| --- | --- | --- |
| pnpm typecheck && test && lint && conformance && coverage:diff | CI | Yes |
| State-sync guard | pre-commit | Yes |
| gitleaks (custom patterns) | pre-commit | Yes |
| Library-trace presence check (ADR-022) | pre-commit | Yes |
| GitHub native push protection | server-side | Yes (GitHub edge) |
| Renovate minor/patch bump PRs | CI | Auto-merge if green |
| Renovate major bump PRs | CI | Block until evaluate-library re-run + trace refresh |
| Socket CI step, critical | CI | Yes |
| Socket CI step, high or below | CI | Advisory |
| Socket GitHub App PR comments | server-side | Advisory |
| CodeQL, error severity | CI | Yes |
| CodeQL, warning / note | CI | Advisory |
| pnpm audit signatures failure | CI | Yes |
| GitHub Dependabot vuln alerts | server-side | Advisory (post-merge) |
| Trace revalidation, soft divergence | weekly cron | Dashboard issue |
| Trace revalidation, hard divergence | weekly cron | Per-dep issue |
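
The two principles behind this table (boolean checks always block on failure; judgment checks block only at critical or error severity) can be expressed as a single predicate. This is a sketch under assumptions: the CheckResult union, the severity vocabulary, and both function names are illustrative and not part of any tool's output.

```typescript
// Hypothetical normalized result for one gate.
type CheckResult =
  | { kind: "boolean"; gate: string; passed: boolean }
  | { kind: "judgment"; gate: string; severity: "critical" | "error" | "high" | "warning" | "note" | "none" };

// Boolean checks have a definite right answer, so any failure blocks.
// Judgment checks can have false positives, so only critical/error blocks.
function isBlocking(result: CheckResult): boolean {
  if (result.kind === "boolean") return !result.passed;
  return result.severity === "critical" || result.severity === "error";
}

// Collects the gates that would cause a reviewer to reject a slice.
function reviewerVerdict(results: CheckResult[]): { reject: boolean; blockingGates: string[] } {
  const blockingGates = results.filter(isBlocking).map((result) => result.gate);
  return { reject: blockingGates.length > 0, blockingGates };
}
```

Encoding the rule once, rather than per tool, is one way to keep the reviewer prompt and the documentation in agreement with the table.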

6. Amendments to ADR-022

This ADR amends ADR-022 in three places. ADR-022 itself stays unedited (its Status: Accepted is preserved for provenance); the amendments are recorded here and the new behavior is what the implementation honors.

§6.1 — Major-bump re-evaluation trigger. ADR-022 §1 and §8 spoke of "new runtime dependencies" but did not address bumps to existing deps. When Renovate (§1 above) bumps a runtime dep in a feature- or core-tier package and the bump crosses a semver-major boundary, the evaluate-library skill re-runs against the upgraded version. Minor + patch bumps do not trigger re-evaluation. The existing trace file is updated in-place: version, filter-results, verification-commands, and last-revalidated (new field — see §6.2) are refreshed; the original date field is preserved as the adoption-provenance marker.

§6.2 — last-revalidated frontmatter field. Trace schema gains last-revalidated: <YYYY-MM-DD>, set by both major-bump re-eval (§6.1) and trace revalidation (§3). Separate from the original date field which is immutable post-adoption.

§6.3 — Socket as 9th hard filter. ADR-022's 8 hard filters gain a 9th: socket-risk. Trace frontmatter's filter-results: block adds socket-risk: clean | flagged | "<finding-summary>". At adoption time the evaluate-library skill runs socket-cli scan <package> as part of the cheap-structural filter block; critical findings auto-reject. Verification-commands gains the Socket scan command.
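
A minimal sketch of §6.1 and §6.2 together: detect a semver-major crossing, then refresh the trace while leaving the adoption date untouched. The TraceFrontmatter shape is a hypothetical subset of the real schema, and the version parser deliberately ignores pre-release tags and does not treat 0.x releases specially.

```typescript
// Hypothetical subset of the trace frontmatter.
interface TraceFrontmatter {
  packageName: string;
  version: string;
  date: string; // adoption-provenance marker, immutable post-adoption
  "last-revalidated"?: string;
}

// Extracts the major component from "1.2.3" or "v1.2.3"; null when unparsable.
function parseMajor(version: string): number | null {
  const match = /^v?(\d+)\.\d+\.\d+/.exec(version.trim());
  return match === null ? null : Number(match[1]);
}

// True only when both versions parse and the major component increases.
function isMajorBump(fromVersion: string, toVersion: string): boolean {
  const fromMajor = parseMajor(fromVersion);
  const toMajor = parseMajor(toVersion);
  return fromMajor !== null && toMajor !== null && toMajor > fromMajor;
}

// Returns a refreshed copy: version and last-revalidated change, date does not.
// Refreshing filter-results and verification-commands is left to the skill.
function refreshAfterMajorBump(trace: TraceFrontmatter, newVersion: string, today: string): TraceFrontmatter {
  return { ...trace, version: newVersion, "last-revalidated": today };
}
```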

7. Composition with the sandcastle reviewer prompt

The reviewer prompt at .sandcastle/reviewer.prompt.md is extended with two new responsibilities (bundled into the library-evaluation epic's existing story 06, not split into a separate story):

  • Read Socket CI output (via gh run view or PR API) and reject the slice if any critical finding is present.
  • Read CodeQL findings and reject the slice if any error severity is present.

These compose with the reviewer's existing responsibilities (library-trace presence check from ADR-022's PRD, pnpm coverage:diff from ADR-020).

8. Template-vs-consumer framing

This stack ships as template artifacts that work in any consumer's GitHub repo. Configurations (renovate.json, .socket.json, codeql.yml, trace-revalidation-weekly.yml) are written generically:

  • No project-name-specific paths.
  • All workflows use ubuntu-latest.
  • Plan-gated tools (CodeQL on private repos) include a clear error message when the consumer's plan doesn't cover them, rather than no-op-ing silently.
  • docs/guides/ci-security.md documents what each consumer toggles (GitHub push protection, Socket App install, branch protection rules for library-policy/re-evaluation labels).

This template's own consumption of the stack — when it's eventually pushed to a GitHub remote — uses the same configurations unchanged.

9. Amendment — SBOM release artifact (CycloneDX)

Added: 2026-05-20 (story 10-sbom-ci-workflow)

.github/workflows/release-please.yml is amended to generate a CycloneDX SBOM and attach it to every GitHub release cut by release-please.

Concrete step shape:

- name: Generate CycloneDX SBOM
  if: ${{ steps.release.outputs.releases_created == 'true' }}
  run: >
    pnpm dlx @cyclonedx/cyclonedx-npm
    --output-file sbom-${{ steps.release.outputs.tag_name }}.cdx.json
    --output-format json
    --ignore-npm-errors

- name: Attach SBOM to GitHub release
  if: ${{ steps.release.outputs.releases_created == 'true' }}
  uses: softprops/action-gh-release@b4309332981a82ec1c5618f44dd2e27cc8bfbfda # v3.0.0
  with:
    tag_name: ${{ steps.release.outputs.tag_name }}
    files: sbom-${{ steps.release.outputs.tag_name }}.cdx.json

Prerequisites (also conditional on releases_created == 'true'): actions/checkout@v4, pnpm/action-setup@v4, actions/setup-node@v4, then pnpm install --frozen-lockfile, providing the installed workspace graph that @cyclonedx/cyclonedx-npm analyses.

Rationale:

  • Compliance surface. Consumers pursuing SOC 2 / ISO 27001 / FedRAMP / EU CRA must produce an inventory of every version that shipped. A CycloneDX JSON SBOM attached to each GitHub release gives auditors a machine-readable, per-release artifact without requiring them to inspect or re-run the repo.
  • pnpm dlx, not package.json. @cyclonedx/cyclonedx-npm is a release-time audit tool, not a runtime or build dependency; adding it to the lockfile would violate ADR-022's spirit (library evaluation required for runtime deps in feature/core packages). pnpm dlx fetches and discards it within the CI step.
  • --ignore-npm-errors. @cyclonedx/cyclonedx-npm internally invokes npm ls to traverse the dependency graph. In a pnpm workspace, npm ls exits non-zero when it encounters dev-deps-of-dev-deps that pnpm correctly elides from the install tree; the SBOM content is unaffected. Without this flag the step exits 254 and produces no file. --ignore-npm-errors instructs the tool to treat those npm ls warnings as non-fatal and emit the SBOM regardless.
  • SHA-pinned action. softprops/action-gh-release is pinned to a 40-character commit SHA (# v3.0.0 trailing comment) per §1's Renovate pinGitHubActionDigests preset. Renovate will open a bump PR when a newer release is available.
  • Conditional execution. The SBOM steps run only when releases_created == 'true' — every non-release push to main skips them entirely. This keeps the workflow fast for the common case (release-please just updating its rolling PR).
  • Root SBOM covers all workspace packages. Running @cyclonedx/cyclonedx-npm from the workspace root after pnpm install --frozen-lockfile captures the full resolved dependency graph across all packages. Per-package SBOMs are out of scope (see story 10-sbom-ci-workflow §Out of scope).
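
For consumers who want to sanity-check the attached artifact, the following sketch derives the asset name used in the step above and reads the top-level fields of a CycloneDX JSON document. The function names are hypothetical, and only the bomFormat, specVersion, and components fields are inspected; this is not a schema validator.

```typescript
// Minimal view of the CycloneDX JSON fields this sketch reads.
interface CycloneDxComponent {
  name: string;
  version?: string;
  purl?: string;
}

interface CycloneDxBom {
  bomFormat: string;
  specVersion: string;
  components?: CycloneDxComponent[];
}

// Asset name as produced by the release step: sbom-<tag>.cdx.json
function sbomFileName(tag: string): string {
  return "sbom-" + tag + ".cdx.json";
}

// Parses SBOM text and returns a small summary; throws when the document
// does not declare itself as CycloneDX.
function summarizeSbom(jsonText: string): { specVersion: string; componentCount: number } {
  const bom = JSON.parse(jsonText) as Partial<CycloneDxBom>;
  if (bom.bomFormat !== "CycloneDX") {
    throw new Error("not a CycloneDX document");
  }
  const components = Array.isArray(bom.components) ? bom.components : [];
  return { specVersion: String(bom.specVersion), componentCount: components.length };
}
```

A check like this could run in a consumer's own audit tooling after downloading the release asset; it is not part of the release workflow described here.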

Failure-mode table row (extends §5):

| Gate | Layer | Hard block? |
| --- | --- | --- |
| SBOM generation (@cyclonedx/...) | release | Yes (release job fails) |
| SBOM upload (softprops/action-gh-release) | release | Yes (release job fails) |

SBOM absence blocks the release job; it does not gate main CI (the ci.yml validate job is unaffected). This matches the principle that release assets are part of the release job, not part of the per-PR validation loop.

Alternatives considered

  • Dependabot for everything instead of Renovate. Rejected. Less granular monorepo handling, requires verbose per-workspace config, Action SHA pinning needs manual setup. Renovate's pnpm-workspace + pinGitHubActionDigests presets do this declaratively.
  • No bump tool, manual bumps only. Rejected. Deps go stale; CVE patches land late; the named-consumer-cares-now signal becomes "who even remembers."
  • Paid Socket Team plan for hard PR blocks. Rejected (default). Free App + self-hosted socket-cli in CI achieves equivalent enforcement at $0. Consumers who want the server-side branch-protection integration can upgrade per their own threat model.
  • Nightly trace revalidation instead of weekly. Rejected. License / maintenance / EU-residency signals don't change daily; nightly burns CI minutes on noise. CVE batches publish ~weekly.
  • Auto-dispatch on library-policy/re-evaluation issues. Rejected. Creates a feedback loop where the agent loop runs nightly re-evals on rotating CVE data. The issue queue stays human-triaged; the dispatch loop drains it on demand.
  • CI gating on library-policy/re-evaluation (block main). Rejected. Main can keep deploying while the trace gets re-walked. CI gating would block release-please PRs every time a CVE drops upstream, conflating release flow with policy maintenance.
  • Splitting into two ADRs (CI security + ADR-022 extensions). Rejected. The bump-trigger rule only makes sense once Renovate is in place; trace revalidation only makes sense alongside Socket; the failure-mode hierarchy spans both. One coherent decision, one ADR.
  • Editing ADR-022 in-place to add the bump rule + Socket filter + new field. Rejected. ADR-022's Status: Accepted is provenance — what we believed when it was signed. Amendments live here in §6 and the implementation honors the composed picture. This is the repo's first amendment-style ADR; if it works, future ADR drift gets the same pattern.

Consequences

Positive

  • Closes the post-adoption rot ADR-022 cannot reach alone — CVE drift, supply-chain behavior compromise, license drift, EU-residency drift, Action supply-chain attacks.
  • The trace artifact becomes continuously validated, not point-in-time. Approved libraries carry a last-revalidated freshness signal.
  • Renovate's major/minor split + automerge keeps the lockfile current with minimal human attention while still gating real decision moments.
  • Action SHA pinning closes the tj-actions/changed-files class of attack permanently.
  • Layered Socket integration (App + CLI + reviewer prompt) gives consumers a $0 baseline that's stricter than most paid offerings' defaults.
  • Failure-mode hierarchy is machine-readable: the sandcastle reviewer prompt becomes the single composable gate for agent-driven PRs.
  • Template-vs-consumer framing means downstream repos inherit the stack on day one without per-project setup.

Negative

  • Six new artifacts ship (renovate.json, .socket.json, codeql.yml, trace-revalidation-weekly.yml, gitleaks pre-commit step, updates to ci.yml + reviewer prompt). Each is a maintenance surface.
  • Renovate config is dense; consumers extending it past defaults need to learn its rules.
  • Socket GitHub App requires a per-consumer install (one click); the CLI step in CI works regardless.
  • Trace revalidation produces a steady stream of dashboard-issue updates that humans/agents must skim periodically. Most are no-action.
  • ADR-022 + ADR-023 together are the repo's first amendment chain. Future agents must read both to get the current policy picture.
  • Major-bump trigger means every semver-major Renovate PR blocks on agent walk-through of evaluate-library (~5 min agent work per major-bump per package).

Neutral

  • The three amendments to ADR-022 recorded in §6 don't change ADR-022's Status: Accepted. Future archaeology finds both ADRs and the implementation honors the composed picture.
  • Existing 5 workflows are untouched except ci.yml, which gains 2 steps (pnpm audit signatures and socket-cli scan), and release-please.yml, which gains the SBOM steps from the §9 amendment.
  • Glossary entries for Trace revalidation and Major-bump re-evaluation landed inline during the grill session that produced this ADR.

References

  • ADR-022 — Library evaluation policy (the foundation this builds on and amends in §6)
  • ADR-019 — Sandcastle agent orchestration (the reviewer prompt is the agent-loop enforcement surface, see §7)
  • ADR-021 — release-please versioning (Renovate's bump PRs interact with release-please's release PRs; both are managed automatically)
  • ADR-012 — Feature conventions (the conformance system shape this stack mirrors — layered enforcement at declining latencies)
  • ADR-017 — OpenTelemetry migration (vendor-isolation pattern; Socket integration follows the same shape — core-shared doesn't import Socket SDK)
  • Glossary entries for Library trace, Pre-shipped trace, Trace revalidation, Major-bump re-evaluation
  • PRD: docs/work/prds/ci-security-and-supply-chain.prd.md (to be written, materialized via /to-prd)
  • Companion guide: docs/guides/ci-security.md (to be written; human reading-room with worked examples)