agentic-dev/docs/decisions/adr-017-opentelemetry-migration.md

ADR-017 — OpenTelemetry Migration

Status: Accepted
Date: 2026-05-11
Spec: docs/superpowers/specs/2026-05-11-opentelemetry-migration-design.md
Plan: docs/superpowers/plans/2026-05-11-opentelemetry-migration.md
Supersedes (impl section): ADR-014

Context

ADR-014 established vendor-neutral ITracer + ILogger interfaces with Sentry as the active backend. The interface decisions (R31–R51) have held up; what was coupled to a vendor was the substrate: SentryTracer and SentryLogger called Sentry SDK methods directly. Swapping vendors required rewriting every *Tracer/*Logger pair.

This ADR migrates the substrate to OpenTelemetry: code emits OTel spans, logs, and metrics; exporters route to one or more backends. Sentry is wired as the (initially only) exporter via @sentry/opentelemetry. Swapping vendors becomes an exporter swap.
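
The "exporter swap" idea can be sketched without any SDK. The snippet below is a minimal illustration, not the project's implementation: `SpanRecord`, `SpanExporter`, `MemoryExporter`, and `Pipeline` are hypothetical stand-ins invented for this example and are not types from `@opentelemetry/*` or `@sentry/opentelemetry`. It only shows that call sites talk to one pipeline, and that adding a backend means adding an entry to the exporter list.

```typescript
// Hypothetical stand-ins for illustration; NOT the real OpenTelemetry types.
interface SpanRecord {
  name: string;
  attributes: Record<string, string>;
}

interface SpanExporter {
  readonly vendor: string;
  export(span: SpanRecord): void;
}

// In-memory exporter standing in for a real vendor exporter.
class MemoryExporter implements SpanExporter {
  readonly vendor: string;
  readonly received: SpanRecord[] = [];

  constructor(vendor: string) {
    this.vendor = vendor;
  }

  export(span: SpanRecord): void {
    this.received.push(span);
  }
}

// Call sites depend on the pipeline only, never on a specific vendor.
class Pipeline {
  private readonly exporters: SpanExporter[];

  constructor(exporters: SpanExporter[]) {
    this.exporters = exporters;
  }

  emit(name: string, attributes: Record<string, string> = {}): void {
    const record: SpanRecord = { name, attributes };
    // Every configured exporter receives the same signal.
    for (const exporter of this.exporters) {
      exporter.export(record);
    }
  }
}

// Adding a second backend is a change to this list, not to call sites.
const sentryLike = new MemoryExporter("sentry");
const secondLike = new MemoryExporter("second-backend");
const pipeline = new Pipeline([sentryLike, secondLike]);

pipeline.emit("checkout", { "http.method": "POST" });
```

In the real stack the pipeline role is played by the OTel SDK and the list entries are span/log processors or exporters; the shape of the dependency is the point here.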

Decision

  1. OTel SDK as substrate. Server-side ITracer and ILogger impls use @opentelemetry/api and @opentelemetry/api-logs respectively. New IMetrics signal added via OTel metrics API.
  2. Sentry-as-exporter. @sentry/opentelemetry provides SentrySpanProcessor + SentryLogRecordProcessor. They consume OTel signals and forward to Sentry. Sentry's UI experience is preserved (minus some browser-side richness, addressed below).
  3. Server-only scope. Browser keeps Sentry SDK directly. Replay + session-error correlation stay native. Future spec extends OTel to browser when warranted.
  4. Pure OTel Logs API for the logger. OtelLogger emits via @opentelemetry/api-logs. Trade-off: slightly degraded Sentry-native error UX (stack normalization, breadcrumb buffer) in exchange for swap-by-exporter vendor neutrality.
  5. Breadcrumbs → span events. ILogger.addBreadcrumb attaches to the active OTel span as an event. Native OTel pattern.
  6. setUser per-span. Sets user.id as a span attribute on the active span. R36 preserved (id only; no email/username).
  7. PII scrubbing migrated. From Sentry's beforeSend/beforeSendTransaction hooks to OTel SpanProcessor + LogRecordProcessor impls (PiiScrubSpanProcessor, PiiScrubLogRecordProcessor). Processors run BEFORE the Sentry exporter, so PII is stripped at the OTel layer regardless of downstream exporter. Browser init files (init-client.ts, init-client-react.ts) retain beforeSend/beforeSendTransaction hooks because they do not use the OTel pipeline.
  8. R52: new ESLint rule. Imports of @opentelemetry/sdk-*, @opentelemetry/exporter-*, @opentelemetry/instrumentation-*, @opentelemetry/resources, and @opentelemetry/semantic-conventions are restricted to **/instrumentation/otel/** and app init paths. @opentelemetry/api and @opentelemetry/api-logs are unrestricted within core-shared/instrumentation/.
  9. bindSentryInstrumentation renamed to bindOtelInstrumentation with a deprecation alias for one release cycle.
  10. IMetrics synchronous-only. Three methods: counter, histogram, gauge. gauge uses UpDownCounter under the hood; true "set" gauge semantics require an ObservableGauge with a periodic callback, deferred to a v2 metrics interface.
  11. Auto-instrumentations enabled. HTTP (@opentelemetry/instrumentation-http), undici (instrumentation-undici), pg (instrumentation-pg) registered in initOtelServerNode. HTTP instrumentation strips query strings from http.url.path attribute and ignores /_health and /_otel-export paths. PgInstrumentation has enhancedDatabaseReporting: false to avoid SQL statement capture (R32 — SQL often contains PII in WHERE clauses).
  12. no-sentry.ts → no-instrumentation.ts in core-testing. Renamed with backward-compat alias for one release. Mocks both Sentry SDK and OTel SDK modules to prevent real init in vitest runs.
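
Decisions 5 and 6 can be sketched as follows. This is an illustrative sketch under assumptions, not the actual OtelLogger: `SpanLike`, `RecordingSpan`, and `SketchLogger` are hypothetical names, and the `addBreadcrumb`/`setUser` signatures are guesses, since the ADR does not spell them out. The two span methods used (`addEvent`, `setAttribute`) mirror methods that exist on the OTel `Span` interface; in real code the active span would come from `trace.getActiveSpan()` in `@opentelemetry/api` rather than from an injected callback.

```typescript
// Hypothetical stand-in for the two span methods this sketch relies on.
interface SpanLike {
  addEvent(name: string, attributes?: Record<string, string>): void;
  setAttribute(key: string, value: string): void;
}

// Records calls in memory so the behaviour can be inspected without an SDK.
class RecordingSpan implements SpanLike {
  readonly events: Array<{ name: string; attributes: Record<string, string> }> = [];
  readonly attributes: Record<string, string> = {};

  addEvent(name: string, attributes: Record<string, string> = {}): void {
    this.events.push({ name, attributes });
  }

  setAttribute(key: string, value: string): void {
    this.attributes[key] = value;
  }
}

// Assumed logger shape; the real ILogger signatures may differ.
class SketchLogger {
  private readonly activeSpan: () => SpanLike | undefined;

  constructor(activeSpan: () => SpanLike | undefined) {
    this.activeSpan = activeSpan;
  }

  // Decision 5: a breadcrumb becomes an event on the active span.
  addBreadcrumb(message: string, data: Record<string, string> = {}): void {
    this.activeSpan()?.addEvent(message, data);
  }

  // Decision 6: only the id is recorded; email/username are dropped (R36).
  setUser(user: { id: string; email?: string }): void {
    this.activeSpan()?.setAttribute("user.id", user.id);
  }
}

let current: RecordingSpan | undefined = new RecordingSpan();
const span = current;
const logger = new SketchLogger(() => current);

logger.addBreadcrumb("cart.loaded", { items: "3" });
logger.setUser({ id: "u-1", email: "someone@example.com" });

// With no active span the breadcrumb has nowhere to attach and is dropped.
current = undefined;
logger.addBreadcrumb("after.span.ended");
```

The last call shows the semantic shift behind this decision: a breadcrumb recorded outside any span is lost in this sketch, whereas a buffered breadcrumb model would have kept it.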

Alternatives considered

  • Keep Sentry SDK directly. Rejected — couples impl to Sentry forever.
  • OTel SDK + keep Sentry-direct for captureException. Rejected — partial vendor swap re-introduces lock-in for the error path.
  • Migrate browser too. Rejected — OTel-Browser maturity in 2026 is good for traces but Sentry's browser SDK has features (replay, native error correlation) that don't yet have OTel equivalents.
  • Put PII scrub in Sentry exporter config. Rejected — beforeSend hooks run inside the Sentry SDK after OTel signals are converted; the OTel processor layer is earlier and vendor-agnostic. Scrubbing at the processor layer means any future exporter added alongside Sentry also sees clean data.
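
The ordering argument in the last alternative can be sketched as a scrub step that runs once, ahead of every exporter. This is a simplified illustration, not PiiScrubSpanProcessor itself: the `scrub` function, the `SENSITIVE_KEYS` deny list, the e-mail pattern, and the function-typed `Exporter` are all assumptions made for the example, and the real OTel `SpanProcessor` interface has a different shape.

```typescript
type Attributes = Record<string, string>;
type Exporter = (attributes: Attributes) => void;

// Hypothetical deny list; the project's actual rules are not given in the ADR.
const SENSITIVE_KEYS = ["user.email", "http.request.header.authorization"];
const EMAIL_PATTERN = /[^\s@]+@[^\s@]+\.[^\s@]+/g;

// Drops deny-listed keys and redacts e-mail-like substrings in the rest.
function scrub(attributes: Attributes): Attributes {
  const clean: Attributes = {};
  for (const [key, value] of Object.entries(attributes)) {
    if (SENSITIVE_KEYS.includes(key)) continue;
    clean[key] = value.replace(EMAIL_PATTERN, "[redacted]");
  }
  return clean;
}

// Scrubbing happens once, before any exporter is invoked, so every
// exporter (present or future) only ever sees the cleaned attributes.
function buildPipeline(exporters: Exporter[]): (attributes: Attributes) => void {
  return (attributes) => {
    const clean = scrub(attributes);
    for (const exporter of exporters) {
      exporter({ ...clean }); // copy so exporters cannot affect each other
    }
  };
}

const seenBySentry: Attributes[] = [];
const seenByOther: Attributes[] = [];
const emit = buildPipeline([
  (attributes) => seenBySentry.push(attributes),
  (attributes) => seenByOther.push(attributes),
]);

emit({
  "user.id": "u-1",
  "user.email": "a@b.co",
  note: "contact a@b.co now",
});
```

Had the scrub lived inside one exporter's configuration, the second exporter in this sketch would have received the raw attributes.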

Consequences

Positive:

  • Vendor swaps are exporter swaps. Adding Honeycomb / Datadog / Grafana Cloud / Tempo is just adding their exporter alongside Sentry's.
  • Auto-instrumentations (HTTP, undici, pg) reduce manual span boilerplate.
  • New IMetrics signal available; metrics call sites can land per-feature opportunistically.
  • PII scrubbing is vendor-neutral — applies before any exporter sees the data.
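
For the IMetrics signal, the gauge caveat from decision 10 is worth seeing in miniature. The sketch below assumes that `gauge` forwards its argument to an up-down counter as a delta; the ADR does not state the actual signature, so treat that as a guess. `SummingCounter` and `SketchMetrics` are hypothetical names, with `SummingCounter.add` standing in for the `add` method of an OTel `UpDownCounter`.

```typescript
// Stand-in for an additive instrument: every call adds to a running total.
class SummingCounter {
  total = 0;

  add(delta: number): void {
    this.total += delta;
  }
}

// Assumed shape of the gauge method only; counter and histogram are omitted.
class SketchMetrics {
  private readonly instruments = new Map<string, SummingCounter>();

  // Because the backing instrument is additive, the argument behaves
  // as a delta, not as an absolute "set to this value".
  gauge(name: string, delta: number): void {
    let instrument = this.instruments.get(name);
    if (!instrument) {
      instrument = new SummingCounter();
      this.instruments.set(name, instrument);
    }
    instrument.add(delta);
  }

  read(name: string): number {
    return this.instruments.get(name)?.total ?? 0;
  }
}

const metrics = new SketchMetrics();
metrics.gauge("queue.depth", 5);  // five items enqueued
metrics.gauge("queue.depth", -2); // two items dequeued
const depth = metrics.read("queue.depth"); // 3: the sum of the deltas
```

A call site that wants to report an absolute reading would have to track the previous value itself and emit the difference, which is the gap an ObservableGauge-based v2 interface would close.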

Negative:

  • Sentry-native error UX is slightly degraded (errors arrive as OTel log records instead of native Sentry events). Acceptable per vendor-neutrality goal.
  • Breadcrumb semantics shift from buffered cross-span to per-span events. Acceptable.
  • Browser is still Sentry-direct — observability stack is asymmetric server vs. browser until a future browser migration.
  • OTel SDK adds dependency surface (~12 new packages in core-shared).

Relationship to ADR-014

ADR-014's interface decisions (R31–R51) remain authoritative. This ADR supersedes only the implementation section (Sentry SDK direct calls → OTel SDK). ADR-014 keeps a "Status: Superseded for impl by ADR-017" header.