telemetry/docs/adr/ADR-002-three-signal-otlp-bootstrap.md
Rene Nochebuena ed4e9ef161 feat(telemetry): initial stable release v0.9.0
Single-call OTel SDK bootstrap setting all three global providers (traces → Tempo, metrics → Mimir, logs → Loki) over OTLP gRPC.

What's included:
- New(ctx, Config): bootstraps TracerProvider, MeterProvider, and LoggerProvider with OTLP gRPC exporters; sets OTel globals
- W3C TraceContext + Baggage propagation set globally
- Resource tagging: service.name, service.version, deployment.environment merged with SDK defaults
- OTLPInsecure bool for development environments without TLS
- Sequential rollback on partial initialization failure — no dangling exporters on error
- Returns shutdown func(context.Context) error; caller defers in main or wires into launcher BeforeStop
- Tier 5 module: must be imported only by application main packages; zero micro-lib dependencies

Tested-via: todo-api POC integration
Reviewed-against: docs/adr/
2026-03-18 14:13:29 -06:00


ADR-002: Three-Signal OTLP gRPC Bootstrap

Status: Accepted
Date: 2026-03-18

Context

OpenTelemetry defines three observability signals:

  • Traces — distributed trace spans (latency, call graphs)
  • Metrics — counters, gauges, histograms
  • Logs — structured log records correlated with trace context

The target observability stack is the Grafana LGTM stack: Loki (logs), Grafana (dashboards), Tempo (traces), Mimir (metrics), fronted by Grafana Alloy as the OTLP collector/router.

The question is what to bootstrap and how to transport signals to the collector. Options include:

  • Bootstrap only traces (the most common starting point), add others later.
  • Bootstrap all three signals in one call, using a shared OTLP gRPC endpoint.
  • Use per-signal configuration with separate endpoints.

Decision

telemetry.New(ctx, cfg) bootstraps all three signals in a single call using a shared OTLP gRPC endpoint (cfg.OTLPEndpoint, e.g. "alloy:4317"):

  1. TracerProvider: sdktrace.NewTracerProvider with an OTLP gRPC batch exporter; W3C TraceContext + Baggage propagation set globally via otel.SetTextMapPropagator.
  2. MeterProvider: sdkmetric.NewMeterProvider with an OTLP gRPC periodic reader.
  3. LoggerProvider: sdklog.NewLoggerProvider with an OTLP gRPC batch processor.

All three providers share one *resource.Resource built from cfg.ServiceName, cfg.ServiceVersion, and cfg.Environment. That resource is merged with the OTel default resource, which contributes service.instance.id and SDK metadata.
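As a rough illustration of how the shared resource could be assembled, the sketch below uses only the standard library. The Config fields are the ones this ADR names, but their types are assumed. The helper names (serviceAttrs, merge, sharedResource) and the default values are hypothetical stand-ins for the SDK's resource handling, not the module's actual code.

```go
package main

import "fmt"

// Config lists only the fields this ADR names. Field types are assumed.
type Config struct {
	ServiceName    string
	ServiceVersion string
	Environment    string
	OTLPEndpoint   string // shared by all three signals, e.g. "alloy:4317"
	OTLPInsecure   bool   // disables TLS; intended for local development
}

// serviceAttrs maps Config onto the resource attribute keys used for tagging.
func serviceAttrs(cfg Config) map[string]string {
	return map[string]string{
		"service.name":           cfg.ServiceName,
		"service.version":        cfg.ServiceVersion,
		"deployment.environment": cfg.Environment,
	}
}

// merge overlays override on base; on a key collision the override wins.
func merge(base, override map[string]string) map[string]string {
	out := make(map[string]string, len(base)+len(override))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range override {
		out[k] = v
	}
	return out
}

// sharedResource builds the one attribute set all three providers would share.
func sharedResource(cfg Config) map[string]string {
	// Illustrative defaults only; the real default resource comes from the SDK.
	defaults := map[string]string{
		"service.name":           "unknown_service",
		"telemetry.sdk.language": "go",
	}
	return merge(defaults, serviceAttrs(cfg))
}

func main() {
	res := sharedResource(Config{
		ServiceName:    "todo-api",
		ServiceVersion: "1.0.0",
		Environment:    "dev",
		OTLPEndpoint:   "alloy:4317",
		OTLPInsecure:   true,
	})
	fmt.Println(res["service.name"], res["deployment.environment"], res["telemetry.sdk.language"])
}
```

Building the attribute set once and handing the same value to every provider is what keeps traces, metrics, and logs attributable to the same service identity in the backend.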

Error handling during bootstrap is sequential and rolls back already-created providers: if metric exporter creation fails, the trace provider is shut down before returning the error; if log exporter creation fails, both trace and metric providers are shut down.
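The rollback behaviour can be sketched with the standard library alone. In the sketch below, step, bootstrap, and simulate are hypothetical names, and each step stands in for "create the exporter and provider for one signal". Shutting down in newest-first order is an assumption of this sketch; the ADR only states that already-created providers are shut down before the error is returned.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// shutdownFunc mirrors the shape of a provider's Shutdown method.
type shutdownFunc func(context.Context) error

// step is a hypothetical stand-in for initializing one signal.
type step struct {
	name string
	init func(context.Context) (shutdownFunc, error)
}

// bootstrap runs the steps in order. If one fails, every provider started so
// far is shut down (newest first, an assumption) before the error is returned.
func bootstrap(ctx context.Context, steps []step) ([]shutdownFunc, error) {
	var started []shutdownFunc
	for _, s := range steps {
		stop, err := s.init(ctx)
		if err != nil {
			errs := []error{fmt.Errorf("init %s: %w", s.name, err)}
			for i := len(started) - 1; i >= 0; i-- {
				errs = append(errs, started[i](ctx)) // nil results are dropped by errors.Join
			}
			return nil, errors.Join(errs...)
		}
		started = append(started, stop)
	}
	return started, nil
}

// simulate bootstraps three fake signals, failing the one named failAt,
// and returns the ordered log of start and stop events.
func simulate(failAt string) ([]string, error) {
	var events []string
	var steps []step
	for _, name := range []string{"traces", "metrics", "logs"} {
		name := name // per-iteration copy for toolchains before Go 1.22
		steps = append(steps, step{
			name: name,
			init: func(context.Context) (shutdownFunc, error) {
				if name == failAt {
					return nil, errors.New("exporter unavailable")
				}
				events = append(events, "start "+name)
				return func(context.Context) error {
					events = append(events, "stop "+name)
					return nil
				}, nil
			},
		})
	}
	_, err := bootstrap(context.Background(), steps)
	return events, err
}

func main() {
	events, err := simulate("logs")
	fmt.Println(events)
	fmt.Println(err)
}
```

With the log step failing, the event log shows traces and metrics being started and then stopped again, so no exporter is left running when the error reaches the caller.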

The returned shutdown function joins the shutdown of all three providers with errors.Join, so a single defer shutdown(ctx) flushes and closes all exporters.
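A minimal sketch of such a combined shutdown function follows. The shutdowner interface, joinShutdown, fakeProvider, and demo are hypothetical names introduced for illustration; the only assumption about the SDK is that each provider exposes a Shutdown(ctx) error method.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// shutdowner is the only part of a provider the combined shutdown needs.
type shutdowner interface {
	Shutdown(context.Context) error
}

// joinShutdown returns one function that shuts down every provider and
// reports all failures together. A failure in one provider does not stop
// the others from being shut down.
func joinShutdown(providers ...shutdowner) func(context.Context) error {
	return func(ctx context.Context) error {
		errs := make([]error, 0, len(providers))
		for _, p := range providers {
			errs = append(errs, p.Shutdown(ctx))
		}
		return errors.Join(errs...) // nil when every Shutdown returned nil
	}
}

// fakeProvider is a hypothetical stand-in for an SDK provider.
type fakeProvider struct {
	name   string
	fail   bool
	closed bool
}

func (f *fakeProvider) Shutdown(context.Context) error {
	f.closed = true
	if f.fail {
		return fmt.Errorf("%s: flush failed", f.name)
	}
	return nil
}

// demo shuts down three fake providers and reports how many were closed.
// The metrics provider fails when failMetrics is true.
func demo(failMetrics bool) (closed int, err error) {
	ps := []*fakeProvider{
		{name: "traces"},
		{name: "metrics", fail: failMetrics},
		{name: "logs"},
	}
	shutdown := joinShutdown(ps[0], ps[1], ps[2])
	err = shutdown(context.Background())
	for _, p := range ps {
		if p.closed {
			closed++
		}
	}
	return closed, err
}

func main() {
	closed, err := demo(true)
	fmt.Println(closed, err)
}
```

Because errors.Join discards nil values and returns nil when nothing failed, the caller sees a single error value that still unwraps to each individual failure via errors.Is or errors.As.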

Consequences

  • One Config struct covers all three signals. Per-signal endpoint overrides are not supported in the current design. If per-signal routing is needed, Grafana Alloy handles that at the collector level.
  • OTLPInsecure: true disables TLS on all three signal connections simultaneously. This is the expected setting for local development (Alloy runs on localhost or in the same Docker network).
  • Failing to initialize any one of the three exporters aborts the entire bootstrap. A partially initialized telemetry state (e.g., traces but no metrics) is considered more dangerous than failing fast.
  • The W3C TraceContext and Baggage propagators are set globally. Applications that need custom propagators (e.g., B3) must call otel.SetTextMapPropagator after telemetry.New to override them.
  • All three providers use batch/periodic export. Synchronous export is not available through this bootstrap path.