httpclient/docs/adr/ADR-001-circuit-breaker-and-retry.md
Rene Nochebuena 6026ab8a5e feat(httpclient): initial stable release v0.9.0
Resilient HTTP client with circuit breaking, exponential-backoff retry, X-Request-ID propagation, and a generic typed JSON helper.

What's included:
- Client interface with Do(req) method; New(logger, cfg) and NewWithDefaults(logger) constructors
- Config struct with env-tag support for timeout, dial timeout, retry, and circuit breaker parameters
- Retry via avast/retry-go/v4 with BackOffDelay; triggers only on network errors and HTTP 5xx
- Circuit breaker via sony/gobreaker wrapping the full retry loop; open circuit → xerrors.ErrUnavailable
- X-Request-ID header propagated automatically from context via logz.GetRequestID on every attempt
- DoJSON[T](ctx, client, req) generic helper for typed JSON request/response with xerrors error mapping
- MapStatusToError(code, msg) exported function mapping HTTP status codes to xerrors types

Tested-via: todo-api POC integration
Reviewed-against: docs/adr/
2026-03-19 13:04:37 +00:00

ADR-001: Circuit Breaker and Retry via gobreaker and avast/retry-go

Status: Accepted
Date: 2026-03-18

Context

Outbound HTTP calls to external services are subject to transient failures (network blips, brief service restarts) and sustained failures (outages, overloads). Two complementary strategies address these cases:

  • Retry recovers from transient failures by re-attempting the request a limited number of times before giving up.
  • Circuit breaking detects sustained failure patterns and stops sending requests to a failing service, giving it time to recover and preventing the caller from accumulating blocked goroutines.

Implementing both from scratch introduces risk of subtle bugs (backoff arithmetic, state machine transitions). Well-tested, widely adopted libraries are preferable.

Decision

Two external libraries are composed:

Retry: github.com/avast/retry-go/v4

  • Configured via Config.MaxRetries and Config.RetryDelay.
  • Uses retry.BackOffDelay (exponential backoff) to avoid hammering a failing service.
  • retry.LastErrorOnly(true) ensures only the final error from the retry loop is reported.
  • Only network errors and HTTP 5xx responses trigger a retry. 4xx responses are not retried (they represent caller errors, not server instability).

Circuit breaker: github.com/sony/gobreaker

  • Configured via Config.CBThreshold (consecutive failures to trip) and Config.CBTimeout (time in open state before transitioning to half-open).
  • The retry loop runs inside the circuit breaker's Execute call. From the circuit breaker's perspective, a full retry sequence is a single request, and it is recorded as a failure only if every attempt fails.
  • While the circuit is open, Do returns xerrors.ErrUnavailable immediately, without attempting the network call.
  • State changes are logged via the duck-typed Logger interface.

The nesting order (circuit breaker wraps retry) is intentional: the circuit breaker accumulates failures at the level of "did the request ultimately succeed after retries", not at the level of individual attempts.

Consequences

Positive:

  • Transient failures are handled by the client, transparently to the caller.
  • Sustained outages trip the circuit after Config.CBThreshold consecutive failed requests, after which calls fail fast with xerrors.ErrUnavailable.
  • Configuration is explicit and environment-variable driven.
  • Circuit state changes are observable via logs.

Negative:

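  • Retry with backoff increases total latency for failing requests. Assuming the first wait is RetryDelay and each later wait doubles, the waits between attempts sum to at most RetryDelay * (2^MaxRetries - 1), plus the time spent in each attempt itself.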
  • The circuit breaker counts only consecutive failures (ConsecutiveFailures >= CBThreshold), not a rolling failure rate. Interleaved successes reset the counter.
  • gobreaker.ErrOpenState is wrapped in xerrors.ErrUnavailable, so callers must check for this specific code to distinguish circuit-open from normal 503 responses.