• v0.9.0 e1b6b7ddd7

    Rene Nochebuena released this 2026-03-18 14:07:12 -06:00 | 0 commits to main since this release

    v0.9.0

    code.nochebuena.dev/go/health

    Overview

    health provides a single http.Handler that interrogates any number of registered infrastructure components concurrently and returns a structured JSON response with per-component status and an overall service status. It is designed to be mounted at /health and consumed by load balancers, container orchestrators, and uptime monitors. The two-level criticality model (LevelCritical / LevelDegraded) allows a service to report partial availability — degraded but still serving — rather than forcing a binary UP/DOWN distinction.

    What's Included

    • Level type — int representing component criticality; LevelCritical (value 0, default) and LevelDegraded (value 1)
    • Checkable interface — HealthCheck(ctx context.Context) error, Name() string, Priority() Level
    • Logger interface — duck-typed minimal logger satisfied by logz.Logger; defined locally so logz is not imported
    • ComponentStatus struct — JSON-serialisable per-component result with status, latency, and error fields
    • Response struct — JSON-serialisable overall response with status and components map
    • NewHandler(logger Logger, checks ...Checkable) http.Handler — constructs the health handler; runs all checks concurrently with a 5 s timeout derived from the request context

    Installation

    require code.nochebuena.dev/go/health v0.9.0
    

    Design Highlights

    • All registered checks run in parallel goroutines; results are collected via a buffered channel sized to the number of checks, preventing goroutine leaks if the handler returns early (see docs/adr/ADR-001-parallel-checks.md).
    • The two-level criticality model means a LevelDegraded component failure produces HTTP 200 with "status":"DEGRADED", while a LevelCritical failure produces HTTP 503 with "status":"DOWN", giving orchestrators and load balancers a clean binary signal while still surfacing partial degradation to monitoring (see docs/adr/ADR-002-two-level-criticality.md).
    • Checkable is defined in this package and implemented by infrastructure components — the dependency flows one way: infra → health (see docs/adr/ADR-003-checkable-interface.md).
    • The Logger interface is declared locally as a duck-typed subset of logz.Logger, so health has no micro-lib imports and remains a pure stdlib package.

    Known Limitations & Edge Cases

    • The 5 s check timeout is hardcoded as context.WithTimeout(r.Context(), 5*time.Second). It is not configurable via NewHandler options or per-check. A check that consistently takes close to 5 s will produce noisy latency values in the response.
    • Check results are never cached. Every HTTP request to the health endpoint triggers a live round of all checks. High-frequency polling (e.g. load balancer health probes every 1 s) will produce one full set of DB/network round-trips per probe.
    • If the request context is cancelled before the 5 s deadline (e.g. the client disconnects), checks that have not yet completed will see a cancelled context, but goroutines already dispatched will run to their natural completion or until ctx.Done() fires — there is no forced goroutine cancellation.
    • NewHandler called with a nil logger will panic on the first request when it calls logger.WithContext. Callers must supply a non-nil logger.
    • The zero value of Level is LevelCritical. Forgetting to set Priority() on a Checkable implementation defaults to critical — a silent misconfiguration for components intended to be degraded-only.
    • There is no distinction in the JSON response between a check that timed out and one that returned an explicit error — both are represented as a non-empty error field in ComponentStatus.

    v0.9.0 → v1.0.0 Roadmap

    • Make the check timeout configurable via a NewHandler option (e.g. Options{CheckTimeout time.Duration}), with 5 s as the default.
    • Add optional result caching (TTL-based) to prevent every load-balancer probe from generating live infrastructure round-trips.
    • Distinguish timeout errors from explicit check errors in the ComponentStatus JSON, so monitoring dashboards can differentiate slow checks from failing ones.
    • Document explicitly that the check set is fixed at construction time: the buffered-channel goroutine-leak guarantee relies on a static check count, and checks cannot be added or removed after NewHandler returns. This is intentional and correct, but currently undocumented.
    • Validate the parallel check path in production under concurrent load-balancer probe traffic.

    v0.9.0 rationale: The API is stable and intentional — designed through multiple architecture reviews and tested end-to-end via the todo-api POC (SQLite, RBAC, middleware stack, HTTP handlers). The module is not yet battle-tested in production for all edge cases, and the pre-1.0 designation preserves the option for minor API refinements based on real-world use.

    Downloads