docs(worker): correct tier from 2 to 3 and fix dependency tier refs
worker depends on launcher (now correctly Tier 2) and logz (Tier 1), placing it at Tier 3. The previous docs cited launcher as Tier 1 and logz as Tier 0, both of which were wrong.
46
docs/adr/ADR-002-per-task-timeout.md
Normal file
@@ -0,0 +1,46 @@
# ADR-002: Per-Task Timeout via Child Context

**Status:** Accepted
**Date:** 2026-03-18

## Context

Worker tasks can call external services, run database queries, or perform other
operations with unpredictable latency. A single slow or hung task that occupies a
goroutine indefinitely degrades overall pool throughput. Without a bounded
execution time, one bad task can block a worker slot for the lifetime of the
process.

At the same time, a blanket timeout should not be imposed when callers have not
requested one: a zero timeout, meaning no deadline (for example, for polling or
batch jobs), is a legitimate use case.

## Decision

`Config` exposes a `TaskTimeout time.Duration` field (env `WORKER_TASK_TIMEOUT`,
default `0s`). Each worker goroutine checks this value before invoking a task:

- If `TaskTimeout > 0`, a child context is created with
  `context.WithTimeout(ctx, w.cfg.TaskTimeout)`, and its `cancel` function is
  invoked once the task call returns.
- If `TaskTimeout == 0`, the pool root context is passed through unchanged and a
  no-op cancel function is used.

The task receives the (possibly deadline-bearing) context as its only `context.Context`
argument. It is the task's responsibility to respect cancellation; the pool does not
forcibly terminate goroutines.

`cancel()` is called immediately after the task returns, regardless of whether the
task succeeded or failed, to release the timer resource promptly.

## Consequences

- Tasks that respect `ctx.Done()` or pass `ctx` to downstream calls are automatically
  bounded by `TaskTimeout`.
- Tasks that ignore their context will not be forcibly killed; the timeout becomes a
  best-effort signal only. This is a deliberate trade-off: Go provides no way to
  terminate a goroutine from outside.
- Setting `TaskTimeout = 0` is a safe default: no deadline is added, and no timer
  resource is allocated per task.
- `TaskTimeout` is independent of `ShutdownTimeout`. A task may have a 5-second
  execution timeout while the pool allows 30 seconds to drain during shutdown.
- The timeout context is a child of the pool root context, so cancelling the pool
  (via `OnStop`) also cancels any running task context, regardless of `TaskTimeout`.
