This document logs significant architectural decisions for the lifecycle project.
- Status: Accepted
- Context: A common problem in Go/Docker environments is "Zombie Processes" — child processes that outlive their parents because the parent crashed or failed to signal them. This leads to resource leaks and operational headaches.
- Decision: `lifecycle` delegates low-level process guarantees to the `procio` library. We use platform-specific mechanisms (Linux PDeathSig, Windows Job Objects) to guarantee that if the parent dies, the children die.
- Consequences: This behavior is enabled by default in `pkg/supervisor` (via `procio/proc`). It is effectively non-negotiable for the library's identity.
- Status: Accepted
- Context: Should the library automatically handle `SIGINT` (Ctrl+C) and `SIGTERM`?
- Decision: Yes, by default (Imperial Default).
- Rationale:
  - Safety: Prevents beginners from creating unkillable processes.
  - Standards: `SIGTERM` compliance is mandatory for Kubernetes/Docker.
  - Expectation: For most Services and CLIs, `SIGINT` means "Stop", not "Clear line".
- Exception: Interactive Shells/REPLs. In these specific cases, developers MUST explicitly disable global handling (`signal.WithForceExit(0)`) and handle signals locally to avoid killing the session on `Ctrl+C`.
- Status: Accepted
- Context: Goroutine leaks occur when developers forget to `Wait()` on a `WaitGroup` or fail to propagate cancellation.
- Decision: `lifecycle.Go(ctx, fn)` automatically tracks goroutines. `lifecycle.Run` waits for all tracked goroutines to finish before returning.
- Implementation Note: Since ADR-0006, this is powered by context value discovery, ensuring it works even when the context is wrapped by telemetry/middle-tier providers.
- Consequences: Zero configuration required for safe concurrency.
- Status: Accepted
- Context: As the library evolves from "Death Management" to "Lifecycle Management", we need to handle non-terminal events (Reload, Suspend).
- Decision: Adopt an Event-Driven Architecture. Decouple Sources (Signals, Webhooks, Tickers) from Handlers via a standardized `Router`.
- Consequences: Allows for infinite extensibility without polluting the core `Run` loop.
- Note: Originally planned for a "v2.0" major version, this was released as v1.5 to avoid `go.mod` migration overhead. See MIGRATION.md for breaking changes.
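
The decoupling can be pictured with a deliberately small router. Every type and method name below (`Event`, `Handler`, `Router`, `Handle`, `Dispatch`) is hypothetical and chosen for illustration. The sketch is synchronous and not safe for concurrent registration.

```go
package main

import "fmt"

// Event is a hypothetical envelope emitted by a source.
type Event struct {
	Topic   string
	Payload string
}

// Handler reacts to an event.
type Handler func(Event)

// Router maps topics to handlers. Sources only ever call Dispatch.
type Router struct {
	handlers map[string][]Handler
}

func NewRouter() *Router { return &Router{handlers: make(map[string][]Handler)} }

// Handle registers h for a topic.
func (r *Router) Handle(topic string, h Handler) {
	r.handlers[topic] = append(r.handlers[topic], h)
}

// Dispatch delivers e to every handler of its topic and reports how many ran.
func (r *Router) Dispatch(e Event) int {
	hs := r.handlers[e.Topic]
	for _, h := range hs {
		h(e)
	}
	return len(hs)
}

func main() {
	r := NewRouter()
	r.Handle("reload", func(e Event) { fmt.Println("reloading, triggered by", e.Payload) })
	// A signal source, a webhook source, or a ticker would all funnel into Dispatch.
	r.Dispatch(Event{Topic: "reload", Payload: "SIGHUP"})
}
```

Adding a new source or a new non-terminal event then means calling `Dispatch` or `Handle`, without touching the loop that owns shutdown.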
- Status: Accepted
- Context: Manual locking in workers created risks of double unlocks and deadlocks, and led to repetitive code.
- Decision: Standardize the use of the `withLock` and `withLockResult` helpers for all concurrent state manipulation in workers.
- Exception: Methods that already perform locking internally (e.g., `ExportState`) should not be wrapped by these helpers.
- Consequences: Safer, more readable, and easier-to-maintain code. Reduction of concurrency bugs.
- Reference: Details and examples in TECHNICAL.md.
- Status: Accepted
- Context: Setting up a robust interactive CLI (Standard signals + detached Stdin reader + common commands) requires significant boilerplate (~50 lines of wiring).
- Decision: Provide a `NewInteractiveRouter` preset that encapsulates standard source wiring (OS Signals, Input) and standard command routing (q/quit/suspend/resume).
- Rationale: Drastically improves Developer Experience (DX) and ensures consistency across tools in the ecosystem without sacrificing flexibility (configurable via options).
- Status: Accepted
- Context: Application contexts are often wrapped by middle-tier providers (e.g., Task Tracking, Tracing). Simple type assertions to `*signal.Context` fail in these scenarios, breaking core library features like `OnShutdown`.
- Decision: Implement a Value-Based Discovery Path. Use a private context key to store and retrieve the `signal.Context` pointer. Provide a robust `FromContext(ctx)` helper that handles both direct pointers and wrapped values.
- Consequences: Ensures library resilience when integrated with other heavy-weight frameworks or complex diagnostic wrappers.
- Status: Accepted
- Context: Introspection (Diagrams, Metrics, Logs) needs consistent keys (e.g., `restarts`, `circuit_breaker`) to provide a unified "Single Pane of Glass" view. Hardcoded strings across packages lead to drift and broken diagrams.
- Decision: Standardize metadata keys as typed constants in `pkg/worker`. All components (Supervisor, Diagram Engine, Metrics) must use these constants instead of literal strings.
- Consequences: Centralizes the introspection "schema", making it trivial to update the visual representation across all interfaces.
- Status: Accepted
- Context: Handlers and Jobs often need to trigger the same graceful termination sequence as an OS Signal (e.g., a "quit" command in a REPL).
- Decision: Provide an explicit `lifecycle.Shutdown(ctx)` facade.
- Rationale: This abstracts the complex context discovery and cancellation logic, providing a high-level API for internal application control that mirrors external signals.
- Status: Accepted
- Context: Complex state transitions (like `Suspend`) often involve multiple actors: workers pausing, state being persisted, and UIs reporting progress.
- Decision: `SuspendHandler` (and related control plane actors) must execute hooks Sequentially and in FIFO order.
- Rationale: This enables a "Final State" reporting pattern. By registering functional components (supervisors, workers) before UI reporting hooks, we guarantee that UI messages like "SYSTEM SUSPENDED" only appear after the heavy components have successfully blocked and confirmed their state.
- Consequences: Developers must be mindful of registration order for UI accuracy. Functional work comes first; reporting comes last.
- Status: Completed (2026-02-13) via `github.com/aretw0/procio`
- Context: The `lifecycle` library evolved into a comprehensive control plane, but its core primitives (Process hygiene, I/O) are valuable optimization layers for any Go program.
- Decision: We extracted `proc`, `termio`, and `scan` into `procio` (Process I/O), a standalone library with zero dependencies. `lifecycle` now consumes `procio` to provide its high-level guarantees.
- Rationale:
  - Adoption: `procio` solves universal Go problems (Zombie processes, Windows Stdin) without the framework weight of `lifecycle`.
  - Separation of Concerns: `procio` handles "OS Mechanics"; `lifecycle` handles "Application Policies".
- Consequences: `pkg/core/proc` and `pkg/core/termio` logic now lives in `procio`. `lifecycle` acts as the policy engine driving these primitives.
- Status: Completed (2026-02-15) via `github.com/aretw0/introspection`
- Context: The `lifecycle` library provides runtime introspection via `State()` methods and visualizes topology using Mermaid diagrams. Originally, each package (`signal`, `worker`, `supervisor`) contained custom Mermaid string concatenation logic, leading to redundancy, rigidity, and increased testing burden.
- Decision: We extracted generic diagram rendering primitives into `introspection`, a standalone library. `lifecycle` now provides domain-specific styling logic (`NodeStyler`, `PrimaryStyler`) and delegates structural rendering (Mermaid syntax, graph traversal) to `introspection`.
- Rationale:
  - DRY Principle: Rendering logic is centralized, not duplicated across multiple packages.
  - Reusability: Other projects (e.g., `trellis`, `arbour`) can use `introspection` for their own topologies.
  - Separation of Concerns: `introspection` handles generic graph rendering; `lifecycle` handles domain semantics (status colors, labels).
  - Maintainability: Visual improvements or Mermaid syntax changes happen in one place.
- Consequences:
  - Removed `pkg/core/introspection` package (~1500 lines).
  - Introduced `diagram_config.go` (centralized configuration adapter).
  - Simplified `signal/diagram.go` and `worker/diagram.go` by removing manual fragment rendering functions.
  - `lifecycle` now depends on `github.com/aretw0/introspection` v0.1.2+.
- Status: Accepted (v1.7.0)
- Context: Sources like `FileWatchSource` needed to support features like "Debouncing", "Project Awareness" (ignoring `.git`), and "Synchronous Data Extraction" (Pushing to Go channels instead of relying purely on Router callbacks).
- Decision: We keep `Sources` structurally dumb and generic, pushing business logic (filtering, debouncing) into the Control Plane via `Options`, `Middleware`, and `Bridges`.
- Rationale:
  - Composability: A `DebounceHandler` can be used to throttle any rapid event (like `WebhookSource` bursts), not just file events. If we baked debouncing into `FileWatchSource`, we'd have to rewrite it for everything else.
  - Idiomatic Go: Instead of forcing applications to invert their control flow (callbacks only), `events.Notify(ch)` acts as a bridge, allowing consumers to use traditional `select` or `for range` loops over standard `channels` when dealing with the lifecycle router.
- Consequences:
  - Users are responsible for "snapping together" pieces (e.g., combining `WithFilter` and `DebounceHandler`).
  - `lifecycle` remains a toolkit of orthogonal primitives rather than a rigid framework.
- Status: Proposed (Target: v1.8+)
- Context: The `lifecycle` Event Router currently handles transient signals and memory-based callbacks. To serve as a robust Event Broker for ecosystem projects (like `loam` or `trellis`), it must support events that survive reboots.
- Decision: Add extension points to the `Router` to support "Durable Sinks" without polluting the core API. The engine remains simple but allows state resumption from a persisted event stream.
- Rationale: Inspired by distributed workflow engines, this allows temporal decoupling.
- Consequences: Enables `lifecycle` to back orchestrators that require pause/resume semantics for long-running processes over days/weeks.
- Status: Proposed (Target: v1.8+)
- Context: Scaling workers currently treats all workers equally. In distributed environments, leader election or targeted scale-down requires identifying workers by their function.
- Decision: Incorporate a concept of "Roles" into the `Supervisor`, e.g., `supervisor.AddWorker(w, role="background-sync")`.
- Rationale: Required for declarative stability and leader election, allowing nodes to dynamically enable/disable specific roles based on cluster consensus.
- Consequences: `lifecycle` steps closer to being a distributed control plane primitive rather than just a local process manager.
Date: 2026-02-22
- Status: Accepted
- Context: Reviewing highly robust automation drivers like `go-rod/rod` reinforces the importance of "Chained Contexts" (contexts that intrinsically carry their own cancellation timeout/deadline specific to an action, without bleeding into parent lifecycles) and rigorous "Zombie Process Prevention" (cleaning up browser instances).
- Decision:
  - Strict Context Propagation: `lifecycle` will enforce that all long-running or external processes MUST accept a context derived from the specific Action/Worker, rather than relying solely on the global `Router` context.
  - Deep Process Hygiene (`procio` validation): We reaffirm ADR-0001, but extend it: `procio` integration MUST be routinely audited against "browser-like" or "daemon-like" child processes that actively try to detach. We will use `lifecycle` as the control plane to ensure even detached children created by WebDrivers or sub-shells are aggressively reaped upon lifecycle termination.
- Rationale: Validating our architecture against community standards (like `go-rod`'s process management) proves our abstraction (`procio` + `lifecycle`) is correct, but requires explicit documentation that context chaining is the preferred pattern for fine-grained timeout control inside workers.
- Consequences:
  - No breaking changes. This serves as an architectural reinforcement.
  - Future enhancements to `lifecycle.Go` or `procio` process execution may introduce explicit nested timeout helpers similar to `rod`'s `Timeout()` wrappers.
Date: 2026-02-26
- Status: Accepted
- Context:
prociointroduced I/O-specific observer hooks (OnIOError,OnScanError). Adding these to the baselifecycle.Observerinterface would force many users to implement methods they don't need (Process-only events) or break compatibility with existing implementations. - Decision: Implement an Optional Interface Discovery pattern via an internal
ProcioDiscoveryBridge. - Rationale:
- Zero Bloat: The base
lifecycle.Observerremains focused on lifecycle events (Logs, Panics, Process Start/Fail). - Structural Typing: We use anonymous structural interfaces within the bridge to "discover" if a user-provided observer has the required I/O methods.
- Ergonomics: If a user wants I/O events, they simply add the methods to their struct.
lifecycledetects them and connects the plumbing automatically.
- Zero Bloat: The base
- Consequences:
- Maintains Interface Segregation Principle (ISP).
- Eliminates semantic coupling between
lifecyclecontracts andprocio's low-level I/O mechanics. - Allows
lifecycleto act as a transparent proxy forprociowithout increasing the surface area of the core API.