ADR-010: Use kardianos/service v1.2.4 for Windows SCM integration¶
Status: accepted Supersedes: ADR-007
Context¶
ADR-007 chose golang.org/x/sys/windows/svc and svc/mgr directly for Windows
Service Control Manager (SCM) integration on the premise that kardianos/service is a
thin wrapper that adds no functionality. Phase 5 research (2026-03-03) revised this
assessment after detailed examination of the library internals and the Windows SCM API
edge cases.
The research identified three significant gaps when using x/sys directly versus
kardianos/service v1.2.4:
-
SetRecoveryActionsOnNonCrashFailures—kardianos/servicecalls this automatically whenOnFailureis configured. Without it, SCM only restarts on crashes (unexpected termination), not on clean-exit errors likeos.Exit(1). A Go process that exits with an error viaos.Exit(1)reportsSERVICE_STOPPEDand does not trigger recovery unless this flag is set. -
DelayedAutoStart—kardianos/serviceexposes this as aKeyValueoption inservice.Config. Without delayed start, Go services with large init chains (net/http, TLS, Prometheus) can exceed the SCM 30-second startup timeout during boot under resource contention (documented in golang/go #23479). -
Edge cases in
Install()/Uninstall()—kardianos/servicehandles SCM error codes for "service already exists", "service marked for deletion", and "insufficient privilege" gracefully. Hand-written code usingsvc/mgrdirectly must replicate this error handling.
The concern in ADR-007 about adding an unnecessary dependency was valid at the time,
but kardianos/service is now at v1.2.4 (July 2025) — actively maintained, used by
production projects (k0s, Datadog agent forks) — and the implementation savings
outweigh the dependency cost.
Decision¶
Use github.com/kardianos/service v1.2.4 for Windows SCM integration.
Key configuration (in service.Config.Option KeyValue):
service.KeyValue{
"StartType": "automatic",
"DelayedAutoStart": true, // Avoids SCM 30s boot timeout
"OnFailure": "restart",
"OnFailureDelayDuration": "5s",
"OnFailureResetPeriod": 86400, // Reset failure count after 24h
}
The runWithServiceManager function in cmd/cee-exporter/service_windows.go handles:
installsubcommand →s.Install()uninstallsubcommand →s.Uninstall()- SCM context →
s.Run()(callsStart/Stopmethods)
The Stop() method cancels a context.Context passed into run() — this requires
run() to accept a context.Context argument (minor refactor from the Phase 4 stub).
Consequences¶
github.com/kardianos/service v1.2.4is added togo.mod. The library is pure Go, CGO-free, and has no transitive dependencies beyondgolang.org/x/sys(already present).SetRecoveryActionsOnNonCrashFailures(true)is set automatically — service restarts on any failure including cleanos.Exit(1)terminations.DelayedAutoStart: trueprevents Event ID 7009 (SCM startup timeout) during boot.- The
installsubcommand must strip itself fromArgumentsstored in the SCM registry to avoid an infinite install loop on boot. - Service installation requires Administrator privileges; this must be documented and detected at install time with a helpful error message.
- The
service_notwindows.goshim is unchanged — no impact on Linux or Docker builds.