ADR-061: Health checks as a pure pipeline module, not a blocking pipeline stage
Date: 2026-02-22
Status: Accepted
Context
Phase 21 adds a /concerns page that surfaces data quality flags, sizing risks,
and VMware best practice violations. Two architectural approaches were considered:
- Blocking pipeline stage: run health checks automatically during ingest_file() or classify_dataframe() and store the findings in the session.
- On-demand page-level computation: call run_health_checks(df) on every /concerns page visit, starting from load_session_data().
Decision
Implement health checks as a pure pipeline module (pipeline/health_checks.py)
called on-demand from the /concerns page, never as an automatic pipeline step.
# concerns.py — every page visit
df = load_session_data()
result = run_health_checks(df)
The returned HealthCheckResult is held only in a local variable. It is never
written to app.storage.tab.
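As a rough illustration of what "pure pipeline module" could mean here, the sketch below shows one possible shape for pipeline/health_checks.py. Only run_health_checks, HealthCheckResult, and the workload_category column come from this ADR. The Finding class, the findings field, the check name, and the 0.25 threshold are assumptions made for the example, not the actual implementation.

```python
from dataclasses import dataclass

import pandas as pd

# Hypothetical threshold; this ADR does not state the real value.
UNKNOWN_RATIO_THRESHOLD = 0.25


@dataclass(frozen=True)
class Finding:
    # Hypothetical shape of a single finding.
    check: str
    message: str


@dataclass(frozen=True)
class HealthCheckResult:
    # The field name is an assumption; only the class name appears in the ADR.
    findings: tuple[Finding, ...] = ()


def run_health_checks(df: pd.DataFrame) -> HealthCheckResult:
    """Read df and return a new result; no session or UI state is touched."""
    findings = []
    if len(df) > 0:
        # Boolean mask over the current (possibly user-edited) categories.
        unknown_ratio = (df["workload_category"] == "Unknown").mean()
        if unknown_ratio > UNKNOWN_RATIO_THRESHOLD:
            findings.append(
                Finding(
                    check="high_unknown_ratio",
                    message=f"{unknown_ratio:.0%} of VMs are Unknown",
                )
            )
    return HealthCheckResult(findings=tuple(findings))
```

Because the function depends only on its DataFrame argument, it can be called from the page handler on every visit and from a plain unit test with a hand-built DataFrame.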
Rationale
The blocking stage approach would cache findings based on the initial classification. When a user edits a VM's workload category in the Review grid (e.g., changing "Unknown" to "Database"), the cached findings would remain stale until the next upload. The pre-sales workflow depends on findings reflecting the current workload assignments, not the initial classification.
Starting from load_session_data() ensures that the current workload_category
values (including user edits) are used for checks like "high Unknown VM ratio"
and "large Unknown VMs". If findings were computed from an older snapshot, the
"Unknown ratio" check could still fire even after the engineer has classified
every VM.
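The staleness problem can be reproduced in a few lines. The sketch below is illustrative only: unknown_ratio is a hypothetical helper, and the VM names and data are made up. It compares a ratio cached right after classification with one recomputed after the user's edits.

```python
import pandas as pd


def unknown_ratio(df: pd.DataFrame) -> float:
    """Share of rows whose workload_category is "Unknown"."""
    if df.empty:
        return 0.0
    return float((df["workload_category"] == "Unknown").mean())


# State right after classification: two of four VMs are Unknown.
classified = pd.DataFrame(
    {
        "vm_name": ["db01", "db02", "web01", "web02"],
        "workload_category": ["Unknown", "Unknown", "Web", "Web"],
    }
)

# A blocking stage would compute and cache the ratio at this point.
cached_ratio = unknown_ratio(classified)

# The engineer then reclassifies both Unknown VMs in the Review grid.
edited = classified.copy()
edited.loc[edited["vm_name"].isin(["db01", "db02"]), "workload_category"] = "Database"

# On-demand computation sees the edits; the cached value does not.
fresh_ratio = unknown_ratio(edited)
print(f"cached={cached_ratio:.2f} fresh={fresh_ratio:.2f}")
# prints: cached=0.50 fresh=0.00
```

With a cached result, the page would keep reporting a 50% Unknown ratio even though no Unknown VMs remain.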
Consequences
- Positive: Findings always reflect the user's current edited state — no stale cache.
- Positive: No new session storage key required — findings are cheap to recompute (pure pandas comparisons, <10ms for typical 400-VM datasets).
- Positive: The module is fully testable with real DataFrames — no NiceGUI context required.
- Negative: Health checks run on every page visit rather than once per upload. Acceptable given the low cost of pandas boolean masks at typical dataset sizes (under 10 ms, as noted above).
- Constraint: The /concerns page must never call classify_dataframe(), because that would discard user edits. It must always start from load_session_data().
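The recompute cost can be checked on a given machine with a quick measurement such as the one below. This is a sketch under assumptions: sample_checks is a stand-in for run_health_checks, the vcpus column and the "8 or more vCPUs" cutoff are hypothetical, and the 400-row frame is synthetic. Timings vary by hardware, so no expected figure is given.

```python
import timeit

import pandas as pd

# Synthetic 400-row frame; column names and values are illustrative assumptions.
df = pd.DataFrame(
    {
        "workload_category": ["Unknown", "Database", "Web", "App"] * 100,
        "vcpus": [16, 4, 8, 2] * 100,
    }
)


def sample_checks(frame: pd.DataFrame) -> dict:
    """Stand-in for run_health_checks: two boolean-mask checks."""
    unknown = frame["workload_category"] == "Unknown"
    return {
        "unknown_ratio": float(unknown.mean()),
        # Hypothetical cutoff for a "large" VM.
        "large_unknown_vms": int((unknown & (frame["vcpus"] >= 8)).sum()),
    }


# Average over many calls to smooth out timer noise.
runs = 200
seconds_per_call = timeit.timeit(lambda: sample_checks(df), number=runs) / runs
print(f"{seconds_per_call * 1000:.3f} ms per call")
```

Running the same measurement against a real session DataFrame would show whether the per-visit cost stays within the budget stated above as more checks are added.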