Skip to content

Changelog

All notable changes to StorePredict are documented here.

[Unreleased]

[v7.2.0] - 2026-03-15

Changed

  • Compute sizing replaced with PreSizion redirect — the /compute page now links to PreSizion, a dedicated tool with advanced compute, storage, and network sizing. All compute sizing pipeline code, presets CSV, and associated tests have been removed. Session archives silently ignore legacy compute keys on restore. See ADR-076.

[v7.1.5] - 2026-02-26

Fixed

  • Upload endpoint 422 errorRequest was imported inside TYPE_CHECKING but from __future__ import annotations made FastAPI unable to resolve the type at runtime, causing every upload to silently return HTTP 422. Moved to a runtime import.
  • ZIP extraction too strict — only accepted files matching the canonical LiveOptics_*_VMWARE_*.xlsx pattern inside ZIPs. Now falls back to any .xlsx in the archive, supporting RVTools-in-zip and non-standard LiveOptics exports.
  • Silent upload errorsIngestionError during file processing was caught but never logged; added logger.warning so validation failures always appear in server logs. Error notifications now persist (timeout=0) instead of auto-dismissing.
  • Chunk assembly off-by-one guard — added max_end >= total_size check alongside byte-count comparison to handle potential Content-Range total mismatches from Quasar.

[v7.1.4] - 2026-02-26

Fixed

  • Chunked upload for corporate proxies — files are now uploaded in 2 MB chunks via a dedicated /api/upload/{token} endpoint instead of a single large multipart request. This resolves uploads being cut off mid-transfer (~60%) on enterprise networks with proxy timeout limits. A ui.timer polls the per-session queue and triggers the pipeline once all chunks are assembled server-side.

[v7.1.3] - 2026-02-25

Added

  • Open Sans in Excel report — all XlsxWriter cell formats now specify "Open Sans" (bold/header) or "Open Sans Light" (body/numbers), matching PDF report typography.

Fixed

  • PDF chart labels font — bar charts, pie chart, and Sankey diagram now render axis/slice labels in Open Sans via FONT_REGULAR constant imported from the shared _fonts.py module.
  • Rebase artifact — stray commit-message text left in pdf_charts.py line 238 by a rebase conflict caused a SyntaxError at runtime; removed.

[v7.1.2] - 2026-02-25

Added

  • Open Sans fonts bundledOpenSansLight.ttf and OpenSansSemiBold.ttf (OFL) shipped in data/; registered as AppFont/AppFontBd via _register_fonts() with automatic fallback to Vera when fonts are absent (test environments).
  • KPI card strip — totals section replaced by two rows of brand-blue KPI cards (_make_kpi_cards): VMs / CPUs / Memory on row 1, Provisioned / In-Use / Required on row 2. Values use a compact single-unit formatter (_fmt_kpi_storage: "5.2 TiB" instead of "5284.0 GiB (5.2 TiB)") to prevent wrapping at 17 pt.
  • Page-number footer_draw_footer draws a #cccccc rule and a centred grey page number on every page.
  • Section rules — 1.5 pt brand-blue HRFlowable added after every section heading (Totals, Averages, Performance, Breakdown, Health, Charts, Layout, Findings detail).
  • Health table orphan fix — health findings summary block wrapped in KeepTogether so the heading and severity table always land on the same page.
  • Datastore → VM styled header — per-datastore VM lists now use a single Table per datastore: row 0 is the DS name as a light-blue (#d0e8f4) spanning header with brand-blue bold text; rows below are the 3-column VM name grid.

Changed

  • Heading style parent changed from Heading2 to Normal (removes left indent); colour set to _BRAND_BLUE; font switched to AppFontBd (Open Sans SemiBold).
  • Dell logo no longer auto-injected from the bundled asset — logo only appears when the caller explicitly provides bytes.
  • All "Vera" / "VeraBd" literals in pdf_report.py replaced by _FONT_REGULAR / _FONT_BOLD constants.

[v7.1.1] - 2026-02-25

Fixed

  • PDF in container: replaced Plotly + kaleido Sankey with matplotlib Agg backend — kaleido 1.2.0 required a headless browser unavailable in the slim Docker image, causing PDF generation to fail silently. Cubic Bezier sigmoid flow bands rendered via FigureCanvasAgg give the same professional appearance without any system dependencies.

Removed

  • plotly>=5.0 and kaleido>=0.2 dependencies (replaced by matplotlib>=3.8)

[v7.1.0] - 2026-02-25

Changed

  • MkDocs navigation clean-up — individual ADR and Research pages are now declared not_in_nav; mkdocs build is warning-free. Pages are still built and fully reachable via their index tables (adr/index.md, research/index.md). Navigation collapses to index pages only, keeping the sidebar concise.

Internal

  • GSD planning files updated: v5.0 and v7.0 milestone accomplishments filled in, v7.0.x key decisions captured in PROJECT.md, stale Playwright references removed, phase directories 27–28 archived to milestones/v7.0-phases/.

[v7.0.7] - 2026-02-25

Added

  • Automatic dark mode — the app now follows the browser/OS prefers-color-scheme setting on first visit. Users who have never set a preference see dark mode automatically when their system is in dark mode. Explicit toggle preference is still persisted via app.storage.user and overrides auto-detection on subsequent visits.

[v7.0.6] - 2026-02-25

Changed

  • Single comprehensive PDF — the layout datastore detail (per-strategy tables with DS capacity, utilisation, IOPS, and workload types) is now appended directly to the main sizing report PDF. The separate Layout PDF download button is removed; one download from the Report page delivers everything.
  • Improved DS detail table formatting — workload-type column is now word-wrapped (no more 30-character truncation); column widths are rebalanced to fill the full A4 usable width (482 pt).
  • VM lists rendered as compact 3-column tables — replacing the previous plain comma-separated paragraph, making the assignment lists scannable at a glance.

Removed

  • generate_layout_pdf() function — superseded by the extended generate_report_pdf().
  • Layout PDF download button from the Layout page.

[v7.0.5] - 2026-02-25

Performance

  • Docker image ~389 MB smaller — eliminated the chown -R appuser:appuser /app layer by creating appuser before any COPY steps and using --chown=appuser:appuser on all COPY instructions. The venv is now created as appuser from the start; no ownership fixup layer is needed.

Changed

  • pyright moved to dev dependencies — it is a static type-checker, not a runtime dependency, and should not be installed in production containers.

[v7.0.4] - 2026-02-25

Changed

  • PDF charts: matplotlib Sankey replaced by Plotly + kaleido — the Sankey diagram in PDF exports is now rendered by plotly.graph_objects.Sankey and exported as PNG via kaleido, producing a cleaner, more professional output that closely matches the ECharts Sankey visible in the web UI.

Removed

  • Playwright / headless Chromium removed — PDF export no longer requires a browser. The existing ReportLab path (pdf_report.py) is now wired directly to both the Report and Layout download buttons. The HTML print routes (/report/print, /layout/print) and the one-time print-session token mechanism are deleted.
  • matplotlib removed — was only used for the Sankey diagram; replaced by Plotly.

Added

  • generate_layout_pdf() public function in pdf_report.py — standalone layout-recommendations PDF with optional PlacementConstraints parameter.

Dependencies

  • Added: plotly>=5.0, kaleido>=0.2
  • Removed: playwright>=1.40, matplotlib>=3.8

Docker

  • Image shrinks by ~430 MB — Playwright Chromium layer eliminated. (ADR-071)

[v7.0.3] - 2026-02-25

Dependencies

  • certifi 2026.1.4 → 2026.2.25
  • fastapi 0.132.0 → 0.133.0
  • hf-xet 1.2.0 → 1.3.1
  • litellm 1.81.14 → 1.81.15
  • mkdocs-material 9.7.2 → 9.7.3
  • nicegui 3.7.1 → 3.8.0
  • openai 2.21.0 → 2.24.0

[v7.0.2] - 2026-02-25

Fixed

  • PDF export broken in production container — Playwright's Chromium was installed to /root/.cache/ms-playwright (root) but the app runs as appuser, causing all PDF exports to fail with a browser-not-found error. Fixed by setting PLAYWRIGHT_BROWSERS_PATH=/ms-playwright in the Dockerfile and granting world read+execute on that path after install. (ADR-070)

[v7.0.1] - 2026-02-24

Performance

  • Docker build time ~5–10 s for code-only changes (down from 3+ minutes) — reordered Dockerfile layers so Python dependencies and Playwright are cached separately from source code. Uses uv sync --frozen --no-install-project to install deps before copying src/, BuildKit --mount=type=cache for the uv package cache, and UV_LINK_MODE=copy to suppress hardlink warnings.

[v7.0.0] - 2026-02-24

New features

  • Session save & restore — engineers can save a complete sizing session to a portable .zip archive and restore it later by re-uploading on the Upload page. The archive contains the original uploaded file plus a session.json snapshot capturing VM list, workload classifications, DRR overrides, layout settings, and compute settings. Works with all input formats (RVTools, LiveOptics xlsx/csv, dual-source merge).

  • Concerns remediation hints — every health finding card on /concerns now shows a concise actionable hint in italic gray text explaining what action to take (e.g., "Re-run RVTools after VMware Tools is installed to populate OS fields"). All 14 finding types across 13 health checks include hints.

  • Concerns PDF export — new "Export PDF" button on /concerns downloads a standalone A4 PDF report (ReportLab Platypus, Vera fonts) containing all findings with severity-colour-coded tables and remediation hint text. Independent of the main sizing report pipeline.

  • Concerns CSV export — new "Export CSV" button on /concerns downloads a UTF-8 BOM CSV (Excel-compatible) with one row per finding and columns: severity, check_id, title, detail, remediation, affected_count, cluster.

Bug fixes

  • Session restore: layout and compute pages no longer crash — after restoring a session saved before the layout or compute pages were visited, _load_constraints() and _load_compute_config() now use or-fallback defaults so that falsy restored values (0, 0.0, "") correctly resolve to page defaults (4 TB DS capacity, "R760" preset) instead of causing ValueError: Invalid value in ui.select().

Documentation

  • ADR-066: Session persistence via self-contained zip archive
  • ADR-067: SESSION_ZIP_SENTINEL to distinguish session archives from LiveOptics zips
  • ADR-068: Remediation hints as hardcoded English strings
  • ADR-069: Standalone concerns export as pure ReportLab
  • PRD updated to v7.0 (§4.11 session persistence, §4.12 concerns enhancements, updated user journey, milestone history)
  • Architecture updated: session persistence and concerns export modules

[v6.1.0] - 2026-02-23

Bug fixes

  • vMSC per-site sizing always available — the "Hosts per site (vMSC)" section no longer requires a Datacenter column with 2+ distinct values. The split ratio is applied to total VMs; the Datacenter column is informational only. Engineers can now use vMSC mode on any file, including single-datacenter or datacenter-less RVTools exports.

  • Type safety — fixed two mypy errors: unsafe int() cast on object in state.py, and union-attr on list | None iteration in layout_page.py.

  • LLM config teststest_llm_config_timeout_default and test_llm_config_max_concurrent_default now guard against shell-level LLM_TIMEOUT / LLM_MAX_CONCURRENT env vars overriding pydantic-settings defaults.

Documentation

  • ADR-064: Datacenter/cluster scope filtering as a dedicated pipeline stage
  • ADR-065: Windows Desktop OS fallback → VDI Linked Clone
  • PRD updated to v6.0 (§4.2b scope filtering, classification table, §11 shipped requirements)
  • Architecture updated: 5-stage pipeline, /scope in diagrams, session state scope helpers, rule count 43 → 50

[v6.0.0] - 2026-02-23

New features

  • Datacenter & cluster filtering — new /scope page (between upload and review) lets engineers select which datacenters and clusters to include in the analysis. All downstream pages (review, report, compute, layout, concerns) use the filtered dataset. Unselected VMs are preserved in session state so re-scoping never requires a re-upload. Scope badge shown in review/report headers; DC/cluster suffix appended to exported PDF and Excel filenames.

  • Improved workload classification — rule set updated from a 1,483-VM LiveOptics analysis that previously left 92 % of VMs in os_fallback:

  • Windows 10/11 desktop VMs now classify as VDI / Linked Clone instead of Virtual Machines (catches ~900 VMs in typical enterprise files)
  • New generic VDI rule (priority 224): VDI, DESKTOP, RDS, UAG, LOGINVSI, LOGINENTERPRISE
  • Containers: added TKG, HARBOR, and photon-*-kube regex for Tanzu node images
  • Email: added EXCHG abbreviation
  • File Content Servers: added SharePoint abbreviations SPBE, SPFE, SPOWA, SPOFFICE
  • Logging - Analytics: added LOGSTASH and KIBANA

Bug fixes

  • AG Grid reliability — explicit :valueGetter on vm_name column fixes silent field extraction failure after NiceGUI update_grid() cycles (AG Grid v34). typeof guard on :localeText prevents ReferenceError when CDN hasn't loaded. Grid refresh now uses run_grid_method("setGridOption") instead of update() for reliable data refresh without destroy/recreate cycles.
  • "New Analysis" button — clears session and navigates back to upload page without a full page reload.
  • Payload size_to_grid_rows() trims row data sent to AG Grid (~35 % smaller JSON on large files).
  • Type safety — fixed two mypy errors in state.py (unsafe int() cast on object) and layout_page.py (union-attr on list | None iteration).

[v5.0.0] - 2026-02-23

New features

  • Per-cluster compute breakdown — the /compute page now shows a breakdown table grouping host recommendations by cluster name when the RVTools file contains a Cluster column. A grand total row sums all clusters. Health check findings that apply per-cluster (HW version spread, HA ratio) display the cluster name alongside the finding on /concerns.

  • Health findings in exports — PDF report now includes a findings summary table (Critical / Warning / Info counts) on the main sizing page, and a dedicated findings detail appendix listing every finding sorted critical-first. Excel export includes a new "Findings" worksheet with columns: Finding, Severity, Category, Affected VMs, Detail, Cluster.

  • Configurable vMSC site split ratio — in vMSC (stretched cluster) mode, engineers can set any VM split percentage between sites (e.g. 60/40) instead of the fixed 50/50. The /compute settings panel exposes a 1–99% input visible only when vMSC is enabled. Site A and Site B host counts display as distinct labeled rows in the results card.

  • Configurable A/P DR active percentage — in Active/Passive DR mode, engineers can configure what percentage of VMs are active on the primary site (1–100%, default 100%). Secondary site is sized at 50% of the computed primary (cold standby convention).

  • PRD v5.0 — Product Requirements Document updated to reflect all v5.0 features, personas, and non-functional requirements.

[v4.0.1] - 2026-02-22

Bug fixes

  • Fix all event handlers on /compute — replaced .on("update:model-value") with .on_value_change(). Every control (preset selector, overcommit ratio, vMSC toggle, A/P toggle, spec inputs) was silently broken due to GenericEventArguments having no .value attribute; ValueChangeEventArguments does.
  • Ruff TC003 — moved from pathlib import Path into a TYPE_CHECKING block in compute_sizing.py (annotation-only use, safe with from __future__ import annotations).

UX improvements

  • vCPU / RAM breakdown in N+1 card — displays both sub-counts (e.g. "vCPU-based: 11 · RAM-based: 20") so users can see exactly which constraint binds and how the other count moves when adjusting the overcommit ratio.
  • Host spec inputs always visible — cores/socket, sockets, and RAM are no longer hidden behind a "Custom" preset selection; all three inputs appear at all times.
  • Preset auto-populate — selecting a named preset fills the spec fields from its base config; selecting "Custom" leaves the current field values untouched.
  • One preset per server model — dropdown simplified from 16 spec-laden variants (e.g. "R760 (2x28c / 512 GiB)") to 7 clean model names: R760, R770, R860, R960, R7725, XE7745, Custom.
  • Remove duplicate heading — "Configuration de l'hôte" was rendering twice in the settings panel; the redundant card label was removed.

[v4.0.0] - 2026-02-22

Grid UX improvements, per-VM hardware data, and a new health check concerns page.

Grid UX & VM Data (Phase 20)

  • Quick-filter search box above the VM review grid — filters all visible columns instantly on each keystroke via AG Grid quickFilterText
  • Column visibility panel — collapsible expansion above the grid with four checkboxes (vCPUs, RAM, Avg IOPS, Peak IOPS) toggling column visibility via setColumnsVisible; replaces AG Grid sidebar (Enterprise-only, unavailable in Community edition)
  • Hidden column definitions added to the VM grid: num_cpus, memory_mib, avg_iops, peak_iops — hidden by default, revealed on demand
  • Stable row identity — AG Grid getRowId switched from vm_name to String(params.data.row_index), fixing row corruption for customer files with duplicate VM names (linked clones, template copies)
  • row_index added to CANONICAL_COLUMNS and assigned as a contiguous integer in ingest_file() after template filtering
  • Cell-change and bulk-update handlers updated to match rows by row_index (int) instead of vm_name string comparison

Health Check & Concerns Page (Phase 21)

  • New /concerns page — surfaces data quality flags, sizing risks, and VMware best practice violations derived from the current session without re-classifying
  • 11 health checks across three categories:
  • Data Quality: missing OS, zero provisioned storage, missing vCPU/RAM, high powered-off VM ratio (>30%)
  • Sizing Risks: high Unknown VM ratio (>25%), large Unknown VMs (>1 TiB), single VM exceeding 100K IOPS/datastore budget
  • VMware Best Practices: no cluster assignment, old HW version (<vHW17 / ESXi 7.0), very old HW version (<vHW14 / ESXi 6.7, Critical), VMware Tools not installed (Critical), VMware Tools not running
  • Findings colour-coded by severity: Critical=red, Warning=yellow, Info=blue
  • Powered-off VMs and templates excluded from best-practice checks
  • hw_version=0 sentinel guard: LiveOptics exports skip hardware-version checks rather than falsely flagging every VM as old hardware
  • hw_version and tools_status added to CANONICAL_COLUMNS; RVTools parser reads them with graceful fallback (0 / "") when column absent
  • LiveOptics parser sets sentinel values hw_version=0, tools_status=""
  • Page uses load_session_data() — user edits from the Review grid are preserved; HealthCheckResult is never cached in session storage

Compute Sizing Module & Page (Phase 22)

  • New /compute page — reactive ESXi host count recommendations from the uploaded session data, with no re-ingestion; uses load_session_data() only
  • N+1 HA sizing — recommended host count = max(hosts_by_vcpu, hosts_by_ram) + 1 with configurable vCPU overcommit ratio (0.5–20.0, default 4.0)
  • vMSC (stretch cluster) mode — toggle reveals per-datacenter host counts; shows a warning card when no datacenter column data is available in the export
  • Active/Passive DR mode — toggle reveals primary site hosts and secondary site = ceil(primary / 2) (minimum 1)
  • 17 Dell PowerEdge presets loaded from compute_presets.csv (editable without code changes), covering:
  • R760 (Xeon 5th Gen: 28c, 32c, 48c variants)
  • R770 (Xeon 6 P-core: 6748P 48c, 6780P 64c, 6786P 86c)
  • R860/R960 (Xeon 5th Gen 4-socket: up to 56c/6 TiB)
  • R7725 (EPYC 9005 Turin: 9555 64c, 9655 96c, 9755 128c, 9955 192c Zen5c)
  • XE7745 AI server (EPYC 9005 Turin: 64c, 96c)
  • Custom (user-defined cores/socket, sockets, RAM)
  • Preset selector, overcommit input, and mode toggles are session-scoped (app.storage.tab); result cards refresh reactively on every change
  • Aggregate cards: active vCPU total, RAM total (GiB), excluded VM count
  • HostConfig, ComputeSizingResult frozen dataclasses; zero UI imports in pipeline/compute_sizing.py
  • load_presets(path) public function for loading alternate CSV files

LLM Classifier Enhancement

  • vm_description field (RVTools Annotation / LiveOptics Description) now included in LLM classifier prompts as an optional classification signal
  • Description truncated to 200 chars, newlines stripped; only included when non-empty to keep token usage lean

Tests

  • 49 new health check tests covering all 11 check IDs, sentinel guards, powered-off/template exclusion, and affected_vms tuple contract
  • 386 total tests passing

[v3.2.0] - 2026-02-22

Annotation-based VM classification for healthcare and application workloads.

Classifier

  • Fix two-pass classification logic: OS-fallback rules (priority ≥ 900) are now skipped in pass 1 when an annotation (vm_description) is present, allowing pass 2 to match richer annotation content before falling back to OS heuristics
  • Expand HealthCare/EMR-EHR rule with 25+ application keywords:
  • Radiology & imaging: PACS, INTELLISPACE, GLEAMER, AZMED, RAYVOLVE, TRAUMACAD
  • Hospital IS (French/Swiss & European ecosystem): OPALE, CARIATIDE, HANDYLIFE, POLYPOINT, MEDIDATA, DATABICS, PROCAMED, SEDIA, DGLAB, STERIGEST, WINSCRIBE, SYNLAB, EXOLIS, SCENARA, MIRTH, KODIP
  • Regex anchors: \bRIS\b (Radiology IS), \bSIEMS\b, \bHESTIA\b, Bloc-?Op
  • Add TOMCAT, FORTIWEB to Web Servers rule
  • Add PRTG to Logging/Analytics rule
  • Add APP VOLUMES / APPVOL to VDI Profiles rule
  • Add ALFRESCO to File Content Servers rule
  • Add FILEMAKER, CLARIS, SQLITE to MySQL/NoSQL rule
  • Word-boundary guards: SIEMS (avoids SIEMENS), HESTIA (avoids HestiaCP)

[v3.0.0] - 2026-02-21

Datastore layout recommendations for PowerStore sizing.

Layout Engine

  • Three layout strategies: Consolidation (BFD bin-packing), Performance (mission-critical isolation + tier BFD), Uniform (LPT equal distribution)
  • Multi-dimensional BFD algorithm respecting capacity, IOPS budget, and VM count constraints per datastore
  • Default 4 TiB datastores, 25 VMs/DS, 100K IOPS/DS (all tunable via PlacementConstraints)
  • Oversized VMs (>usable capacity) automatically placed in dedicated datastores
  • generate_all_proposals() public API returning all 3 strategy proposals

Default IOPS Estimates

  • Workload-based IOPS estimates for RVTools imports (no LiveOptics performance data)
  • 8 workload categories: Database/SQL (500), Oracle (800), SAP HANA (1000), VDI (30-50), generic VMs (50), File (100)
  • Configurable via src/store_predict/data/IOPS.csv (semicolon-delimited, same pattern as DRR.csv)
  • Hardcoded fallback when CSV is missing — tests remain independent

Documentation

  • ADR-059: Workload-based IOPS defaults for RVTools sizing
  • Research page: Default IOPS domain knowledge (sources, conservative bias, peak vs average)
  • Architecture docs updated with layout engine as 4th pipeline stage

Tests

  • 46+ layout engine tests covering BFD packing, 3 strategies, metrics, IOPS injection, CSV loading

[v2.2.0] - 2026-02-21

Observability, developer experience, and project health improvements.

LLM Classification Improvements

  • Live progress counter in UI notification during AI classification: "AI classification: 42 / 496 VMs"
  • on_progress callback added to classify_unknown_vms_async for UI integration
  • Ready-to-paste ClassificationRule(...) snippets now logged to server logs after LLM pass, allowing operators to promote LLM findings to deterministic rules without restarting

CI / GitHub

  • GitHub Release v2.1.0 created (was missing — tag existed but Release page had not been generated)
  • ci.yml: added permissions: contents: read (workflow security hardening)
  • ci.yml: added codecov/codecov-action@v5 upload step with CODECOV_TOKEN
  • ci.yml: added --cov-report=xml to generate Codecov-compatible report
  • Coverage measurement scoped to testable backend code (UI layer omitted — NiceGUI pages require a live server)
  • Effective coverage: 84% (up from misleading 51% that included untestable UI)

README

  • Added badges: CI, Docs, Release, Codecov coverage, Python version, Version
  • Fixed stale "29 classification rules" → 43 rules

Tests

246 tests passing (unchanged); ruff and mypy clean.

[v2.1.0] - 2026-02-20

Application-level DRR variants, DDVE support, and AI classification UI toggle.

DRR Reference Table (+14 entries, 28 → 42 total)

New subcategories covering application-layer encryption and compression scenarios where PowerStore's inline dedup/compression is partially or fully defeated:

  • Database / Oracle - HCC (App Compressed) → DRR 2.5
  • Database / Oracle - TDE (Encrypted) → DRR 1.5
  • Database / Oracle - HCC + TDE → DRR 1.2
  • Database / Microsoft SQL - Page Compressed → DRR 2.5
  • Database / Microsoft SQL - TDE (Encrypted) → DRR 1.5
  • Database / Microsoft SQL - Page Compressed + TDE → DRR 1.2
  • Database / MongoDB - Encrypted → DRR 1.3
  • Database / PostgreSQL - Encrypted → DRR 1.3
  • Database / My SQL / NoSQL - Encrypted → DRR 1.3
  • Containers / Kubernetes - Encrypted PVs → DRR 1.3
  • VM Replication / Commvault → DRR 1.5
  • VM Replication / Veeam - Compressed + Dedup → DRR 1.2
  • VM Replication / Commvault - Compressed + Dedup → DRR 1.2
  • VM Replication / Data Domain Virtual Edition (DDVE) → DRR 1.0 (already deduplicated — 1:1 at most)

Classifier (+14 rules, priorities 88–97 and 293–297)

Pattern matching for encrypted/compressed VM naming conventions. Combined scenarios (e.g. Oracle HCC + TDE) use regex lookaheads for AND matching. DDVE, Commvault, and compressed Veeam/Commvault variants also added.

AI Classification UI Toggle

Per-session ui.switch on the upload page to disable LLM classification without server restart. Greyed out with hint when LLM_ENABLED=false. State persisted in app.storage.tab["llm_ui_enabled"].

Documentation

  • ADR-052: Flat DRR override for non-PowerStore storage models
  • ADR-053: Application-level DRR degradation as CSV subcategory variants
  • ADR-054: AI classification toggle is per-session, not a server restart
  • Research phase 14: application-level data reduction findings with source references
  • architecture.md updated: storage model section, DRR/rule counts, session state

Tests

246 tests passing (up from 230); ruff and mypy clean.

[v2.0.0] - 2026-02-20

Multi-platform storage model selection — breaking UX change: DRR values now depend on the selected target storage platform, not only on workload type.

Target Storage Model Selector

  • New StorageModel enum in config.py: POWERSTORE (full dedup+compression, per-workload DRR), POWERFLEX (compression only, flat 2.0), POWERVAULT (no reduction, flat 1.0)
  • apply_storage_model() added to services/drr_table.py — overwrites per-VM DRR in session based on selected platform
  • get_storage_model() / set_storage_model() added to ui/state.py for tab-scoped session persistence
  • Review page now shows a ui.toggle selector (PowerStore / PowerFlex / PowerVault) above the summary stats; switching instantly recalculates all DRR values, refreshes the grid and stats
  • Model is applied at page load so navigating back from the report preserves the selection
  • Report page picks up overridden DRR values automatically — no changes required
  • 6 i18n keys added (storage_model.label, .powerstore, .powerflex, .powervault) in both en.yaml and fr.yaml
  • 3 new tests for apply_storage_model() (PowerVault→1.0, PowerFlex→2.0, PowerStore→table values); 230 tests passing, ruff and mypy clean

[v1.1] - 2026-02-20

i18n, Branding & Intelligence milestone.

Phase 13: Graphics (COMPLETE)

  • src/store_predict/services/charts.py — four ECharts option-dict builders: echart_sankey_options, echart_pie_options, echart_drr_bar_options, echart_before_after_options; all use Dell blue #007DB8 palette; Sankey falls back to grouped bar when fewer than 2 workload groups
  • src/store_predict/services/pdf_charts.py — four ReportLab/matplotlib builders: make_sankey_image_flowable (lazy matplotlib import, Spacer guard for empty data), make_pie_drawing, make_drr_bar_drawing, make_before_after_bar_drawing
  • report.py_build_charts_section() added: Sankey full-width, pie + DRR bar in two-column grid, before/after bar full-width; only rendered when workload groups exist
  • pdf_report.py — second PDF page added via PageBreak() + chart flowables; on_later_pages callback ensures Dell branded header on page 2
  • matplotlib>=3.8 confirmed in runtime dependencies; mypy overrides for matplotlib.* added
  • 6 i18n keys added across en.yaml and fr.yaml (pdf.charts_heading, pdf.sankey_title, pdf.pie_title, pdf.drr_bar_title, pdf.before_after_title, report.charts_heading)
  • 227 tests passing, ruff and mypy clean

Phase 12: UX Polish (COMPLETE)

  • Upload page refactored with spinner, linear progress bar, and run.io_bound pipeline offloading for a responsive event loop during 2-10 second processing
  • Persistent LLM ui.notification (spinner=True, timeout=None) updated in-place to positive/negative outcome instead of fire-and-forget notify
  • Review and report pages upgraded from plain links to card-with-CTA empty states (icon + label + button)
  • PDF and Excel download buttons now disable during generation and re-enable via try/finally guard
  • Company logo upload error message replaced with t("error.logo_upload_failed") i18n key
  • Added 8 new i18n keys across en.yaml and fr.yaml (upload.processing, llm.error, error.unexpected, error.logo_upload_failed)
  • Raw exception strings replaced with i18n messages across all user-facing error paths
  • All ui.notify() type values audited to canonical NiceGUI types (positive/negative/warning/info)
  • 20-test suite in test_ux_polish.py locking in UX patterns; full suite grows to 227 passed, 1 skipped

Phase 11: LLM Classification Fallback (COMPLETE)

  • LLMConfig pydantic-settings class reads 6 env vars (LLM_ENABLED/MODEL/API_KEY/API_BASE/TIMEOUT/MAX_CONCURRENT) with SecretStr masking for the API key
  • classify_unknown_vms_async async function filters only "default" confidence VMs, runs bounded concurrency via asyncio.Semaphore, and logs only counts (never VM names)
  • classify_single_vm applies input sanitization against prompt injection (truncate vm_name/os_name, strip newlines), asyncio timeout, and circuit breaker (3 failures -> 60s cooldown)
  • LLM fallback wired into upload pipeline behind llm_cfg.enabled guard — feature is opt-in via LLM_ENABLED=true env var, never active in CI
  • User notifications: persistent spinner before LLM pass, count notification after
  • docker-compose.yml updated with env_file (required: false) and LLM_* env var stubs pointing to OpenRouter/Mistral defaults
  • .env.example added and tracked in git as operator onboarding guide
  • 7-test suite for config and classifier; pydantic-settings added to runtime dependencies

Phase 10: PDF Branding (COMPLETE)

  • Dell partner logo PNG bundled as package data and loaded at import time (Docker-safe path resolution)
  • _preprocess_logo() normalizes any image mode (RGBA/RGB/P/JPEG) to RGBA PNG before ReportLab embedding, preventing black-background palette images
  • validate_logo() validates PNG/JPEG by extension, magic bytes, file size, and image dimensions, raising IngestionError for user-facing messages
  • generate_report_pdf() extended with backwards-compatible dell_logo_bytes and company_logo_bytes kwargs
  • Company logo upload UI: ui.upload card on report page accepting .png/.jpg/.jpeg up to 200 KB with remove button
  • Logo stored as base64 in app.storage.tab (tab-scoped session isolation) and decoded on PDF download
  • _on_download() passes decoded company_logo_bytes to generate_report_pdf(), embedding customer logo in PDF header
  • Pillow added to runtime dependencies; 16 branding tests + 11 logo UI wiring tests; 200 total tests

Phase 9: Excel Export (COMPLETE)

  • generate_report_xlsx(summary, project_name, locale) -> bytes pure function mirroring the PDF service shape (same locale param, same BytesIO pattern)
  • Three styled sheets: Summary (label-value metrics), Workload Breakdown (category subtotals + totals row), VM Detail (per-VM row with optional performance columns)
  • Brand blue (#1e3a5f) header row with white bold text, freeze panes at row 1, autofit columns on all sheets
  • Alternate row colouring on body rows; performance columns/rows gated on has_performance_data flag
  • 18 new excel.* i18n keys in both en.yaml and fr.yaml; EN and FR outputs verified to differ in bytes
  • Green "Download Excel Report" button wired on report page between PDF and Back buttons
  • _on_download_excel handler mirrors _on_download: assert summary type, generate bytes, sanitize filename, ui.download
  • XlsxWriter mypy override added; 8-test suite validating magic bytes, locale switching, performance guard, and sheet count

Phase 8.1: LiveOptics ZIP Extraction (COMPLETE)

  • ZIP accepted as a fourth upload format alongside .xlsx and .csv
  • extract_liveoptics_from_zip(content: bytes) -> tuple[bytes, str] module finds the LiveOptics xlsx member by case-insensitive regex pattern
  • Zip bomb guard rejects archives whose total uncompressed bytes exceed 100 MB (central directory header check, no extraction needed)
  • ZIP extraction runs before validate_upload() so extracted xlsx bytes go through existing validation logic unchanged
  • validation.py extended to accept "zip" extension and PK magic bytes; upload accept prop updated to .xlsx,.csv,.zip
  • 7-test suite with real in-memory zipfile objects covering happy path, pattern mismatch, no match, invalid zip, multiple members, and bomb guard
  • Zero regressions; 165 tests passing after addition

Phase 8: i18n Foundation (COMPLETE)

  • t() translation helper backed by python-i18n YAML files with %{variable_name} placeholder syntax
  • Tab-scoped get_locale() / set_locale() session helpers safe outside NiceGUI context (catches RuntimeError for pytest)
  • English and French YAML locale files with 73 strings across 8 namespaces (layout, upload, review, report, stats, dialog, columns, pdf)
  • add_locale_toggle() FR/EN toggle button triggering full page reload (required because ui.header cannot be in @ui.refreshable)
  • French is the default locale per project convention; toggle label shows the switch-target language
  • All 65 UI-layer strings in 8 files wrapped in t() calls; no hardcoded labels remain
  • AG Grid configured with French CDN locale pack (ag-grid-community/locale@32.2.2) and :localeText JS binding, injected only when locale is 'fr'
  • PDF localized: generate_report_pdf() accepts locale param; _i18n.set('locale', locale) called once before all t() calls
  • 13-test i18n unit suite covering EN/FR lookup, placeholder substitution, get_locale() safety, and PDF locale correctness

[v1.0] - 2026-02-19

MVP Sizing Tool milestone.

Phase 7: UI Bug Fixes & Report Enhancements (COMPLETE)

  • Fixed AG Grid "No Rows To Show" — NiceGUI requires : prefix for JS function properties
  • Fixed NaN serialization chain: NaN → None (not empty string) for JSON compatibility
  • NiceGUI auto-reload with __mp_main__ guard for multiprocessing
  • LiveOptics performance columns: Peak IOPS, 8K Eq. IOPS, Peak MB/s (conditional on data)
  • 8K IOPS normalization fix: throughput_KB/s / 8 (was double-counting with avg_iops)
  • Editable DRR column for custom overrides (min 0.1)
  • Bulk workload update: select multiple VMs via checkboxes, mass-assign workload category
  • Workload dropdown popup (cellEditorPopup: True) for readable category labels
  • Filtered select-all: header checkbox selects only visible (filtered) rows
  • CPU/memory metrics: num_cpus and memory_mib in parsers, calculation, report, and PDF
  • Report reorganized into Totals and Averages sections (web + PDF)
  • Replaced misleading "Total Peak IOPS" with "Hottest VM Peak IOPS" (single VM max)
  • WorkloadDialog fixed to accept plain strings (not dicts) for NiceGUI ui.select
  • 145 tests passing, 1 skipped

Phase 6: Polish, Docs & Deployment (COMPLETE)

  • Docker hardening: .dockerignore, HEALTHCHECK directive, env-var STORAGE_SECRET
  • Server-side file upload validation with magic-byte checks (XLSX zip header, CSV UTF-8)
  • Logging configuration with sanitization guidance (never log DataFrame contents)
  • Session isolation verification via app.storage.tab (tab-scoped)
  • Performance benchmark tests: 5000 VM classification < 10s, PDF generation < 5s
  • MkDocs documentation: architecture page with 3 Mermaid diagrams, getting-started guide
  • Project README with Docker and local dev quickstart
  • GitHub Actions CI: ruff check, ruff format, mypy, pytest on push/PR to main
  • GitHub Actions docs: MkDocs deployment to GitHub Pages on push to main
  • 15 new tests (validation + log sanitization + performance), 121 total tests passing

Phase 5: Calculation & PDF Report (COMPLETE)

  • Calculation service with per-VM required capacity (provisioned_mib / drr)
  • Workload grouping with subtotals (VM count, provisioned, in-use, required per category)
  • Weighted average DRR (total_provisioned / total_required, not simple average)
  • Division-by-zero guard: max(drr, 0.1) prevents invalid calculations
  • Missing field defaults via .get() for robustness with incomplete data
  • PDF report generator using ReportLab Platypus with branded one-page layout
  • Dark blue header bar with StorePredict branding
  • Workload breakdown table in PDF (Category, VMs, Provisioned, Avg DRR, Required)
  • Vera/VeraBd TTF fonts for French character support (accents, special chars)
  • Storage formatting helper: MiB to GiB with TiB display for large values
  • Report page at /report with summary cards and workload breakdown table
  • PDF download button triggering browser download
  • Navigation wiring: Review → Report button, Report link in nav bar
  • 24 new tests (12 calculation + 12 PDF), 106 total tests passing

Phase 4: UI — Upload & Review Pages (COMPLETE)

  • Session state module for per-tab DataFrame serialization (ui/state.py)
  • Upload page with file dropzone, project name input, pipeline integration
  • AG Grid VM table component with inline workload dropdown (ADR-007)
  • Multi-select workload dialog for assigning multiple workload types (ADR-009)
  • Summary statistics cards (Total VMs, Provisioned, Avg DRR, Effective Capacity)
  • Review page wiring all components: table, dialog, stats, DRR recalculation
  • Dark mode toggle with persistent user preference via app.storage.user (ADR-008)
  • Navigation header with Home, Upload, and Review links
  • Cell change handler: inline workload dropdown updates DRR and stats
  • Row click handler: multi-select dialog applies conservative (lowest) DRR
  • Per-tab session isolation for upload data, per-user storage for preferences (ADR-008)

Phase 3: Workload Classification Engine

  • Classification engine with 29 priority-ordered rules covering all 28 DRR subcategories
  • ClassificationRule dataclass with pattern matching on VM name and OS field
  • RuleRegistry with first-match-wins evaluation and confidence tracking
  • Substring matching (CADSRVSQL001 -> SQL) with false positive prevention
  • OS-based fallback rules (Windows Server -> Virtual Machines)
  • classify_dataframe() for bulk DataFrame classification
  • 0% Unknown rate on 594 real LiveOptics VMs (target was <20%)
  • 28 unit tests + 11 integration tests with real sample data

Phase 2: File Ingestion Pipeline

  • RVTools .xlsx parser (vInfo tab)
  • LiveOptics .xlsx and .csv parsers (VMs tab)
  • Format auto-detection based on sheet names and column headers
  • Column alias resolution for name variations
  • Template VM filtering
  • Unified ingest_file() orchestrator
  • 29 ingestion tests with real sample files

Phase 1: Project Foundation & DRR Table

  • Python project structure with src layout
  • DRR table service loading 28 workload categories from CSV
  • Data models (VM, FileFormat, WorkloadCategory)
  • NiceGUI app skeleton with page routing
  • ruff + mypy configuration
  • pytest setup with 14 initial tests
  • Dockerfile + docker-compose.yml