Changelog
All notable changes to StorePredict are documented here.
[Unreleased]
[v7.2.0] - 2026-03-15
Changed
- Compute sizing replaced with PreSizion redirect: the /compute page now links to PreSizion, a dedicated tool with advanced compute, storage, and network sizing. All compute sizing pipeline code, the presets CSV, and associated tests have been removed. Session archives silently ignore legacy compute keys on restore. See ADR-076.
[v7.1.5] - 2026-02-26
Fixed
- Upload endpoint 422 error: Request was imported inside the TYPE_CHECKING block, but from __future__ import annotations made FastAPI unable to resolve the type at runtime, causing every upload to silently return HTTP 422. Moved to a runtime import.
- ZIP extraction too strict: only files matching the canonical LiveOptics_*_VMWARE_*.xlsx pattern inside ZIPs were accepted. Extraction now falls back to any .xlsx in the archive, supporting RVTools-in-zip and non-standard LiveOptics exports.
- Silent upload errors: IngestionError during file processing was caught but never logged; added logger.warning so validation failures always appear in server logs. Error notifications now persist (timeout=0) instead of auto-dismissing.
- Chunk assembly off-by-one guard: added a max_end >= total_size check alongside the byte-count comparison to handle potential Content-Range total mismatches from Quasar.
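The upload endpoint failure can be reproduced without FastAPI. The sketch below is an illustration, not the project's code: it uses the standard library's typing.get_type_hints as a stand-in for runtime annotation resolution, Decimal as a stand-in for Request, and a quoted annotation to mimic what from __future__ import annotations does to every annotation. All three substitutions are assumptions made to keep the example self-contained.

```python
import typing
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; this import never runs.
    from decimal import Decimal


def handler(amount: "Decimal") -> None:
    """Hypothetical endpoint. The quoted annotation mimics postponed evaluation."""


def can_resolve(func: typing.Callable[..., object]) -> bool:
    """Return True if every annotation of func can be evaluated at runtime."""
    try:
        typing.get_type_hints(func)
    except NameError:
        return False
    return True


# Decimal is not defined at runtime yet, so resolution fails.
before_fix = can_resolve(handler)

# The fix: a real runtime import makes the name resolvable.
from decimal import Decimal  # noqa: E402,F811

after_fix = can_resolve(handler)
```

A framework that inspects annotations at runtime hits the same NameError path, which is why moving the import out of the TYPE_CHECKING block resolves it.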
[v7.1.4] - 2026-02-26
Fixed
- Chunked upload for corporate proxies: files are now uploaded in 2 MB chunks via a dedicated /api/upload/{token} endpoint instead of a single large multipart request. This resolves uploads being cut off mid-transfer (~60%) on enterprise networks with proxy timeout limits. A ui.timer polls the per-session queue and triggers the pipeline once all chunks are assembled server-side.
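A minimal sketch of server-side chunk assembly with a two-part completeness check, in the spirit of the guard added in v7.1.5. ChunkAssembler and its methods are hypothetical names, and max_end is assumed here to be an exclusive end offset; the real endpoint may track ranges differently.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkAssembler:
    """Collects byte ranges for one upload (hypothetical helper)."""

    total_size: int
    chunks: dict[int, bytes] = field(default_factory=dict)  # start offset -> data

    def add(self, start: int, data: bytes) -> None:
        # A re-sent chunk with the same start offset replaces the earlier copy.
        self.chunks[start] = data

    def is_complete(self) -> bool:
        received = sum(len(data) for data in self.chunks.values())
        max_end = max((start + len(data) for start, data in self.chunks.items()), default=0)
        # Require both: enough bytes received, and the final byte covered.
        return received >= self.total_size and max_end >= self.total_size

    def assemble(self) -> bytes:
        # Concatenate chunks in offset order.
        return b"".join(self.chunks[start] for start in sorted(self.chunks))
```

Checking max_end as well as the byte count means overlapping chunks cannot make an upload look finished before its tail has arrived.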
[v7.1.3] - 2026-02-25
Added
- Open Sans in Excel report: all XlsxWriter cell formats now specify "Open Sans" (bold/header) or "Open Sans Light" (body/numbers), matching PDF report typography.
Fixed
- PDF chart labels font: bar charts, pie chart, and Sankey diagram now render axis/slice labels in Open Sans via the FONT_REGULAR constant imported from the shared _fonts.py module.
- Rebase artifact: stray commit-message text left in pdf_charts.py line 238 by a rebase conflict caused a SyntaxError at runtime; removed.
[v7.1.2] - 2026-02-25
Added
- Open Sans fonts bundled: OpenSansLight.ttf and OpenSansSemiBold.ttf (OFL) shipped in data/; registered as AppFont/AppFontBd via _register_fonts() with automatic fallback to Vera when fonts are absent (test environments).
- KPI card strip: totals section replaced by two rows of brand-blue KPI cards (_make_kpi_cards): VMs / CPUs / Memory on row 1, Provisioned / In-Use / Required on row 2. Values use a compact single-unit formatter (_fmt_kpi_storage: "5.2 TiB" instead of "5284.0 GiB (5.2 TiB)") to prevent wrapping at 17 pt.
- Page-number footer: _draw_footer draws a #cccccc rule and a centred grey page number on every page.
- Section rules: 1.5 pt brand-blue HRFlowable added after every section heading (Totals, Averages, Performance, Breakdown, Health, Charts, Layout, Findings detail).
- Health table orphan fix: health findings summary block wrapped in KeepTogether so the heading and severity table always land on the same page.
- Datastore → VM styled header: per-datastore VM lists now use a single Table per datastore. Row 0 is the DS name as a light-blue (#d0e8f4) spanning header with brand-blue bold text; rows below are the 3-column VM name grid.
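The compact single-unit formatting used on the KPI cards could look roughly like this. fmt_kpi_storage is a hypothetical stand-in for _fmt_kpi_storage; the GiB input unit, the 1024 threshold, and one decimal place are assumptions inferred from the "5.2 TiB" example.

```python
def fmt_kpi_storage(gib: float) -> str:
    """Format a capacity given in GiB using a single unit (illustrative only)."""
    if gib >= 1024:
        # Switch to TiB once the value reaches 1 TiB.
        return f"{gib / 1024:.1f} TiB"
    return f"{gib:.1f} GiB"
```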
Changed
- Heading style parent changed from Heading2 to Normal (removes left indent); colour set to _BRAND_BLUE; font switched to AppFontBd (Open Sans SemiBold).
- Dell logo no longer auto-injected from the bundled asset; the logo only appears when the caller explicitly provides bytes.
- All "Vera"/"VeraBd" literals in pdf_report.py replaced by _FONT_REGULAR/_FONT_BOLD constants.
[v7.1.1] - 2026-02-25
Fixed
- PDF in container: replaced Plotly + kaleido Sankey with the matplotlib Agg backend. kaleido 1.2.0 required a headless browser unavailable in the slim Docker image, causing PDF generation to fail silently. Cubic Bezier sigmoid flow bands rendered via FigureCanvasAgg give the same professional appearance without any system dependencies.
Removed
- plotly>=5.0 and kaleido>=0.2 dependencies (replaced by matplotlib>=3.8)
[v7.1.0] - 2026-02-25
Changed
- MkDocs navigation clean-up: individual ADR and Research pages are now declared not_in_nav; mkdocs build is warning-free. Pages are still built and fully reachable via their index tables (adr/index.md, research/index.md). Navigation collapses to index pages only, keeping the sidebar concise.
Internal
- GSD planning files updated: v5.0 and v7.0 milestone accomplishments filled in, v7.0.x key decisions captured in PROJECT.md, stale Playwright references removed, phase directories 27–28 archived to milestones/v7.0-phases/.
[v7.0.7] - 2026-02-25
Added
- Automatic dark mode: the app now follows the browser/OS prefers-color-scheme setting on first visit. Users who have never set a preference see dark mode automatically when their system is in dark mode. An explicit toggle preference is still persisted via app.storage.user and overrides auto-detection on subsequent visits.
[v7.0.6] - 2026-02-25
Changed
- Single comprehensive PDF — the layout datastore detail (per-strategy tables with DS capacity, utilisation, IOPS, and workload types) is now appended directly to the main sizing report PDF. The separate Layout PDF download button is removed; one download from the Report page delivers everything.
- Improved DS detail table formatting — workload-type column is now word-wrapped (no more 30-character truncation); column widths are rebalanced to fill the full A4 usable width (482 pt).
- VM lists rendered as compact 3-column tables — replacing the previous plain comma-separated paragraph, making the assignment lists scannable at a glance.
Removed
- generate_layout_pdf() function: superseded by the extended generate_report_pdf().
- Layout PDF download button from the Layout page.
[v7.0.5] - 2026-02-25
Performance
- Docker image ~389 MB smaller: eliminated the chown -R appuser:appuser /app layer by creating appuser before any COPY steps and using --chown=appuser:appuser on all COPY instructions. The venv is now created as appuser from the start; no ownership fixup layer is needed.
Changed
- pyright moved to dev dependencies: it is a static type-checker, not a runtime dependency, and should not be installed in production containers.
[v7.0.4] - 2026-02-25
Changed
- PDF charts: matplotlib Sankey replaced by Plotly + kaleido. The Sankey diagram in PDF exports is now rendered by plotly.graph_objects.Sankey and exported as PNG via kaleido, producing a cleaner, more professional output that closely matches the ECharts Sankey visible in the web UI.
Removed
- Playwright / headless Chromium removed: PDF export no longer requires a browser. The existing ReportLab path (pdf_report.py) is now wired directly to both the Report and Layout download buttons. The HTML print routes (/report/print, /layout/print) and the one-time print-session token mechanism are deleted.
- matplotlib removed: it was only used for the Sankey diagram; replaced by Plotly.
Added
- generate_layout_pdf() public function in pdf_report.py: standalone layout-recommendations PDF with optional PlacementConstraints parameter.
Dependencies
- Added: plotly>=5.0, kaleido>=0.2
- Removed: playwright>=1.40, matplotlib>=3.8
Docker
- Image shrinks by ~430 MB — Playwright Chromium layer eliminated. (ADR-071)
[v7.0.3] - 2026-02-25
Dependencies
- certifi 2026.1.4 → 2026.2.25
- fastapi 0.132.0 → 0.133.0
- hf-xet 1.2.0 → 1.3.1
- litellm 1.81.14 → 1.81.15
- mkdocs-material 9.7.2 → 9.7.3
- nicegui 3.7.1 → 3.8.0
- openai 2.21.0 → 2.24.0
[v7.0.2] - 2026-02-25
Fixed
- PDF export broken in production container: Playwright's Chromium was installed to /root/.cache/ms-playwright (root) but the app runs as appuser, causing all PDF exports to fail with a browser-not-found error. Fixed by setting PLAYWRIGHT_BROWSERS_PATH=/ms-playwright in the Dockerfile and granting world read+execute on that path after install. (ADR-070)
[v7.0.1] - 2026-02-24
Performance
- Docker build time ~5–10 s for code-only changes (down from 3+ minutes): reordered Dockerfile layers so Python dependencies and Playwright are cached separately from source code. Uses uv sync --frozen --no-install-project to install deps before copying src/, BuildKit --mount=type=cache for the uv package cache, and UV_LINK_MODE=copy to suppress hardlink warnings.
[v7.0.0] - 2026-02-24
New features
- Session save & restore: engineers can save a complete sizing session to a portable .zip archive and restore it later by re-uploading on the Upload page. The archive contains the original uploaded file plus a session.json snapshot capturing VM list, workload classifications, DRR overrides, layout settings, and compute settings. Works with all input formats (RVTools, LiveOptics xlsx/csv, dual-source merge).
- Concerns remediation hints: every health finding card on /concerns now shows a concise actionable hint in italic gray text explaining what action to take (e.g., "Re-run RVTools after VMware Tools is installed to populate OS fields"). All 14 finding types across 13 health checks include hints.
- Concerns PDF export: new "Export PDF" button on /concerns downloads a standalone A4 PDF report (ReportLab Platypus, Vera fonts) containing all findings with severity-colour-coded tables and remediation hint text. Independent of the main sizing report pipeline.
- Concerns CSV export: new "Export CSV" button on /concerns downloads a UTF-8 BOM CSV (Excel-compatible) with one row per finding and columns: severity, check_id, title, detail, remediation, affected_count, cluster.
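As an illustration of the Excel-compatible CSV export, the sketch below writes the listed columns with a UTF-8 byte order mark. findings_to_csv_bytes and the dict-per-finding input shape are assumptions; only the column names come from the entry above.

```python
import csv
import io

COLUMNS = [
    "severity", "check_id", "title", "detail",
    "remediation", "affected_count", "cluster",
]


def findings_to_csv_bytes(findings: list[dict[str, object]]) -> bytes:
    """Serialize findings to CSV bytes that Excel opens as UTF-8 (illustrative)."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(findings)
    # The "utf-8-sig" codec prepends the BOM that Excel uses to detect UTF-8.
    return buffer.getvalue().encode("utf-8-sig")
```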
Bug fixes
- Session restore: layout and compute pages no longer crash. After restoring a session saved before the layout or compute pages were visited, _load_constraints() and _load_compute_config() now use or-fallback defaults so that falsy restored values (0, 0.0, "") correctly resolve to page defaults (4 TB DS capacity, "R760" preset) instead of causing ValueError: Invalid value in ui.select().
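The or-fallback pattern described in this fix can be sketched as follows. The function name, dictionary keys, and constants are hypothetical; only the default values (4 TB, "R760") come from the entry above.

```python
DEFAULT_DS_CAPACITY_TB = 4
DEFAULT_PRESET = "R760"


def load_page_defaults(restored: dict[str, object]) -> dict[str, object]:
    """Resolve restored settings, treating any falsy value as missing (illustrative)."""
    return {
        # `or` falls through on None, 0, 0.0 and "" alike.
        "ds_capacity_tb": restored.get("ds_capacity_tb") or DEFAULT_DS_CAPACITY_TB,
        "preset": restored.get("preset") or DEFAULT_PRESET,
    }
```

The trade-off is that a legitimate zero can never be restored, which is acceptable when zero is not a valid choice for the control.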
Documentation
- ADR-066: Session persistence via self-contained zip archive
- ADR-067: SESSION_ZIP_SENTINEL to distinguish session archives from LiveOptics zips
- ADR-068: Remediation hints as hardcoded English strings
- ADR-069: Standalone concerns export as pure ReportLab
- PRD updated to v7.0 (§4.11 session persistence, §4.12 concerns enhancements, updated user journey, milestone history)
- Architecture updated: session persistence and concerns export modules
[v6.1.0] - 2026-02-23
Bug fixes
- vMSC per-site sizing always available: the "Hosts per site (vMSC)" section no longer requires a Datacenter column with 2+ distinct values. The split ratio is applied to total VMs; the Datacenter column is informational only. Engineers can now use vMSC mode on any file, including single-datacenter or datacenter-less RVTools exports.
- Type safety: fixed two mypy errors: unsafe int() cast on object in state.py, and union-attr on list | None iteration in layout_page.py.
- LLM config tests: test_llm_config_timeout_default and test_llm_config_max_concurrent_default now guard against shell-level LLM_TIMEOUT / LLM_MAX_CONCURRENT env vars overriding pydantic-settings defaults.
Documentation
- ADR-064: Datacenter/cluster scope filtering as a dedicated pipeline stage
- ADR-065: Windows Desktop OS fallback → VDI Linked Clone
- PRD updated to v6.0 (§4.2b scope filtering, classification table, §11 shipped requirements)
- Architecture updated: 5-stage pipeline, /scope in diagrams, session state scope helpers, rule count 43 → 50
[v6.0.0] - 2026-02-23
New features
- Datacenter & cluster filtering: new /scope page (between upload and review) lets engineers select which datacenters and clusters to include in the analysis. All downstream pages (review, report, compute, layout, concerns) use the filtered dataset. Unselected VMs are preserved in session state so re-scoping never requires a re-upload. Scope badge shown in review/report headers; DC/cluster suffix appended to exported PDF and Excel filenames.
- Improved workload classification: rule set updated from a 1,483-VM LiveOptics analysis that previously left 92 % of VMs in os_fallback:
  - Windows 10/11 desktop VMs now classify as VDI / Linked Clone instead of Virtual Machines (catches ~900 VMs in typical enterprise files)
  - New generic VDI rule (priority 224): VDI, DESKTOP, RDS, UAG, LOGINVSI, LOGINENTERPRISE
  - Containers: added TKG, HARBOR, and photon-*-kube regex for Tanzu node images
  - Email: added EXCHG abbreviation
  - File Content Servers: added SharePoint abbreviations SPBE, SPFE, SPOWA, SPOFFICE
  - Logging - Analytics: added LOGSTASH and KIBANA
Bug fixes
- AG Grid reliability: explicit :valueGetter on the vm_name column fixes silent field extraction failure after NiceGUI update_grid() cycles (AG Grid v34). A typeof guard on :localeText prevents a ReferenceError when the CDN hasn't loaded. Grid refresh now uses run_grid_method("setGridOption") instead of update() for reliable data refresh without destroy/recreate cycles.
- "New Analysis" button: clears session and navigates back to upload page without a full page reload.
- Payload size: _to_grid_rows() trims row data sent to AG Grid (~35 % smaller JSON on large files).
- Type safety: fixed two mypy errors in state.py (unsafe int() cast on object) and layout_page.py (union-attr on list | None iteration).
[v5.0.0] - 2026-02-23
New features
- Per-cluster compute breakdown: the /compute page now shows a breakdown table grouping host recommendations by cluster name when the RVTools file contains a Cluster column. A grand total row sums all clusters. Health check findings that apply per-cluster (HW version spread, HA ratio) display the cluster name alongside the finding on /concerns.
- Health findings in exports: PDF report now includes a findings summary table (Critical / Warning / Info counts) on the main sizing page, and a dedicated findings detail appendix listing every finding sorted critical-first. Excel export includes a new "Findings" worksheet with columns: Finding, Severity, Category, Affected VMs, Detail, Cluster.
- Configurable vMSC site split ratio: in vMSC (stretched cluster) mode, engineers can set any VM split percentage between sites (e.g. 60/40) instead of the fixed 50/50. The /compute settings panel exposes a 1–99% input visible only when vMSC is enabled. Site A and Site B host counts display as distinct labeled rows in the results card.
- Configurable A/P DR active percentage: in Active/Passive DR mode, engineers can configure what percentage of VMs are active on the primary site (1–100%, default 100%). Secondary site is sized at 50% of the computed primary (cold standby convention).
- PRD v5.0: Product Requirements Document updated to reflect all v5.0 features, personas, and non-functional requirements.
[v4.0.1] - 2026-02-22
Bug fixes
- Fix all event handlers on /compute: replaced .on("update:model-value") with .on_value_change(). Every control (preset selector, overcommit ratio, vMSC toggle, A/P toggle, spec inputs) was silently broken due to GenericEventArguments having no .value attribute; ValueChangeEventArguments does.
- Ruff TC003: moved from pathlib import Path into a TYPE_CHECKING block in compute_sizing.py (annotation-only use, safe with from __future__ import annotations).
UX improvements
- vCPU / RAM breakdown in N+1 card — displays both sub-counts (e.g. "vCPU-based: 11 · RAM-based: 20") so users can see exactly which constraint binds and how the other count moves when adjusting the overcommit ratio.
- Host spec inputs always visible — cores/socket, sockets, and RAM are no longer hidden behind a "Custom" preset selection; all three inputs appear at all times.
- Preset auto-populate — selecting a named preset fills the spec fields from its base config; selecting "Custom" leaves the current field values untouched.
- One preset per server model — dropdown simplified from 16 spec-laden variants (e.g. "R760 (2x28c / 512 GiB)") to 7 clean model names: R760, R770, R860, R960, R7725, XE7745, Custom.
- Remove duplicate heading — "Configuration de l'hôte" was rendering twice in the settings panel; the redundant card label was removed.
[v4.0.0] - 2026-02-22
Grid UX improvements, per-VM hardware data, and a new health check concerns page.
Grid UX & VM Data (Phase 20)
- Quick-filter search box above the VM review grid: filters all visible columns instantly on each keystroke via AG Grid quickFilterText
- Column visibility panel: collapsible expansion above the grid with four checkboxes (vCPUs, RAM, Avg IOPS, Peak IOPS) toggling column visibility via setColumnsVisible; replaces AG Grid sidebar (Enterprise-only, unavailable in Community edition)
- Hidden column definitions added to the VM grid: num_cpus, memory_mib, avg_iops, peak_iops (hidden by default, revealed on demand)
- Stable row identity: AG Grid getRowId switched from vm_name to String(params.data.row_index), fixing row corruption for customer files with duplicate VM names (linked clones, template copies)
- row_index added to CANONICAL_COLUMNS and assigned as a contiguous integer in ingest_file() after template filtering
- Cell-change and bulk-update handlers updated to match rows by row_index (int) instead of vm_name string comparison
Health Check & Concerns Page (Phase 21)
- New /concerns page: surfaces data quality flags, sizing risks, and VMware best practice violations derived from the current session without re-classifying
- 11 health checks across three categories:
  - Data Quality: missing OS, zero provisioned storage, missing vCPU/RAM, high powered-off VM ratio (>30%)
  - Sizing Risks: high Unknown VM ratio (>25%), large Unknown VMs (>1 TiB), single VM exceeding 100K IOPS/datastore budget
  - VMware Best Practices: no cluster assignment, old HW version (<vHW17 / ESXi 7.0), very old HW version (<vHW14 / ESXi 6.7, Critical), VMware Tools not installed (Critical), VMware Tools not running
- Findings colour-coded by severity: Critical=red, Warning=yellow, Info=blue
- Powered-off VMs and templates excluded from best-practice checks
- hw_version=0 sentinel guard: LiveOptics exports skip hardware-version checks rather than falsely flagging every VM as old hardware
- hw_version and tools_status added to CANONICAL_COLUMNS; RVTools parser reads them with graceful fallback (0 / "") when column absent
- LiveOptics parser sets sentinel values hw_version=0, tools_status=""
- Page uses load_session_data(): user edits from the Review grid are preserved; HealthCheckResult is never cached in session storage
Compute Sizing Module & Page (Phase 22)
- New /compute page: reactive ESXi host count recommendations from the uploaded session data, with no re-ingestion; uses load_session_data() only
- N+1 HA sizing: recommended host count = max(hosts_by_vcpu, hosts_by_ram) + 1 with configurable vCPU overcommit ratio (0.5–20.0, default 4.0)
- vMSC (stretch cluster) mode: toggle reveals per-datacenter host counts; shows a warning card when no datacenter column data is available in the export
- Active/Passive DR mode: toggle reveals primary site hosts and secondary site = ceil(primary / 2) (minimum 1)
- 17 Dell PowerEdge presets loaded from compute_presets.csv (editable without code changes), covering:
  - R760 (Xeon 5th Gen: 28c, 32c, 48c variants)
  - R770 (Xeon 6 P-core: 6748P 48c, 6780P 64c, 6786P 86c)
  - R860/R960 (Xeon 5th Gen 4-socket: up to 56c/6 TiB)
  - R7725 (EPYC 9005 Turin: 9555 64c, 9655 96c, 9755 128c, 9955 192c Zen5c)
  - XE7745 AI server (EPYC 9005 Turin: 64c, 96c)
  - Custom (user-defined cores/socket, sockets, RAM)
- Preset selector, overcommit input, and mode toggles are session-scoped (app.storage.tab); result cards refresh reactively on every change
- Aggregate cards: active vCPU total, RAM total (GiB), excluded VM count
- HostConfig, ComputeSizingResult frozen dataclasses; zero UI imports in pipeline/compute_sizing.py
- load_presets(path) public function for loading alternate CSV files
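The two formulas above, max(hosts_by_vcpu, hosts_by_ram) + 1 and ceil(primary / 2) with a minimum of 1, can be worked through in a short sketch. How hosts_by_vcpu and hosts_by_ram are derived is not stated in this changelog; the ceiling divisions below are an assumption, and all function and parameter names are hypothetical.

```python
import math


def hosts_n_plus_1(
    total_vcpus: int,
    total_ram_gib: float,
    cores_per_host: int,
    ram_gib_per_host: float,
    overcommit: float = 4.0,
) -> tuple[int, int, int]:
    """Return (hosts_by_vcpu, hosts_by_ram, recommended) for N+1 sizing."""
    # Assumed derivation: each physical core serves `overcommit` vCPUs.
    hosts_by_vcpu = math.ceil(total_vcpus / (cores_per_host * overcommit))
    hosts_by_ram = math.ceil(total_ram_gib / ram_gib_per_host)
    # The binding constraint decides, plus one spare host for HA.
    return hosts_by_vcpu, hosts_by_ram, max(hosts_by_vcpu, hosts_by_ram) + 1


def secondary_site_hosts(primary: int) -> int:
    """Cold-standby secondary site: half the primary, never fewer than one host."""
    return max(1, math.ceil(primary / 2))
```

For example, 2,240 vCPUs and 10,000 GiB of RAM on 56-core, 512 GiB hosts at 4.0 overcommit gives 10 hosts by vCPU and 20 by RAM, so RAM binds and the recommendation is 21 hosts.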
LLM Classifier Enhancement
- vm_description field (RVTools Annotation / LiveOptics Description) now included in LLM classifier prompts as an optional classification signal
- Description truncated to 200 chars, newlines stripped; only included when non-empty to keep token usage lean
Tests
- 49 new health check tests covering all 11 check IDs, sentinel guards, powered-off/template exclusion, and affected_vms tuple contract
- 386 total tests passing
[v3.2.0] - 2026-02-22
Annotation-based VM classification for healthcare and application workloads.
Classifier
- Fix two-pass classification logic: OS-fallback rules (priority ≥ 900) are now skipped in pass 1 when an annotation (vm_description) is present, allowing pass 2 to match richer annotation content before falling back to OS heuristics
- Expand HealthCare/EMR-EHR rule with 25+ application keywords:
  - Radiology & imaging: PACS, INTELLISPACE, GLEAMER, AZMED, RAYVOLVE, TRAUMACAD
  - Hospital IS (French/Swiss & European ecosystem): OPALE, CARIATIDE, HANDYLIFE, POLYPOINT, MEDIDATA, DATABICS, PROCAMED, SEDIA, DGLAB, STERIGEST, WINSCRIBE, SYNLAB, EXOLIS, SCENARA, MIRTH, KODIP
  - Regex anchors: \bRIS\b (Radiology IS), \bSIEMS\b, \bHESTIA\b, Bloc-?Op
- Add TOMCAT, FORTIWEB to Web Servers rule
- Add PRTG to Logging/Analytics rule
- Add APP VOLUMES / APPVOL to VDI Profiles rule
- Add ALFRESCO to File Content Servers rule
- Add FILEMAKER, CLARIS, SQLITE to MySQL/NoSQL rule
- Word-boundary guards: SIEMS (avoids SIEMENS), HESTIA (avoids HestiaCP)
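To show what the regex anchors buy, the sketch below compiles the four patterns listed above and tests them against made-up VM names. Case-insensitive matching and the matches_healthcare helper are assumptions; the classifier's actual matching flags are not stated here.

```python
import re

# Patterns taken from the entry above; IGNORECASE is an assumption.
PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"\bRIS\b", r"\bSIEMS\b", r"\bHESTIA\b", r"Bloc-?Op")
]


def matches_healthcare(text: str) -> bool:
    """Return True if any anchored healthcare pattern matches (illustrative)."""
    return any(pattern.search(text) for pattern in PATTERNS)
```

With the \b anchors, "srv-ris-01" matches while "PARIS-WEB01" and "HestiaCP panel" do not. Note that an underscore counts as a word character, so \bRIS\b would not match "RIS_PROD".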
[v3.0.0] - 2026-02-21
Datastore layout recommendations for PowerStore sizing.
Layout Engine
- Three layout strategies: Consolidation (BFD bin-packing), Performance (mission-critical isolation + tier BFD), Uniform (LPT equal distribution)
- Multi-dimensional BFD algorithm respecting capacity, IOPS budget, and VM count constraints per datastore
- Default 4 TiB datastores, 25 VMs/DS, 100K IOPS/DS (all tunable via PlacementConstraints)
- Oversized VMs (>usable capacity) automatically placed in dedicated datastores
- generate_all_proposals() public API returning all 3 strategy proposals
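A reduced sketch of best-fit decreasing (BFD) packing, limited to two of the three dimensions named above (capacity and VM count; the IOPS budget is omitted for brevity). Datastore, pack_bfd, and the GiB units are hypothetical and do not reflect the layout engine's real types.

```python
from dataclasses import dataclass, field


@dataclass
class Datastore:
    capacity_gib: float
    max_vms: int
    vms: list[tuple[str, float]] = field(default_factory=list)

    @property
    def used_gib(self) -> float:
        return sum(size for _, size in self.vms)

    def fits(self, size_gib: float) -> bool:
        return len(self.vms) < self.max_vms and self.used_gib + size_gib <= self.capacity_gib


def pack_bfd(vms: dict[str, float], capacity_gib: float = 4096.0, max_vms: int = 25) -> list[Datastore]:
    """Place VMs largest-first into the datastore that leaves the least free space."""
    datastores: list[Datastore] = []
    for name, size in sorted(vms.items(), key=lambda item: item[1], reverse=True):
        candidates = [ds for ds in datastores if ds.fits(size)]
        if candidates:
            # Best fit: the tightest remaining gap after placement.
            target = min(candidates, key=lambda ds: ds.capacity_gib - ds.used_gib - size)
        else:
            # Nothing fits (or the VM is oversized): open a new datastore.
            target = Datastore(capacity_gib, max_vms)
            datastores.append(target)
        target.vms.append((name, size))
    return datastores
```

An oversized VM ends up alone in this sketch because its datastore is already over capacity and so rejects every later candidate.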
Default IOPS Estimates
- Workload-based IOPS estimates for RVTools imports (no LiveOptics performance data)
- 8 workload categories: Database/SQL (500), Oracle (800), SAP HANA (1000), VDI (30-50), generic VMs (50), File (100)
- Configurable via src/store_predict/data/IOPS.csv (semicolon-delimited, same pattern as DRR.csv)
- Hardcoded fallback when CSV is missing; tests remain independent
Documentation
- ADR-059: Workload-based IOPS defaults for RVTools sizing
- Research page: Default IOPS domain knowledge (sources, conservative bias, peak vs average)
- Architecture docs updated with layout engine as 4th pipeline stage
Tests
- 46+ layout engine tests covering BFD packing, 3 strategies, metrics, IOPS injection, CSV loading
[v2.2.0] - 2026-02-21
Observability, developer experience, and project health improvements.
LLM Classification Improvements
- Live progress counter in UI notification during AI classification: "AI classification: 42 / 496 VMs"
- on_progress callback added to classify_unknown_vms_async for UI integration
- Ready-to-paste ClassificationRule(...) snippets now logged to server logs after LLM pass, allowing operators to promote LLM findings to deterministic rules without restarting
CI / GitHub
- GitHub Release v2.1.0 created (was missing — tag existed but Release page had not been generated)
- ci.yml: added permissions: contents: read (workflow security hardening)
- ci.yml: added codecov/codecov-action@v5 upload step with CODECOV_TOKEN
- ci.yml: added --cov-report=xml to generate Codecov-compatible report
- Coverage measurement scoped to testable backend code (UI layer omitted because NiceGUI pages require a live server)
- Effective coverage: 84% (up from misleading 51% that included untestable UI)
README
- Added badges: CI, Docs, Release, Codecov coverage, Python version, Version
- Fixed stale "29 classification rules" → 43 rules
Tests
246 tests passing (unchanged); ruff and mypy clean.
[v2.1.0] - 2026-02-20
Application-level DRR variants, DDVE support, and AI classification UI toggle.
DRR Reference Table (+14 entries, 28 → 42 total)
New subcategories covering application-layer encryption and compression scenarios where PowerStore's inline dedup/compression is partially or fully defeated:
- Database / Oracle - HCC (App Compressed) → DRR 2.5
- Database / Oracle - TDE (Encrypted) → DRR 1.5
- Database / Oracle - HCC + TDE → DRR 1.2
- Database / Microsoft SQL - Page Compressed → DRR 2.5
- Database / Microsoft SQL - TDE (Encrypted) → DRR 1.5
- Database / Microsoft SQL - Page Compressed + TDE → DRR 1.2
- Database / MongoDB - Encrypted → DRR 1.3
- Database / PostgreSQL - Encrypted → DRR 1.3
- Database / My SQL / NoSQL - Encrypted → DRR 1.3
- Containers / Kubernetes - Encrypted PVs → DRR 1.3
- VM Replication / Commvault → DRR 1.5
- VM Replication / Veeam - Compressed + Dedup → DRR 1.2
- VM Replication / Commvault - Compressed + Dedup → DRR 1.2
- VM Replication / Data Domain Virtual Edition (DDVE) → DRR 1.0 (already deduplicated; 1:1 at most)
Classifier (+14 rules, priorities 88–97 and 293–297)
Pattern matching for encrypted/compressed VM naming conventions. Combined scenarios (e.g. Oracle HCC + TDE) use regex lookaheads for AND matching. DDVE, Commvault, and compressed Veeam/Commvault variants also added.
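Regex lookaheads give order-independent AND matching, as used for the combined scenarios. The pattern and sample VM names below are illustrative assumptions, not the shipped rule.

```python
import re

# Each lookahead scans from the start without consuming input,
# so both tokens must be present, in either order.
HCC_AND_TDE = re.compile(r"^(?=.*HCC)(?=.*TDE)", re.IGNORECASE)


def is_hcc_and_tde(vm_name: str) -> bool:
    """Return True only when the name contains both HCC and TDE (illustrative)."""
    return HCC_AND_TDE.search(vm_name) is not None
```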
AI Classification UI Toggle
Per-session ui.switch on the upload page to disable LLM classification without
server restart. Greyed out with hint when LLM_ENABLED=false. State persisted in
app.storage.tab["llm_ui_enabled"].
Documentation
- ADR-052: Flat DRR override for non-PowerStore storage models
- ADR-053: Application-level DRR degradation as CSV subcategory variants
- ADR-054: AI classification toggle is per-session, not a server restart
- Research phase 14: application-level data reduction findings with source references
- architecture.md updated: storage model section, DRR/rule counts, session state
Tests
246 tests passing (up from 230); ruff and mypy clean.
[v2.0.0] - 2026-02-20
Multi-platform storage model selection — breaking UX change: DRR values now depend on the selected target storage platform, not only on workload type.
Target Storage Model Selector
- New StorageModel enum in config.py: POWERSTORE (full dedup+compression, per-workload DRR), POWERFLEX (compression only, flat 2.0), POWERVAULT (no reduction, flat 1.0)
- apply_storage_model() added to services/drr_table.py: overwrites per-VM DRR in session based on selected platform
- get_storage_model() / set_storage_model() added to ui/state.py for tab-scoped session persistence
- Review page now shows a ui.toggle selector (PowerStore / PowerFlex / PowerVault) above the summary stats; switching instantly recalculates all DRR values, refreshes the grid and stats
- Model is applied at page load so navigating back from the report preserves the selection
- Report page picks up overridden DRR values automatically; no changes required
- 6 i18n keys added (storage_model.label, .powerstore, .powerflex, .powervault) in both en.yaml and fr.yaml
- 3 new tests for apply_storage_model() (PowerVault→1.0, PowerFlex→2.0, PowerStore→table values); 230 tests passing, ruff and mypy clean
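The platform-dependent DRR behaviour can be sketched as below. The enum member names and flat values come from the entries above; the enum string values, the FLAT_DRR table, and the pure-function signature (a dict in, a new dict out) are assumptions, since the real apply_storage_model() is described as overwriting values in session.

```python
from enum import Enum


class StorageModel(Enum):
    POWERSTORE = "powerstore"  # per-workload DRR from the reference table
    POWERFLEX = "powerflex"    # compression only, flat 2.0
    POWERVAULT = "powervault"  # no reduction, flat 1.0


FLAT_DRR = {StorageModel.POWERFLEX: 2.0, StorageModel.POWERVAULT: 1.0}


def apply_storage_model(table_drr: dict[str, float], model: StorageModel) -> dict[str, float]:
    """Return per-VM DRR values for the selected platform (illustrative signature)."""
    flat = FLAT_DRR.get(model)
    if flat is None:
        # PowerStore keeps the per-workload table values.
        return dict(table_drr)
    return {vm_name: flat for vm_name in table_drr}
```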
[v1.1] - 2026-02-20
i18n, Branding & Intelligence milestone.
Phase 13: Graphics (COMPLETE)
- src/store_predict/services/charts.py: four ECharts option-dict builders (echart_sankey_options, echart_pie_options, echart_drr_bar_options, echart_before_after_options); all use the Dell blue #007DB8 palette; Sankey falls back to grouped bar when fewer than 2 workload groups
- src/store_predict/services/pdf_charts.py: four ReportLab/matplotlib builders (make_sankey_image_flowable with lazy matplotlib import and Spacer guard for empty data, make_pie_drawing, make_drr_bar_drawing, make_before_after_bar_drawing)
- report.py: _build_charts_section() added: Sankey full-width, pie + DRR bar in two-column grid, before/after bar full-width; only rendered when workload groups exist
- pdf_report.py: second PDF page added via PageBreak() + chart flowables; on_later_pages callback ensures Dell branded header on page 2
- matplotlib>=3.8 confirmed in runtime dependencies; mypy overrides for matplotlib.* added
- 6 i18n keys added across en.yaml and fr.yaml (pdf.charts_heading, pdf.sankey_title, pdf.pie_title, pdf.drr_bar_title, pdf.before_after_title, report.charts_heading)
- 227 tests passing, ruff and mypy clean
Phase 12: UX Polish (COMPLETE)
- Upload page refactored with spinner, linear progress bar, and run.io_bound pipeline offloading for a responsive event loop during 2-10 second processing
- Persistent LLM ui.notification (spinner=True, timeout=None) updated in-place to positive/negative outcome instead of fire-and-forget notify
- Review and report pages upgraded from plain links to card-with-CTA empty states (icon + label + button)
- PDF and Excel download buttons now disable during generation and re-enable via try/finally guard
- Company logo upload error message replaced with t("error.logo_upload_failed") i18n key
- Added 8 new i18n keys across en.yaml and fr.yaml (upload.processing, llm.error, error.unexpected, error.logo_upload_failed)
- Raw exception strings replaced with i18n messages across all user-facing error paths
- All ui.notify() type values audited to canonical NiceGUI types (positive/negative/warning/info)
- 20-test suite in test_ux_polish.py locking in UX patterns; full suite grows to 227 passed, 1 skipped
Phase 11: LLM Classification Fallback (COMPLETE)
- LLMConfig pydantic-settings class reads 6 env vars (LLM_ENABLED/MODEL/API_KEY/API_BASE/TIMEOUT/MAX_CONCURRENT) with SecretStr masking for the API key
- classify_unknown_vms_async async function filters only "default" confidence VMs, runs bounded concurrency via asyncio.Semaphore, and logs only counts (never VM names)
- classify_single_vm applies input sanitization against prompt injection (truncate vm_name/os_name, strip newlines), asyncio timeout, and circuit breaker (3 failures -> 60s cooldown)
- LLM fallback wired into upload pipeline behind llm_cfg.enabled guard; feature is opt-in via LLM_ENABLED=true env var, never active in CI
- User notifications: persistent spinner before LLM pass, count notification after
- docker-compose.yml updated with env_file (required: false) and LLM_* env var stubs pointing to OpenRouter/Mistral defaults
- .env.example added and tracked in git as operator onboarding guide
- 7-test suite for config and classifier; pydantic-settings added to runtime dependencies
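A minimal sketch of the circuit breaker behaviour noted above (3 failures, then a 60 s cooldown). The class, its method names, and the injectable clock are hypothetical; the classifier's own breaker may reset or count failures differently.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Blocks calls for a cooldown period after repeated failures (illustrative)."""

    def __init__(
        self,
        max_failures: int = 3,
        cooldown_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_failures = max_failures
        self._cooldown_s = cooldown_s
        self._clock = clock  # injectable so tests need not sleep
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self._cooldown_s:
            # Cooldown elapsed: close the breaker and start counting afresh.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self._max_failures:
            self._opened_at = self._clock()

    def record_success(self) -> None:
        self._failures = 0
```

A caller would check allow() before each LLM request and skip the request, leaving the VM unclassified, while the breaker is open.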
Phase 10: PDF Branding (COMPLETE)
- Dell partner logo PNG bundled as package data and loaded at import time (Docker-safe path resolution)
- _preprocess_logo() normalizes any image mode (RGBA/RGB/P/JPEG) to RGBA PNG before ReportLab embedding, preventing black-background palette images
- validate_logo() validates PNG/JPEG by extension, magic bytes, file size, and image dimensions, raising IngestionError for user-facing messages
- generate_report_pdf() extended with backwards-compatible dell_logo_bytes and company_logo_bytes kwargs
- Company logo upload UI: ui.upload card on report page accepting .png/.jpg/.jpeg up to 200 KB with remove button
- Logo stored as base64 in app.storage.tab (tab-scoped session isolation) and decoded on PDF download
- _on_download() passes decoded company_logo_bytes to generate_report_pdf(), embedding customer logo in PDF header
- Pillow added to runtime dependencies; 16 branding tests + 11 logo UI wiring tests; 200 total tests
Phase 9: Excel Export (COMPLETE)
- generate_report_xlsx(summary, project_name, locale) -> bytes pure function mirroring the PDF service shape (same locale param, same BytesIO pattern)
- Three styled sheets: Summary (label-value metrics), Workload Breakdown (category subtotals + totals row), VM Detail (per-VM row with optional performance columns)
- Brand blue (#1e3a5f) header row with white bold text, freeze panes at row 1, autofit columns on all sheets
- Alternate row colouring on body rows; performance columns/rows gated on has_performance_data flag
- 18 new excel.* i18n keys in both en.yaml and fr.yaml; EN and FR outputs verified to differ in bytes
- Green "Download Excel Report" button wired on report page between PDF and Back buttons
- _on_download_excel handler mirrors _on_download: assert summary type, generate bytes, sanitize filename, ui.download
- XlsxWriter mypy override added; 8-test suite validating magic bytes, locale switching, performance guard, and sheet count
Phase 8.1: LiveOptics ZIP Extraction (COMPLETE)
- ZIP accepted as a fourth upload format alongside .xlsx and .csv
- extract_liveoptics_from_zip(content: bytes) -> tuple[bytes, str] module finds the LiveOptics xlsx member by case-insensitive regex pattern
- Zip bomb guard rejects archives whose total uncompressed bytes exceed 100 MB (central directory header check, no extraction needed)
- ZIP extraction runs before validate_upload() so extracted xlsx bytes go through existing validation logic unchanged
- validation.py extended to accept "zip" extension and PK magic bytes; upload accept prop updated to .xlsx,.csv,.zip
- 7-test suite with real in-memory zipfile objects covering happy path, pattern mismatch, no match, invalid zip, multiple members, and bomb guard
- Zero regressions; 165 tests passing after addition
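As an illustration of the extraction and bomb guard described above, here is a minimal sketch built on the standard zipfile module. The function name, the ValueError usage, and the regex are assumptions (the regex is only a guess based on the LiveOptics_*_VMWARE_*.xlsx naming mentioned elsewhere in this changelog); the shipped module may differ.

```python
import io
import re
import zipfile

MAX_UNCOMPRESSED_BYTES = 100 * 1024 * 1024  # 100 MB limit from the entry above
# Assumed pattern; the real regex is not shown in this changelog.
LIVEOPTICS_RE = re.compile(r"LiveOptics_.*_VMWARE_.*\.xlsx$", re.IGNORECASE)


def extract_xlsx_from_zip(content: bytes) -> tuple[bytes, str]:
    """Return (xlsx_bytes, member_name) for the first LiveOptics-like member."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(content))
    except zipfile.BadZipFile as exc:
        raise ValueError("Not a valid ZIP archive") from exc
    with archive:
        infos = archive.infolist()
        # file_size is read from the central directory, so nothing is
        # decompressed before the size check.
        if sum(info.file_size for info in infos) > MAX_UNCOMPRESSED_BYTES:
            raise ValueError("Archive too large when uncompressed")
        for info in infos:
            if LIVEOPTICS_RE.search(info.filename):
                return archive.read(info), info.filename
    raise ValueError("No LiveOptics .xlsx found in archive")
```

Because the size check uses metadata only, an oversized archive is rejected before any member is read into memory.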
Phase 8: i18n Foundation (COMPLETE)
- t() translation helper backed by python-i18n YAML files with %{variable_name} placeholder syntax
- Tab-scoped get_locale()/set_locale() session helpers safe outside NiceGUI context (catches RuntimeError for pytest)
- English and French YAML locale files with 73 strings across 8 namespaces (layout, upload, review, report, stats, dialog, columns, pdf)
- add_locale_toggle() FR/EN toggle button triggering full page reload (required because ui.header cannot be in @ui.refreshable)
- French is the default locale per project convention; toggle label shows the switch-target language
- All 65 UI-layer strings in 8 files wrapped in t() calls; no hardcoded labels remain
- AG Grid configured with French CDN locale pack (ag-grid-community/locale@32.2.2) and :localeText JS binding, injected only when locale is 'fr'
- PDF localized: generate_report_pdf() accepts locale param; _i18n.set('locale', locale) called once before all t() calls
- 13-test i18n unit suite covering EN/FR lookup, placeholder substitution, get_locale() safety, and PDF locale correctness
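To show how a t() helper with %{variable_name} placeholders behaves, the following sketch reimplements the idea with plain dictionaries instead of python-i18n and YAML files. The catalogue keys and strings are invented for illustration, and the fallback behaviour (returning the key itself, leaving unknown placeholders untouched) is an assumption.

```python
import re

# Hypothetical, trimmed-down catalogues; the real strings live in YAML files.
CATALOG = {
    "en": {"upload.title": "Upload file", "stats.vm_count": "%{count} VMs"},
    "fr": {"upload.title": "Importer un fichier", "stats.vm_count": "%{count} VM"},
}
DEFAULT_LOCALE = "fr"  # French is the default locale per the entry above
_PLACEHOLDER = re.compile(r"%\{(\w+)\}")


def t(key: str, locale: str = DEFAULT_LOCALE, **values: object) -> str:
    """Look up key for locale and fill %{name} placeholders from values."""
    # Assumed fallback: an unknown key or locale yields the key itself.
    template = CATALOG.get(locale, {}).get(key, key)
    # Placeholders without a matching value are left in place.
    return _PLACEHOLDER.sub(
        lambda m: str(values.get(m.group(1), m.group(0))), template
    )
```

For example, t("stats.vm_count", "en", count=3) yields "3 VMs", while t("upload.title") falls back to the French default.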
[v1.0] - 2026-02-19
MVP Sizing Tool milestone.
Phase 7: UI Bug Fixes & Report Enhancements (COMPLETE)
- Fixed AG Grid "No Rows To Show" (NiceGUI requires : prefix for JS function properties)
- Fixed NaN serialization chain: NaN → None (not empty string) for JSON compatibility
- NiceGUI auto-reload with __mp_main__ guard for multiprocessing
- LiveOptics performance columns: Peak IOPS, 8K Eq. IOPS, Peak MB/s (conditional on data)
- 8K IOPS normalization fix: throughput_KB/s / 8 (was double-counting with avg_iops)
- Editable DRR column for custom overrides (min 0.1)
- Bulk workload update: select multiple VMs via checkboxes, mass-assign workload category
- Workload dropdown popup (cellEditorPopup: True) for readable category labels
- Filtered select-all: header checkbox selects only visible (filtered) rows
- CPU/memory metrics: num_cpus and memory_mib in parsers, calculation, report, and PDF
- Report reorganized into Totals and Averages sections (web + PDF)
- Replaced misleading "Total Peak IOPS" with "Hottest VM Peak IOPS" (single VM max)
- WorkloadDialog fixed to accept plain strings (not dicts) for NiceGUI ui.select
- 145 tests passing, 1 skipped
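Two of the fixes above are easy to show in isolation: converting NaN to None before JSON serialization, and deriving 8K-equivalent IOPS from throughput alone. This is a sketch under assumed names (nan_to_none, clean_rows, iops_8k_equivalent) and an assumed list-of-dicts row shape, not the project's code.

```python
import json
import math


def nan_to_none(value):
    """Replace float NaN with None so json.dumps emits null instead of NaN."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


def clean_rows(rows: list[dict]) -> list[dict]:
    """Apply nan_to_none to every cell of every row."""
    return [{key: nan_to_none(val) for key, val in row.items()} for row in rows]


def iops_8k_equivalent(throughput_kb_s: float) -> float:
    """8K-equivalent IOPS from throughput only: KB/s divided by 8 KB per I/O."""
    return throughput_kb_s / 8


rows = clean_rows([{"name": "vm1", "peak_iops": float("nan")}])
payload = json.dumps(rows)  # NaN has become null, which is valid JSON
```

None is used rather than an empty string so that numeric columns stay numeric or null on the client side.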
Phase 6: Polish, Docs & Deployment (COMPLETE)
- Docker hardening: .dockerignore, HEALTHCHECK directive, env-var STORAGE_SECRET
- Server-side file upload validation with magic-byte checks (XLSX zip header, CSV UTF-8)
- Logging configuration with sanitization guidance (never log DataFrame contents)
- Session isolation verification via app.storage.tab (tab-scoped)
- Performance benchmark tests: 5000 VM classification < 10s, PDF generation < 5s
- MkDocs documentation: architecture page with 3 Mermaid diagrams, getting-started guide
- Project README with Docker and local dev quickstart
- GitHub Actions CI: ruff check, ruff format, mypy, pytest on push/PR to main
- GitHub Actions docs: MkDocs deployment to GitHub Pages on push to main
- 15 new tests (validation + log sanitization + performance), 121 total tests passing
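The magic-byte checks mentioned above might look roughly like the sketch below. The helper names are hypothetical, and accepting a UTF-8 byte order mark for CSV is an assumption rather than documented behaviour.

```python
ZIP_MAGIC = b"PK\x03\x04"  # ZIP local file header; .xlsx files are ZIP containers


def looks_like_xlsx(content: bytes) -> bool:
    """True when the payload starts with the ZIP local file header."""
    return content.startswith(ZIP_MAGIC)


def looks_like_csv(content: bytes) -> bool:
    """True when the payload decodes as UTF-8 (with or without a BOM)."""
    try:
        content.decode("utf-8-sig")
    except UnicodeDecodeError:
        return False
    return True
```

These checks only confirm the container type; the parsers still have to verify sheet names and column headers.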
Phase 5: Calculation & PDF Report (COMPLETE)
- Calculation service with per-VM required capacity (provisioned_mib / drr)
- Workload grouping with subtotals (VM count, provisioned, in-use, required per category)
- Weighted average DRR (total_provisioned / total_required, not simple average)
- Division-by-zero guard: max(drr, 0.1) prevents invalid calculations
- Missing field defaults via .get() for robustness with incomplete data
- PDF report generator using ReportLab Platypus with branded one-page layout
- Dark blue header bar with StorePredict branding
- Workload breakdown table in PDF (Category, VMs, Provisioned, Avg DRR, Required)
- Vera/VeraBd TTF fonts for French character support (accents, special chars)
- Storage formatting helper: MiB to GiB with TiB display for large values
- Report page at /report with summary cards and workload breakdown table
- PDF download button triggering browser download
- Navigation wiring: Review → Report button, Report link in nav bar
- 24 new tests (12 calculation + 12 PDF), 106 total tests passing
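The calculation rules above (per-VM required capacity, the max(drr, 0.1) guard, and the weighted average DRR) can be sketched as follows. The dict-based VM shape, the default values passed to .get(), and the return value for an empty input are assumptions.

```python
MIN_DRR = 0.1  # floor used as the division-by-zero guard


def required_mib(vm: dict) -> float:
    """Per-VM required capacity: provisioned_mib / drr, with the DRR floored."""
    provisioned = vm.get("provisioned_mib", 0.0)  # assumed default
    drr = max(vm.get("drr", 1.0), MIN_DRR)        # assumed default of 1.0
    return provisioned / drr


def weighted_average_drr(vms: list[dict]) -> float:
    """total_provisioned / total_required, so large VMs weigh more."""
    total_provisioned = sum(vm.get("provisioned_mib", 0.0) for vm in vms)
    total_required = sum(required_mib(vm) for vm in vms)
    if total_required == 0:
        return 1.0  # assumed neutral value when there is nothing to size
    return total_provisioned / total_required


vms = [
    {"provisioned_mib": 1000, "drr": 2.0},  # needs 500 MiB
    {"provisioned_mib": 3000, "drr": 4.0},  # needs 750 MiB
]
average = weighted_average_drr(vms)  # 4000 / 1250
```

With these two VMs the weighted average is 4000 / 1250 = 3.2, whereas a simple average of the two ratios would give 3.0 and understate the effect of the larger VM.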
Phase 4: UI — Upload & Review Pages (COMPLETE)
- Session state module for per-tab DataFrame serialization (ui/state.py)
- Upload page with file dropzone, project name input, pipeline integration
- AG Grid VM table component with inline workload dropdown (ADR-007)
- Multi-select workload dialog for assigning multiple workload types (ADR-009)
- Summary statistics cards (Total VMs, Provisioned, Avg DRR, Effective Capacity)
- Review page wiring all components: table, dialog, stats, DRR recalculation
- Dark mode toggle with persistent user preference via app.storage.user (ADR-008)
- Navigation header with Home, Upload, and Review links
- Cell change handler: inline workload dropdown updates DRR and stats
- Row click handler: multi-select dialog applies conservative (lowest) DRR
- Per-tab session isolation for upload data, per-user storage for preferences (ADR-008)
Phase 3: Workload Classification Engine
- Classification engine with 29 priority-ordered rules covering all 28 DRR subcategories
- ClassificationRule dataclass with pattern matching on VM name and OS field
- RuleRegistry with first-match-wins evaluation and confidence tracking
- Substring matching (CADSRVSQL001 -> SQL) with false positive prevention
- OS-based fallback rules (Windows Server -> Virtual Machines)
- classify_dataframe() for bulk DataFrame classification
- 0% Unknown rate on 594 real LiveOptics VMs (target was <20%)
- 28 unit tests + 11 integration tests with real sample data
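A first-match-wins rule registry of the kind described above could be sketched like this. Only the class names ClassificationRule and RuleRegistry come from the changelog; the fields, the two sample rules, the confidence values, and the "Unknown" fallback are illustrative assumptions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassificationRule:
    category: str
    pattern: str             # regex searched case-insensitively
    field: str = "name"      # which VM field to inspect: "name" or "os"
    confidence: float = 1.0  # hypothetical confidence score

    def matches(self, vm: dict) -> bool:
        text = vm.get(self.field) or ""
        return re.search(self.pattern, text, re.IGNORECASE) is not None


class RuleRegistry:
    def __init__(self, rules: list[ClassificationRule]) -> None:
        self.rules = list(rules)  # list order is the priority order

    def classify(self, vm: dict) -> tuple[str, float]:
        """Return (category, confidence) from the first matching rule."""
        for rule in self.rules:
            if rule.matches(vm):
                return rule.category, rule.confidence
        return "Unknown", 0.0


registry = RuleRegistry([
    ClassificationRule("SQL", r"SQL", field="name", confidence=0.9),
    # OS-based fallback, evaluated only when no name rule matched.
    ClassificationRule("Virtual Machines", r"Windows Server", field="os", confidence=0.5),
])
```

Here a VM named CADSRVSQL001 is classified as SQL by substring match, and a Windows Server VM with an uninformative name falls through to the OS rule.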
Phase 2: File Ingestion Pipeline
- RVTools .xlsx parser (vInfo tab)
- LiveOptics .xlsx and .csv parsers (VMs tab)
- Format auto-detection based on sheet names and column headers
- Column alias resolution for name variations
- Template VM filtering
- Unified ingest_file() orchestrator
- 29 ingestion tests with real sample files
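Column alias resolution, as listed above, can be approximated with a small lookup table. The alias lists and canonical names below are hypothetical examples, not the parser's actual mapping.

```python
# Hypothetical alias table: canonical field -> accepted header spellings.
COLUMN_ALIASES = {
    "name": ("VM", "VM Name", "Name"),
    "provisioned_mib": ("Provisioned MiB", "Provisioned MB"),
}


def resolve_columns(headers: list[str]) -> dict[str, str]:
    """Map each canonical field to the header actually present in the file."""
    # Compare case-insensitively and ignore surrounding whitespace.
    normalized = {header.strip().lower(): header for header in headers}
    resolved: dict[str, str] = {}
    for canonical, aliases in COLUMN_ALIASES.items():
        for alias in aliases:
            actual = normalized.get(alias.lower())
            if actual is not None:
                resolved[canonical] = actual
                break  # first alias found wins
    return resolved
```

A canonical field missing from the result signals that the file lacks a required column, which the caller can turn into a validation error.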
Phase 1: Project Foundation & DRR Table
- Python project structure with src layout
- DRR table service loading 28 workload categories from CSV
- Data models (VM, FileFormat, WorkloadCategory)
- NiceGUI app skeleton with page routing
- ruff + mypy configuration
- pytest setup with 14 initial tests
- Dockerfile + docker-compose.yml