Technical Research

Research documents produced during project development. Each document captures domain analysis, technology decisions, and findings from investigating sample data.

Phase Research

| Phase | Topic | Key Findings |
|---|---|---|
| Phase 1 | Project Foundation | Python stack, NiceGUI, DRR.csv parsing quirks |
| Phase 2 | File Ingestion | RVTools "MB" = MiB, Template NaN pitfall, column aliases |
| Phase 3 | Workload Classification | 28 DRR categories, false positive patterns (ORA/SAP/EX), OS fallback strategy |
| Phase 4 | UI Upload & Review Pages | AG Grid table, session storage split, multi-workload dialog, dark mode |
| Phase 5 | Calculation & PDF Report | ReportLab Platypus, Vera fonts, weighted avg DRR, BytesIO PDF |
| Phase 6 | Polish, Docs & Deployment | Docker hardening, file validation, CI/CD, MkDocs, performance tests |
| Phase 7 | UI Enhancements | AG Grid advanced features, NiceGUI component patterns, session state |
| Phase 8 | i18n Foundation | python-i18n global locale, AG Grid FR locale CDN, ReportLab CID encoding |
| Phase 8.1 | LiveOptics ZIP | ZIP bomb guard via central directory sum, tuple return pattern |
| Phase 9 | Excel Export | XlsxWriter write-only, BytesIO seek(0), _i18n.t() vs t() wrapper |
| Phase 10 | PDF Branding | Pillow mode normalization, Docker-safe Path resolution, base64 decode guard |
| Phase 11 | LLM Classification | litellm.acompletion() async, pydantic-settings SecretStr, circuit breaker pattern |
| Phase 12 | UX Polish | run.io_bound for spinner rendering, ui.notification in-place update, button disable/enable |
| Phase 13 | Graphics & Charts | ui.echart zero-dep, matplotlib lazy import, Spacer guard for empty PDF image, onLaterPages for page 2 header |
| Phase 14 | Application-Level DRR Variants | App compression halves DRR (Oracle HCC → 2.5), encryption defeats dedup (TDE → 1.5, combined → 1.2), DDVE stores already-deduplicated data (DRR = 1.0) |
| Phase 15 | Default IOPS Estimates | Workload-based IOPS for RVTools imports, conservative peak values, CSV configurability |

Sample Data Analysis

  • RVTools sample: 24 VMs, 70 columns in vInfo tab
  • LiveOptics sample: 610 VMs, 38 columns in VMs tab
  • DRR reference: 42 valid entries (28 base + 14 encrypted/compressed variants), semicolon-delimited CSV with parsing quirks
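Reading the semicolon-delimited DRR reference could look roughly like the sketch below. The two-column layout (category name, then ratio), the header row, and the rule that rows without a parseable ratio are skipped are all assumptions made for illustration; the actual DRR.csv columns and quirks are not described here, and the sample rows are invented.

```python
import csv
import io


def load_drr_rows(text: str) -> dict[str, float]:
    """Parse semicolon-delimited text into {category: ratio}.

    Assumed (hypothetical) layout: header row, then `name;ratio` rows.
    Rows with a missing name or a non-numeric ratio are skipped.
    """
    rows: dict[str, float] = {}
    reader = csv.reader(io.StringIO(text), delimiter=";")
    next(reader, None)  # skip the assumed header row
    for row in reader:
        if len(row) < 2:
            continue  # blank or truncated line
        name = row[0].strip()
        try:
            ratio = float(row[1].strip())
        except ValueError:
            continue  # ratio column is empty or not a number
        if name:
            rows[name] = ratio
    return rows


# Invented sample rows, not taken from the real reference file.
sample = "Category;DRR\nSQL;4.0\n;\nBroken;n/a\nFiles;2.5\n"
parsed = load_drr_rows(sample)
```

Counting `len(parsed)` after loading the real file would be one way to confirm the expected number of valid entries.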

Key Technical Findings

RVTools "MB" values are MiB

Despite column headers saying "MB", RVTools reports base-2 (MiB) values. No unit conversion is needed between the RVTools and LiveOptics formats.
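To make the size of the difference concrete, the sketch below converts a MiB value both ways. The helper names are hypothetical and are not part of the project code.

```python
MIB = 1024 ** 2      # bytes in one mebibyte (base-2)
GB = 1000 ** 3       # bytes in one decimal gigabyte (base-10)


def mib_to_gib(value_mib: float) -> float:
    """Base-2 conversion: 1024 MiB make one GiB."""
    return value_mib / 1024


def mib_to_decimal_gb(value_mib: float) -> float:
    """Convert MiB to decimal GB via bytes."""
    return value_mib * MIB / GB


# A value labelled "1024 MB" that is really 1024 MiB:
print(mib_to_gib(1024))         # 1.0
print(mib_to_decimal_gb(1024))  # 1.073741824
```

Treating the values as decimal MB would understate capacity by roughly 4.9% at the MiB level, which is why the base-2 interpretation matters for sizing.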

VM Naming Conventions

Corporate VM names embed functional keywords: CADSRVSQL001 (SQL), CITADM-01 (Citrix), xxx-FAZ (FortiAnalyzer). Classification relies on substring matching against these patterns.

False Positive Patterns

  • "ORA" matches LORADB (LoRa radio protocol database) — use "ORACLE" instead
  • "SAP" matches GISAPP (GIS application server) — use word-boundary match
  • "EX" matches EXTRANET — use "EXCHANGE" instead
  • "ABAC" is Swiss Abacus ERP, not SAP ABAP
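The false positives above can be avoided with longer keywords and a boundary check, as in this minimal sketch. The rule table, the labels, and the choice of "not adjacent to another letter" as the boundary definition are assumptions for illustration; a plain regex `\b` would treat underscores and digits as word characters, which may or may not suit a given naming scheme.

```python
import re
from typing import Optional

# Hypothetical rule table: (label, compiled pattern), checked in order.
RULES = [
    ("Oracle", re.compile(r"ORACLE")),                # not "ORA": avoids LORADB
    ("SAP", re.compile(r"(?<![A-Z])SAP(?![A-Z])")),   # letter boundary: avoids GISAPP
    ("Exchange", re.compile(r"EXCHANGE")),            # not "EX": avoids EXTRANET
]


def classify(vm_name: str) -> Optional[str]:
    """Return the first matching label, or None if no rule matches."""
    name = vm_name.upper()
    for label, pattern in RULES:
        if pattern.search(name):
            return label
    return None


for vm in ("LORADB", "GISAPP", "EXTRANET", "PRD-SAP-01", "ORACLE-DB01"):
    print(vm, "->", classify(vm))
```

With the letter-boundary rule, a name such as `PRD-SAP-01` or `SAP01` still matches, while `GISAPP` does not because "SAP" sits between other letters.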

v1.1 Findings

python-i18n Global State

python-i18n stores locale as process-global state. In a multi-tab NiceGUI app, switching locale in one tab affects all tabs. Solution: always call i18n.set('locale', ...) at the start of each request handler and store the user's locale choice in app.storage.tab.
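A minimal sketch of that pattern follows. The helper name `apply_tab_locale` and the `"locale"` storage key are hypothetical. The storage mapping and the setter are passed in as parameters so the snippet runs without NiceGUI or python-i18n installed; in the app they would be `app.storage.tab` and `i18n.set`.

```python
from typing import Callable, MutableMapping


def apply_tab_locale(
    tab_storage: MutableMapping[str, str],
    set_option: Callable[[str, str], None],
    default: str = "en",
) -> str:
    """Re-apply this tab's stored locale to the process-global i18n state.

    Intended to be the first call in every handler, before any
    translated string is produced.
    """
    locale = tab_storage.get("locale", default)
    set_option("locale", locale)
    return locale


# Stand-ins for app.storage.tab and i18n.set, for demonstration only.
calls = []
fr_tab = {"locale": "fr"}
new_tab = {}
first = apply_tab_locale(fr_tab, lambda key, value: calls.append((key, value)))
second = apply_tab_locale(new_tab, lambda key, value: calls.append((key, value)))
```

Because the locale remains global, this only helps if translation happens right after the call; an `await` between setting the locale and rendering text could, in principle, let another tab's handler change it in between.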

run.io_bound for NiceGUI Spinner Rendering

NiceGUI's ui.spinner only renders when the event loop yields. Long synchronous operations (file parsing, PDF generation) must be wrapped in await run.io_bound(fn, *args) so the spinner actually appears during processing rather than only after completion.
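The effect can be illustrated without NiceGUI. In the sketch below, `asyncio.to_thread` stands in for `run.io_bound` (on the assumption that both hand the blocking call to a worker thread), and a counter task stands in for the spinner animation. All function names are hypothetical.

```python
import asyncio
import time


def parse_file_blocking(duration_s: float) -> str:
    """Stand-in for synchronous file parsing or PDF generation."""
    time.sleep(duration_s)
    return "parsed"


async def handle_upload(duration_s: float = 0.2) -> tuple[str, int]:
    ticks = 0

    async def spinner() -> None:
        # Stand-in for UI updates: only advances while the loop is free.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    spinner_task = asyncio.create_task(spinner())
    try:
        # Off-load the blocking work so the event loop keeps running.
        result = await asyncio.to_thread(parse_file_blocking, duration_s)
    finally:
        spinner_task.cancel()
    return result, ticks


result, ticks = asyncio.run(handle_upload())
print(result, ticks)
```

Calling `parse_file_blocking` directly inside the coroutine instead would block the loop for the whole duration, and the counter would not advance until the work had finished.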

litellm async Pattern

litellm.acompletion() is the async entry point for LLM calls. Wrap each call in asyncio.wait_for() with a timeout to enforce circuit-breaker behaviour. Declare API keys as SecretStr fields in the pydantic-settings model so they are masked in logs and repr output.
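One way to combine the timeout with a breaker is sketched below. The `LLMBreaker` class, its thresholds, and the fallback behaviour are assumptions, not the project's implementation. The LLM call is passed in as a coroutine factory so the snippet runs without litellm; in the app the factory would wrap `litellm.acompletion(model=..., messages=...)`.

```python
import asyncio
from typing import Any, Awaitable, Callable


class LLMBreaker:
    """Timeout wrapper that stops calling after repeated timeouts."""

    def __init__(self, timeout_s: float = 10.0, max_failures: int = 3) -> None:
        self.timeout_s = timeout_s
        self.max_failures = max_failures
        self.failures = 0  # consecutive timeouts seen so far

    @property
    def is_open(self) -> bool:
        return self.failures >= self.max_failures

    async def call(
        self, make_call: Callable[[], Awaitable[Any]], fallback: Any = None
    ) -> Any:
        if self.is_open:
            return fallback  # breaker open: skip the LLM entirely
        try:
            result = await asyncio.wait_for(make_call(), timeout=self.timeout_s)
        except asyncio.TimeoutError:
            self.failures += 1
            return fallback
        self.failures = 0  # a success closes the breaker again
        return result


async def fast_call() -> str:
    return "ok"


async def slow_call() -> str:
    await asyncio.sleep(1.0)
    return "late"


breaker = LLMBreaker(timeout_s=0.05, max_failures=2)
print(asyncio.run(breaker.call(fast_call)))                  # ok
print(asyncio.run(breaker.call(slow_call, fallback="rule"))) # rule
```

Returning a fallback rather than raising lets the caller drop back to rule-based classification when the LLM is slow or unavailable. This sketch never resets an open breaker on its own; a cool-down timer would be needed for that.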

ReportLab CID Font Encoding

ReportLab encodes text strings via CIDFont/FlateDecode, so they cannot be found as raw bytes in the PDF bytestream. Test locale-specific PDF content by asserting that the FR output bytes differ from the EN output bytes, rather than checking for the presence of a string.

ZIP Bomb Guard

For LiveOptics ZIP ingestion, sum all file.file_size entries from the ZIP central directory before extracting. Reject any archive where the total uncompressed size exceeds the configured limit (default 512 MB). This prevents decompression bombs without requiring full extraction.
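A minimal sketch of the guard using the standard-library `zipfile` module, where `ZipFile.infolist()` returns the central directory entries and each `ZipInfo.file_size` is the declared uncompressed size. The function name, the in-memory input, and the reading of "512 MB" as 512 × 1024 × 1024 bytes are assumptions.

```python
import io
import zipfile

# Assumed interpretation of the 512 MB default limit.
MAX_UNCOMPRESSED_BYTES = 512 * 1024 * 1024


def check_zip_size(data: bytes, limit: int = MAX_UNCOMPRESSED_BYTES) -> int:
    """Return the declared uncompressed total, or raise if it exceeds limit.

    Only the central directory is read; nothing is extracted.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        total = sum(info.file_size for info in archive.infolist())
    if total > limit:
        raise ValueError(
            f"archive expands to {total} bytes, limit is {limit} bytes"
        )
    return total


# Build a small archive in memory to exercise the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("vms.csv", b"x" * 1000)
    archive.writestr("hosts.csv", b"y" * 500)
payload = buffer.getvalue()

print(check_zip_size(payload))  # 1500
```

The sizes come from archive metadata, so a capped read during extraction is a reasonable second line of defence against archives that misreport them.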