ADR-055: Multi-dimensional BFD heuristic for datastore layout engine
Date: 2026-02-21 Status: Accepted
Context
StorePredict v3.0 adds a datastore layout recommendation engine. Given classified VMs with capacity, IOPS, and workload data, the engine must assign VMs to datastores while respecting three simultaneous constraints: capacity, IOPS budget, and VM count per datastore. This is a multi-dimensional bin packing problem (NP-hard).
Two families of approaches were considered:
| Approach | Pros | Cons |
|---|---|---|
| ILP solver (PuLP, OR-Tools) | Provably optimal | Heavy dependency (50+ MB), slow for >500 VMs, complex to maintain |
| BFD heuristic (Best Fit Decreasing) | Pure Python, fast (<2s for 1,000 VMs), within 10-15% of optimal | Not provably optimal |
Decision
Use multi-dimensional Best Fit Decreasing (BFD) with normalized scoring as the placement algorithm for all three strategies (Consolidation, Performance, Uniform).
The algorithm:
- Normalize each VM as
max(capacity_ratio, iops_ratio)where ratios are relative to per-datastore limits - Sort VMs descending by this score (largest/most-demanding first)
- For each VM, place in the datastore with the least remaining capacity that still satisfies all three constraints (capacity + IOPS + VM count)
- When no datastore fits, create a new one
For the Uniform strategy, a variant (LPT — Longest Processing Time) is used instead: pre-compute the number of datastores, then assign each VM to the least-loaded datastore for balanced utilization.
Consequences
- No new dependencies: Pure Python dataclasses + list operations. No numpy, PuLP, or OR-Tools. Keeps Docker image small and CI fast.
- Performance: O(n × m) where n = VMs and m = datastores. For 1,000 VMs and ~40 datastores, this is ~40,000 comparisons — trivial.
- Quality: BFD is well-studied and consistently produces solutions within 11/9 × OPT for single-dimension packing. Multi-dimensional quality is typically within 10-15% of ILP optimal for real-world VM distributions.
- Extensibility: Adding a fourth strategy (e.g., workload-segregated with
affinity rules) only requires a new function that calls the same
_bfd_place()core with different constraints. - Trade-off accepted: We sacrifice provably optimal packing for zero-dependency simplicity and sub-second execution. For pre-sales sizing (not production automation), 10-15% suboptimality is acceptable.