# Phase 11: LLM Classification Fallback - Research
**Researched:** 2026-02-20
**Domain:** LLM provider abstraction (litellm), async Python, pydantic-settings, circuit breaker patterns
**Confidence:** HIGH
## Summary
Phase 11 adds an optional LLM fallback for VMs that the rules engine cannot classify (those receiving `classification_confidence="default"`). The feature is opt-in: `LLM_ENABLED` defaults to `false`. Provider credentials are managed via a pydantic-settings `BaseSettings` subclass with `SecretStr`, so they are never logged. LLM calls use `litellm.acompletion()` (already declared in pyproject.toml) wrapped in `asyncio.wait_for` so they cannot block the NiceGUI event loop. Any LLM response that is not a known DRR category is rejected and the VM stays "Unknown (Reducible)".
## Key Findings

### pydantic-settings `LLMConfig` with `SecretStr`

`BaseSettings` reads environment variables automatically. `SecretStr` masks API keys in all `repr()` and log output; the value is exposed only when `.get_secret_value()` is called explicitly at the call site.
```python
from pydantic import SecretStr
from pydantic_settings import BaseSettings, SettingsConfigDict


class LLMConfig(BaseSettings):
    model_config = SettingsConfigDict(env_prefix="LLM_", case_sensitive=False)

    enabled: bool = False
    model: str = "openai/gpt-4o-mini"
    api_key: SecretStr = SecretStr("")
    api_base: str | None = None  # Ollama: http://host.docker.internal:11434
    timeout: int = 30
```
Env vars: `LLM_ENABLED`, `LLM_MODEL`, `LLM_API_KEY`, `LLM_API_BASE`, `LLM_TIMEOUT`.
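A quick sanity check (a minimal sketch; no env vars needed since it constructs the `SecretStr` directly) shows the masking behavior:

```python
from pydantic import SecretStr

key = SecretStr("sk-example-not-a-real-key")

# str()/repr() never reveal the wrapped value, so accidental logging is safe
masked = str(key)             # masked, e.g. "**********"
raw = key.get_secret_value()  # explicit unwrap, only at the call site
```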
### `asyncio.wait_for` + `litellm.acompletion`

Use `asyncio.wait_for` (not litellm's `timeout=` parameter, which is unreliable across providers) to enforce the hard timeout. This is compatible with the NiceGUI event loop.
```python
import asyncio

import litellm

response = await asyncio.wait_for(
    litellm.acompletion(
        model=config.model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        # SecretStr is unwrapped only here, at the call site
        api_key=config.api_key.get_secret_value() or None,
        api_base=config.api_base,
        max_tokens=30,
    ),
    timeout=config.timeout,
)
raw = response.choices[0].message.content.strip()
return raw if raw in valid_categories else None
```
### In-Process Circuit Breaker (No External Dependency)
A module-level fail counter + cooldown timestamp protects against provider outages without adding a dependency. After 3 consecutive failures, skip LLM calls for 60 seconds.
```python
import time

_cb_fail_count: int = 0
_cb_open_until: float = 0.0
_CB_FAIL_MAX = 3
_CB_COOLDOWN = 60.0

# before each LLM call:
if time.monotonic() < _cb_open_until:
    return None  # circuit open: skip the call

# on failure:
_cb_fail_count += 1
if _cb_fail_count >= _CB_FAIL_MAX:
    _cb_open_until = time.monotonic() + _CB_COOLDOWN
```
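The same counters can be wrapped in a small class for testability (a sketch; the `CircuitBreaker` name and its methods are hypothetical, not an existing API):

```python
import time


class CircuitBreaker:
    """Minimal in-process breaker: opens after max_failures consecutive
    failures, then skips calls for cooldown seconds."""

    def __init__(self, max_failures: int = 3, cooldown: float = 60.0) -> None:
        self.max_failures = max_failures
        self.cooldown = cooldown
        self._fail_count = 0
        self._open_until = 0.0

    def allow(self) -> bool:
        # True unless we are inside the cooldown window
        return time.monotonic() >= self._open_until

    def record_success(self) -> None:
        self._fail_count = 0  # any success fully resets the breaker

    def record_failure(self) -> None:
        self._fail_count += 1
        if self._fail_count >= self.max_failures:
            self._open_until = time.monotonic() + self.cooldown
            self._fail_count = 0  # count fresh after the cooldown expires
```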
### Input Sanitization Against Prompt Injection
VM names and OS strings come from untrusted customer data. Truncate and strip newlines before embedding in prompts. Use role-framing in the system prompt to reinforce data-only interpretation.
```python
safe_vm = vm_name[:100].replace("\n", " ").replace("\r", " ")
safe_os = os_name[:50].replace("\n", " ").replace("\r", " ")
# System prompt explicitly states: treat VM name/OS as DATA, not instructions
```
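Wrapped as a helper (the name `sanitize_field` is illustrative), the two lines above become:

```python
def sanitize_field(value: str, max_len: int) -> str:
    """Truncate untrusted customer text and flatten newlines so a crafted
    VM name cannot smuggle extra instructions or fake chat turns into
    the prompt."""
    return value[:max_len].replace("\n", " ").replace("\r", " ")
```

Truncation happens before the newline replacement, matching the original snippet.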
### Validation Against Known DRR Categories

Always validate the raw LLM response string against the set of valid workload category names loaded from DRR.csv. Reject any response not in the set; do not attempt fuzzy matching or correction.
```python
valid_categories: set[str] = {entry.subcategory for entry in drr_table.entries}
return raw if raw in valid_categories else None
```
### Reuse the `classification_confidence` Column

No new AG Grid column is needed. The existing `classification_confidence` field already holds `"rule_match"`, `"os_fallback"`, and `"default"`. Add `"llm"` as a fourth valid value for LLM-classified VMs.
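One way to make the four allowed values explicit at the type level (a sketch; the alias name is hypothetical):

```python
from typing import Literal

# "llm" joins the three values the field already carries
ClassificationConfidence = Literal["rule_match", "os_fallback", "default", "llm"]
```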
## Anti-Patterns
- **Using litellm's `timeout=` parameter instead of `asyncio.wait_for`:** litellm's built-in timeout is inconsistent across providers; `asyncio.wait_for` is the definitive cancellation mechanism.
- **Calling `get_secret_value()` at module load or in logs:** Only call it at the `litellm.acompletion()` call site, never in logging statements or exception messages.
- **Fuzzy-matching or auto-correcting LLM responses:** If the model returns a slightly wrong category name, reject it and leave the VM as "Unknown (Reducible)". Silently accepting near-matches risks mis-sizing.
## Dependencies
| Package | Version | Notes |
|---|---|---|
| litellm | >=1.81.13 | Already in pyproject.toml |
| pydantic-settings | >=2.13.0,<3.0 | Add to pyproject.toml; `uv pip install "pydantic-settings>=2.13.0,<3.0"` |