Hardcore Mode
Hardcore Mode (omp.hardcore: true) is godotz.ai’s maximum-constraint operating posture. It disables all fallbacks, rejects unverified plugins, enforces hard budget limits, and tightens every concurrency and timeout knob. Use it for production deployments where cost predictability and security matter more than convenience.
1. The Toggle Script
The omp-hardcore script switches the fleet between normal and hardcore posture with a single command.
# Enable hardcore mode
omp-hardcore on
# Disable hardcore mode (returns to defaults)
omp-hardcore off
# Show current status
omp-hardcore status
Example status output:
OMP Hardcore Mode: ENABLED
Since: 2025-06-07T14:23:00Z
Active Constraints:
[✓] No model fallback
[✓] Hard budget cap
[✓] Plugin signature required
[✓] Sandbox enforced
[✓] Context promotion blocked
[✓] EMA tracking active
[✓] Rate limit: strict
[✓] Retry limit: 1
[✓] Telemetry: verbose
[✓] Unverified MCP: blocked
Budget (current cycle):
Spent: $12.43 / $50.00
Remaining: $37.57
Reset: 2025-06-30
What the Script Does
#!/usr/bin/env bash
# omp-hardcore: toggle OMP hardcore mode
# Usage: omp-hardcore on | off | status
set -euo pipefail

CONFIG="/etc/omp/fleet.yml"
STATE_FILE="/var/lib/omp/hardcore.state"

case "${1:-}" in
  on)
    yq -i '.omp.hardcore = true' "$CONFIG"
    # Record the activation time in UTC (matches the "Since:" line in status)
    date -u +"%Y-%m-%dT%H:%M:%SZ" > "$STATE_FILE"
    docker compose exec litellm kill -HUP 1   # reload config
    omp agent reload --all
    echo "Hardcore mode: ON"
    ;;
  off)
    yq -i '.omp.hardcore = false' "$CONFIG"
    rm -f "$STATE_FILE"
    docker compose exec litellm kill -HUP 1   # reload config
    omp agent reload --all
    echo "Hardcore mode: OFF"
    ;;
  status)
    # The state file exists only while hardcore mode is on
    if [[ -f "$STATE_FILE" ]]; then
      echo "ENABLED since $(cat "$STATE_FILE")"
    else
      echo "DISABLED"
    fi
    ;;
  *)
    echo "Usage: omp-hardcore on | off | status" >&2
    exit 1
    ;;
esac
2. The 10 Optimization Categories
Hardcore Mode tightens ten independent dimensions. Each can also be configured individually in fleet.yml.
1. No Model Fallback
model_fallback: false
In normal mode, a failed claude-opus-4-6 call falls back to claude-sonnet-4-6. In hardcore mode, the call fails immediately, which forces callers to declare intent explicitly.
2. Hard Budget Cap
budget:
  hard_cap: true
  action_on_exceed: reject   # not warn
Requests are rejected (HTTP 429) the moment a virtual key’s budget is exhausted. No grace period.
3. Plugin Signature Required
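plugin_eval:
  require_signature: true
  block_unsigned: true
Unsigned plugins do not run, not even for testing. The only exception is a plugin explicitly pinned by SHA256 in the unsigned allowlist (see section 6, Disabling Individual Constraints).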
4. Sandbox Always Enforced
sandbox:
  enabled: true
  bypass_allowed: false
No sandbox bypass flag is accepted. Agents cannot request elevated filesystem or network access.
5. Context Promotion Blocked
context_promotion:
  enabled: false
Workers cannot escalate to Antigravity models. If the task is too complex for GLM, it fails. Forces task decomposition at the design level.
6. EMA Tracking Active
ema_tracking:
  enabled: true
  alpha: 0.2    # smoothing factor; lower = slower decay
  window: 100   # requests in the EMA window
EMA (Exponential Moving Average) of success rate, latency, and cost per model is recorded continuously. Used to detect silent degradation before it becomes an incident.
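As a rough illustration of how the smoothing factor behaves, here is a minimal sketch of one EMA step. It assumes the textbook recurrence `ema = alpha * sample + (1 - alpha) * previous`; the source does not say which exact formula OMP uses, and `ema_update` is a hypothetical helper, not an OMP command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: one step of a standard exponential moving average.
# Assumption: the textbook recurrence ema = alpha*sample + (1-alpha)*previous.
ema_update() {
  # args: previous_ema sample [alpha]
  LC_ALL=C awk -v p="$1" -v x="$2" -v a="${3:-0.2}" \
    'BEGIN { printf "%.4f\n", a * x + (1 - a) * p }'
}

# Feed a run of request outcomes (1 = success, 0 = failure) through the EMA.
ema=1.0
for outcome in 1 1 0 1; do
  ema="$(ema_update "$ema" "$outcome" 0.2)"
done
echo "EMA success rate: $ema"
```

With alpha at 0.2, a single failure pulls the success-rate EMA from 1.0 down to 0.8, and the next success only recovers it to 0.84. A lower alpha would make both moves smaller.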
7. Strict Rate Limiting
rate_limits:
  strategy: strict         # vs. adaptive in normal mode
  queue_overflow: reject   # vs. queue indefinitely
  request_timeout: 30s     # vs. 120s in normal mode
8. Retry Limit: 1
retries:
  max: 1           # normal mode: 3
  backoff: fixed   # no exponential backoff
One retry on transient error, then hard failure. Prevents runaway cost from retry storms.
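The retry policy above can be sketched as a small wrapper. This is an illustration only: `run_with_retry`, `RETRY_MAX`, and `RETRY_DELAY` are hypothetical names that mirror `retries.max` and the fixed backoff, and the one-second delay is an assumed value, since the source does not give one.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command, retrying at most RETRY_MAX times
# with a fixed delay between attempts (no exponential growth).
RETRY_MAX=1     # mirrors retries.max in hardcore mode
RETRY_DELAY=1   # seconds; assumed value, not taken from the source

run_with_retry() {
  local attempt=0
  until "$@"; do
    if [ "$attempt" -ge "$RETRY_MAX" ]; then
      echo "giving up after $((attempt + 1)) attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$RETRY_DELAY"
  done
}

# Usage: a command that succeeds on the first attempt is never retried.
run_with_retry true && echo "succeeded"
```

Because the delay is fixed and the attempt count is capped at two, the worst-case cost of a failing call is bounded at twice the cost of a single call.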
9. Verbose Telemetry
telemetry:
  level: verbose
  include_prompt_tokens: true
  include_cost: true
  langfuse_flush_interval: 5s   # vs. 30s in normal mode
Every token and cent is tracked and flushed to Langfuse every 5 seconds. Verbose mode makes cost attribution exact.
10. Unverified MCP Blocked
mcp:
  require_cve_clear: true
  block_unscanned: true
Any MCP server not cleared by mcp-scan is blocked from registration. This includes newly added servers until they pass the gate.
3. When to Use vs Normal Mode
| Scenario | Use Hardcore? |
|---|---|
| Production fleet, paid API budgets | Yes |
| Security-sensitive data in context | Yes |
| Cost spike investigation | Yes |
| Local development / experimentation | No |
| Testing a new plugin or MCP server | No |
| Reproducing a flaky agent behavior | No |
| Onboarding / initial setup | No |
The key question: could this run incur unexpected cost or expose sensitive data? If yes, use hardcore mode.
4. EMA Tracking
Hardcore mode enables continuous EMA tracking of model performance metrics. These metrics feed model routing and budget reforecasting (and, in normal mode, the fallback chain, which hardcore mode disables).
# View current EMA stats
omp ema status
# Output:
# Model               EMA Success%   EMA P95 Latency   EMA Cost/1K tokens
# claude-opus-4-6     98.2%          4.3s              $0.085
# claude-sonnet-4-6   99.1%          1.8s              $0.018
# glm-5.1             96.4%          2.1s              $0.003
# glm-4.5-air         97.8%          0.9s              $0.001
EMA updates after every request in hardcore mode (vs. every 10 requests in normal mode). This gives near-real-time health visibility.
Configure alerting thresholds:
# config/ema-alerts.yml
ema_alerts:
  success_rate_min: 0.90   # alert if EMA drops below 90%
  p95_latency_max: 10s     # alert if P95 exceeds 10s
  cost_spike_factor: 2.0   # alert if cost EMA doubles
  notify: ntfy
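To make the thresholds above concrete, the following sketch evaluates one model's stats against them. `check_ema_alerts` is a hypothetical function, and comparing the cost EMA against a caller-supplied baseline is an assumption; the source does not say what reference value OMP uses to decide that cost has doubled.

```shell
#!/usr/bin/env bash
# Hypothetical check of one model's EMA stats against the alert thresholds.
# Prints one line per violated threshold; prints nothing when all are healthy.
# Assumption: a cost spike means cost EMA >= cost_spike_factor * baseline.
check_ema_alerts() {
  # args: success_rate p95_latency_seconds cost_ema cost_baseline
  LC_ALL=C awk -v sr="$1" -v p95="$2" -v cost="$3" -v base="$4" 'BEGIN {
    if (sr + 0 < 0.90)          print "success_rate below 0.90"
    if (p95 + 0 > 10)           print "p95_latency above 10s"
    if (cost + 0 >= 2.0 * base) print "cost EMA at least 2.0x baseline"
  }'
}

# A healthy model: no output.
check_ema_alerts 0.982 4.3 0.085 0.080
# A degraded model: all three thresholds are violated.
check_ema_alerts 0.88 12.5 0.040 0.018
```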
5. Budget Enforcement Flow
Incoming API request (worker virtual key)
↓
[Budget Check] ─── Exceeded? ──→ HTTP 429 + ntfy alert
↓ OK
[EMA Update] ─── track tokens + cost
↓
[Model Route] ─── no fallback in hardcore
↓
[Provider Call]
↓
[Langfuse Flush] ─── every 5s in hardcore (vs. 30s in normal mode)
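The first stages of this flow can be sketched as a single handler. This is a simplified illustration, not the actual gateway logic: `handle_request` and `call_provider` are hypothetical names, the provider call is a stub that fails for a model named `down-model`, and the EMA update and Langfuse flush steps are omitted.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real provider call.
# Here it fails only for the made-up model name "down-model".
call_provider() {
  [ "$1" != "down-model" ]
}

# Hypothetical handler following the budget check and model route steps.
handle_request() {
  local spent="$1" cap="$2" model="$3"
  # Budget check: reject as soon as spend reaches the cap (no grace period).
  if ! LC_ALL=C awk -v s="$spent" -v c="$cap" 'BEGIN { exit !(s + 0 < c + 0) }'; then
    echo "429 budget exceeded"
    return 1
  fi
  # Model route: exactly one model is tried; there is no fallback chain.
  if ! call_provider "$model"; then
    echo "error: $model failed, no fallback"
    return 1
  fi
  echo "200 ok via $model"
}

handle_request 12.43 50.00 glm-5.1
handle_request 50.00 50.00 glm-5.1 || true      # budget exhausted
handle_request 12.43 50.00 down-model || true   # provider failure, no fallback
```

The budget gate runs before any provider call, so an exhausted key never incurs further spend.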
6. Disabling Individual Constraints
If you need most hardcore constraints but want to allow one exception (e.g., a single trusted unsigned plugin), override only that key:
# fleet.yml
omp:
  hardcore: true
  # Override: allow this one unsigned plugin
  plugin_eval:
    unsigned_allowlist:
      - "omp-team/internal-debug-plugin@sha256:abc123"
Allowlist entries require a SHA256 pin. Version ranges are not accepted in hardcore mode.
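A pre-commit check for the pin requirement might look like the sketch below. `is_pinned_entry` is a hypothetical helper, and it assumes a real pin carries a full 64-character hexadecimal SHA-256 digest (the `abc123` value above reads as a shortened placeholder). The exact format OMP accepts is not specified in the source.

```shell
#!/usr/bin/env bash
# Hypothetical validator: accept only "<name>@sha256:<64 lowercase hex chars>".
# Assumption: pins use a full SHA-256 digest; version ranges never match.
is_pinned_entry() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9._/-]+@sha256:[0-9a-f]{64}$'
}

# Example digest: SHA-256 of the empty string, used purely as a well-formed value.
digest="e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

is_pinned_entry "omp-team/internal-debug-plugin@sha256:$digest" \
  && echo "accepted"
is_pinned_entry "omp-team/internal-debug-plugin@^1.2.0" \
  || echo "rejected: not a SHA256 pin"
```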
Next Steps
- Security Gates: Understand what each gate enforces
- Model Routing: EMA feeds the routing decision
- Fleet Setup: Apply hardcore mode at boot via .env.secrets