Presentations and Talks
Primary deck: godotz.ai Architecture — From Single Agent to Heterogeneous Fleet
Source material: HARNESS-SPEC v3.1, godotz.ai Technical Whitepaper, PRD v1.2
Overview
The godotz.ai architecture presentation deck translates the HARNESS-SPEC and whitepaper into a talk-ready format. It is structured for a technical audience familiar with LLM tooling but unfamiliar with multi-agent fleet orchestration. The deck progresses from problem framing through architecture layers to live benchmarks, ending with the roadmap.
Total slide count: 28 slides across 6 sections.
Deck Structure
| Section | Slides | Title |
|---|---|---|
| 1 | 1–3 | The Problem: Single-Agent Limits |
| 2 | 4–7 | godotz.ai Architecture: 13 Components |
| 3 | 8–13 | L0–L6 Layer Model |
| 4 | 14–19 | Model Matrix and Routing |
| 5 | 20–24 | Fleet Topology |
| 6 | 25–28 | Benchmarks and Roadmap |
Key Slides
Section 1 — The Problem (Slides 1–3)
Slide 1: Four Failure Modes
Single-model echo chambers, no fleet orchestration primitive, opaque cost governance, unsafe self-modification loops. Each gets a one-sentence diagnosis and a data point (e.g., ReConcile paper: heterogeneous panels reduce bias by ~31%).
Slide 2: What Existing Tools Miss
Side-by-side: Claude Code, Cursor, and Codex each have millions of users, but none offers cross-fleet coordination. LiteLLM provides the gateway layer but lacks orchestration. Temporal provides durable execution but is rarely applied to agent DAGs.
Slide 3: godotz.ai’s Answer
One sentence per failure mode, mapping to the harness layer that addresses it.
Section 2 — 13-Component Architecture (Slides 4–7)
The 13 core components of godotz.ai, grouped into four columns:
| Column | Components |
|---|---|
| Conversation | Claude Code (Claude, Gemini), godotz.ai TUI |
| Orchestration | OMC multi-agent layer, Swarm executor, Beads DAG scheduler |
| Gateway | LiteLLM proxy, Budget enforcer, Semantic cache |
| Infrastructure | Mnemopi memory, Knowledge Gardener vault, Graphify KG, Temporal durable execution, Redis |
Slide 4: Component map diagram — boxes with arrows showing data flow from user prompt through conversation layer to worker models and back through memory.
Slide 5: Zoom on the Orchestration column — OMC hook lifecycle (SessionStart → UserPromptSubmit → PreToolUse → PostToolUse → Stop), agent roster (32 agents, 8 roles).
Slide 6: Zoom on the Gateway column — LiteLLM routing with per-key budget enforcement, semantic cache hit/miss path, plan cache bypass.
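The gateway path on Slide 6 can be illustrated with a small in-memory sketch. It is not the LiteLLM implementation: the `Gateway` class, the `handle` method, and the order of checks (cache first, then budget) are assumptions made for illustration, and an exact-match lookup on normalized text stands in for a real semantic (embedding-based) cache. The plan cache bypass is not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Gateway:
    """Hypothetical gateway: per-key budgets plus a response cache."""
    budgets: dict[str, float]                      # remaining budget per API key
    cache: dict[str, str] = field(default_factory=dict)

    def handle(self, key: str, prompt: str, est_cost: float,
               backend: Callable[[str], str]) -> tuple[str, str]:
        """Return (path, response); path is 'hit', 'miss', or 'denied'."""
        # Stand-in for semantic matching: case- and whitespace-insensitive.
        normalized = " ".join(prompt.lower().split())
        if normalized in self.cache:
            return "hit", self.cache[normalized]   # assumed: hits are free
        if self.budgets.get(key, 0.0) < est_cost:
            return "denied", ""                    # budget enforcement
        self.budgets[key] -= est_cost
        response = backend(prompt)
        self.cache[normalized] = response
        return "miss", response


gw = Gateway(budgets={"team-a": 1.0})
first = gw.handle("team-a", "Hello  world", 0.6, lambda p: "response")
second = gw.handle("team-a", "hello world", 0.6, lambda p: "response")
third = gw.handle("team-a", "another prompt", 0.6, lambda p: "response")
```

In this sketch the first call is a miss that spends budget, the second is served from the cache, and the third is denied because the remaining budget no longer covers the estimated cost.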
Slide 7: Zoom on the Infrastructure column — Mnemopi (SQLite + WAL, per-project-tagged), Graphify (239 nodes, 284 edges, 22 communities), Knowledge Gardener (vault + auto-recap).
Section 3 — L0–L6 Layer Model (Slides 8–13)
godotz.ai’s architecture is formally described as a seven-layer stack (L0–L6), distinct from the 12-layer optimization stack. The layer model describes what each level handles; the optimization stack describes how it is tuned.
| Layer | Name | Responsibility |
|---|---|---|
| L0 | Substrate | OS provisioning, kernel tuning, hardware profile (i9-12900K, RTX 3080, 30 GiB RAM) |
| L1 | Gateway | LiteLLM proxy, API key management, budget enforcement, provider routing |
| L2 | Orchestration | OMC hook system, Claude Code plugin layer, skill injection, keyword detection |
| L3 | Execution | Beads DAG scheduler, Temporal durable tasks, swarm worker dispatch |
| L4 | Intelligence | Model routing (cascade + pheromone), semantic cache, plan cache, EMA tracker |
| L5 | Memory | Mnemopi session memory, Graphify knowledge graph, Knowledge Gardener vault |
| L6 | Knowledge | Context hydration (omp-hydrate), context engineering scoring, session recap |
Slide 8: Layer stack diagram — vertical stack with L0 at bottom, L6 at top. Arrows show upward data flow (substrate → knowledge) and downward control flow (knowledge → substrate via routing decisions).
Slides 9–13: One slide per layer or layer pair (L0+L1, L2+L3, L4, L5, L6). Each slide: layer name, one-line responsibility, key components, and one measurable property (e.g., L4: pheromone routing 16.8ms, semantic cache 6.1ms).
Section 4 — Model Matrix (Slides 14–19)
Slide 14: Role-to-Model Matrix
| Role | Model | Provider | Use case |
|---|---|---|---|
| slow | claude-opus-4-6-thinking:high | Antigravity | Architecture, deep analysis |
| default | claude-sonnet-4-6-thinking:high | Antigravity | General conversation |
| smol | gemini-3.1-pro-high:high | Antigravity | Fast lookups |
| plan | claude-opus-4-6-thinking:high | Antigravity | Strategic planning |
| task | glm-5.1:xhigh | z.ai | Subagent execution (10 slots) |
| commit | glm-4.7:xhigh | z.ai | Commit messages (3 slots) |
| designer | glm-4.5-air:xhigh | z.ai | UI/design tasks (1 slot) |
Slide 15: Concurrency Limits
Total concurrent slots: 18. GLM concurrency caps: glm-5.1 (10), glm-4.7 (2), glm-4.5-air (5), glm-5-turbo (1). The total of 18 matches the z.ai account-level concurrent request limit.
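The per-model caps on Slide 15 can be sketched as a small slot allocator. This is only an illustration under the assumption that caps are tracked with in-process counters; `SlotPool` and its methods are hypothetical names, not part of godotz.ai.

```python
from dataclasses import dataclass, field

# Per-model caps as listed on Slide 15.
GLM_CAPS = {"glm-5.1": 10, "glm-4.7": 2, "glm-4.5-air": 5, "glm-5-turbo": 1}


@dataclass
class SlotPool:
    """Hypothetical tracker of concurrent request slots per model."""
    caps: dict[str, int]
    in_use: dict[str, int] = field(default_factory=dict)

    def total(self) -> int:
        """Sum of all per-model caps."""
        return sum(self.caps.values())

    def try_acquire(self, model: str) -> bool:
        """Reserve one slot; return False if the model is at its cap."""
        used = self.in_use.get(model, 0)
        if used >= self.caps.get(model, 0):
            return False
        self.in_use[model] = used + 1
        return True

    def release(self, model: str) -> None:
        """Free one slot, never dropping below zero."""
        self.in_use[model] = max(0, self.in_use.get(model, 0) - 1)


pool = SlotPool(GLM_CAPS)
```

A dispatcher would call `try_acquire` before sending a request and `release` when the response arrives; a model that is not in the cap table is treated as having zero slots.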
Slide 16: Cascade Escalation Chain
Diagram: glm-4.5-air → glm-4.7 → glm-5.1 → Antigravity fallback. Threshold: confidence < 0.7 triggers escalation. Cost gradient: cheapest first.
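The escalation chain can be sketched as a loop over tiers, cheapest first. This is a minimal sketch, assuming each model call returns an answer together with a confidence score; how godotz.ai actually computes confidence is not specified here, and `run_cascade`, `call_model`, and the `antigravity-fallback` label are hypothetical.

```python
from typing import Callable

# Chain order from Slide 16, cheapest first. The last entry is a
# placeholder label for the Antigravity fallback tier.
CASCADE = ["glm-4.5-air", "glm-4.7", "glm-5.1", "antigravity-fallback"]
CONFIDENCE_THRESHOLD = 0.7


def run_cascade(
    prompt: str,
    call_model: Callable[[str, str], tuple[str, float]],
) -> tuple[str, str, float]:
    """Try each tier in order and escalate while confidence < threshold.

    Returns (model, answer, confidence). The final tier's answer is
    returned even if it is still below the threshold.
    """
    for model in CASCADE:
        answer, confidence = call_model(model, prompt)
        if confidence >= CONFIDENCE_THRESHOLD:
            break
    return model, answer, confidence


def fake_call(model: str, prompt: str) -> tuple[str, float]:
    """Stub backend with fixed confidences, for demonstration only."""
    scores = {"glm-4.5-air": 0.4, "glm-4.7": 0.65, "glm-5.1": 0.9}
    return f"{model} answered: {prompt}", scores.get(model, 1.0)


chosen_model, chosen_answer, chosen_confidence = run_cascade("refactor foo()", fake_call)
```

With the stub scores above, the first two tiers fall below 0.7, so the request settles on glm-5.1 without reaching the fallback.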
Slides 17–19: Pheromone routing mechanics (ant-colony metaphor, evaporation rate 0.05, slot rebalancing), EMA feedback loop (alpha=0.3, routing signal thresholds), and the haiku/sonnet/opus semantic mapping to GLM tiers.
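The two update rules named on Slides 17–19 can be written out as follows. The deck gives only the parameters (evaporation rate 0.05, alpha = 0.3), so the formulas below are the standard textbook forms of an exponential moving average and ant-colony pheromone decay, assumed here for illustration; the function names are hypothetical.

```python
import random

ALPHA = 0.3  # EMA smoothing factor from the deck
RHO = 0.05   # pheromone evaporation rate from the deck


def ema_update(previous: float, observation: float, alpha: float = ALPHA) -> float:
    """Standard EMA: weight the new observation by alpha."""
    return alpha * observation + (1.0 - alpha) * previous


def evaporate_and_deposit(
    trails: dict[str, float], chosen: str, reward: float, rho: float = RHO
) -> dict[str, float]:
    """Decay every trail by rho, then reinforce the model that was used."""
    updated = {model: (1.0 - rho) * level for model, level in trails.items()}
    updated[chosen] = updated.get(chosen, 0.0) + reward
    return updated


def pick_model(trails: dict[str, float], rng: random.Random) -> str:
    """Sample a model with probability proportional to its trail strength."""
    models = list(trails)
    weights = [trails[m] for m in models]
    return rng.choices(models, weights=weights, k=1)[0]


trails = {"glm-4.7": 1.0, "glm-5.1": 1.0}
trails = evaporate_and_deposit(trails, chosen="glm-5.1", reward=0.5)
next_model = pick_model(trails, random.Random(0))
```

After one successful glm-5.1 call, its trail rises to 1.45 while the unused trail decays to 0.95, so later sampling leans toward the model that has been performing well while still leaving the other reachable.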
Section 5 — Fleet Topology (Slides 20–24)
Slide 20: Single-Node Reference Topology
The current deployment is a single Linux workstation acting as a full fleet node. Diagram shows the node with all components co-located: Claude Code TUI, OMC plugins, godotz.ai daemon, LiteLLM proxy, Redis, Temporal worker, Graphify daemon.
Slide 21: Multi-Node Fleet Extension
How the topology scales: each node carries its own HARNESS-SPEC baseline. The daemon roster (~/.claude/daemon/roster.json) and dispatch directory enable cross-node job routing. Herdr integration (herdr-agent-state.sh) reports agent state to a central coordinator via Unix socket.
Slide 22: Swarm Workload Topology
Two workload configurations from swarm.yaml and swarm-advisor-executor.yaml:
- Parallel audit (3 agents, all concurrent): analyzer + tester + security → reports
- Advisor-executor DAG (5 agents, phased): advisor → [simple/medium/complex workers] → verifier
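The two workload shapes above can be expressed as dependency graphs and grouped into execution phases. This sketch uses Python's standard `graphlib` module; the worker names and the dictionary layout are illustrative assumptions and do not reflect the actual schema of swarm.yaml or swarm-advisor-executor.yaml.

```python
from graphlib import TopologicalSorter

# Each agent maps to the set of agents it depends on.
PARALLEL_AUDIT = {"analyzer": set(), "tester": set(), "security": set()}

ADVISOR_EXECUTOR = {
    "advisor": set(),
    "worker-simple": {"advisor"},
    "worker-medium": {"advisor"},
    "worker-complex": {"advisor"},
    "verifier": {"worker-simple", "worker-medium", "worker-complex"},
}


def phases(dag: dict[str, set[str]]) -> list[list[str]]:
    """Group agents into phases; agents within a phase can run concurrently."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for stable output
        result.append(ready)
        sorter.done(*ready)
    return result


audit_phases = phases(PARALLEL_AUDIT)
dag_phases = phases(ADVISOR_EXECUTOR)
```

The parallel audit collapses to a single phase of three agents, while the advisor-executor graph yields three phases: the advisor, then the three workers together, then the verifier.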
Slide 23: Model Provider Topology
Two provider tiers in modelProviderOrder: [google-antigravity, zai]. Priority order with fallback chains. Independent swarm CLI restricted to z.ai (Antigravity requires TUI IPC auth broker).
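The priority order and the standalone CLI restriction can be sketched as a simple resolver. The provider names come from modelProviderOrder above; the functions, the `healthy` set, and the `standalone_cli` flag are hypothetical and only illustrate the selection logic.

```python
from typing import Optional

# Priority order as given in modelProviderOrder.
MODEL_PROVIDER_ORDER = ["google-antigravity", "zai"]


def eligible_providers(standalone_cli: bool) -> list[str]:
    """The standalone swarm CLI has no TUI IPC auth broker, so it is
    restricted to z.ai; the TUI can use the full priority list."""
    if standalone_cli:
        return [p for p in MODEL_PROVIDER_ORDER if p == "zai"]
    return list(MODEL_PROVIDER_ORDER)


def resolve_provider(healthy: set[str], standalone_cli: bool = False) -> Optional[str]:
    """Return the highest-priority eligible provider that is healthy."""
    for provider in eligible_providers(standalone_cli):
        if provider in healthy:
            return provider
    return None


tui_choice = resolve_provider({"google-antigravity", "zai"})
cli_choice = resolve_provider({"google-antigravity", "zai"}, standalone_cli=True)
```

Returning `None` when nothing is eligible leaves the caller to decide whether to queue, fail, or fall back to a local model.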
Slide 24: GPU Node Topology
RTX 3080 hosts lfm2-700m:gpu — 4-bit quantized ONNX, offline thinking model. Excluded from subagent dispatch (preservation mode). Role in topology: local reasoning fallback when API latency is unacceptable.
Section 6 — Benchmarks and Roadmap (Slides 25–28)
Slide 25: S-grade benchmark table (8 metrics) — hydration token reduction (95.2%), tool start latency (12ms), pheromone dispatch (16.8ms), semantic cache hit (6.1ms), regression suite (9ms), cascade verification, budget enforcement, plan cache (46% reduction).
Slide 26: A-grade metrics and what improves them — context promotion routing (needs more scoring data), EMA accuracy (needs n≥50 for stable convergence).
Slide 27: Known limitations — 8 items from HARNESS-SPEC Section 25. Presented as honest engineering tradeoffs, not blockers.
Slide 28: Roadmap — 7 items with status. Highlighted: Adaptive Router (partial, cascade + pheromone in place, needs task dispatch integration) and Cross-Session Intelligence (partial, Mnemopi + Graphify in place, needs hindsight server).
Customization
Swapping the project example: Slides 20–22 use omp-playground (18-file TypeScript project). Replace with any project that has a graphify-out/ directory — update the node count, edge count, community count, and swarm report filenames.
Adjusting benchmark claims: All values in Section 6 come directly from the table in HARNESS-SPEC Section 23. If running on different hardware or a different codebase scale, re-run the optimization scripts and update that source table; the presentation pulls its values from there.
Audience depth tuning:
- Executive (30 min): Sections 1, 2 (overview only), 6 (benchmarks only)
- Engineering (60 min): Full deck, pause on Sections 3–4 for Q&A
- Workshop (90 min): Full deck + live omp-hydrate and omp-ema-tracker suggest demos
Export Formats
| Format | Use case | Notes |
|---|---|---|
| PDF (16:9) | Async sharing, archival | Export at 1920×1080; embed fonts |
| Keynote / PowerPoint | Live delivery | Keep ASCII diagrams as code blocks rendered in monospace |
| HTML slides (reveal.js) | Web publishing | Use dark theme to match godotz.ai HUD aesthetic |
| Markdown (this site) | Documentation integration | Deck sections map 1:1 to wiki sections |
Code block diagrams (the L0–L6 stack, the swarm DAG, the routing flow) are rendered in pre/code blocks and survive export to all formats without SVG dependency. Do not convert them to images unless the presentation tool cannot render monospace correctly.