Presentations and Talks

Primary deck: godotz.ai Architecture — From Single Agent to Heterogeneous Fleet
Source material: HARNESS-SPEC v3.1, godotz.ai Technical Whitepaper, PRD v1.2


Overview

The godotz.ai architecture presentation deck translates the HARNESS-SPEC and whitepaper into a talk-ready format. It is structured for a technical audience familiar with LLM tooling but unfamiliar with multi-agent fleet orchestration. The deck progresses from problem framing through architecture layers to live benchmarks, ending with the roadmap.

Total slide count: 28 slides across 6 sections.


Deck Structure

| Section | Slides | Title |
|---|---|---|
| 1 | 1–3 | The Problem: Single-Agent Limits |
| 2 | 4–7 | godotz.ai Architecture: 13 Components |
| 3 | 8–13 | L0–L6 Layer Model |
| 4 | 14–19 | Model Matrix and Routing |
| 5 | 20–24 | Fleet Topology |
| 6 | 25–28 | Benchmarks and Roadmap |

Key Slides

Section 1 — The Problem (Slides 1–3)

Slide 1: Four Failure Modes
Single-model echo chambers, no fleet orchestration primitive, opaque cost governance, unsafe self-modification loops. Each gets a one-sentence diagnosis and a data point (e.g., ReConcile paper: heterogeneous panels reduce bias by ~31%).

Slide 2: What Existing Tools Miss
Side-by-side: Claude Code, Cursor, and Codex each have millions of users, but none offers cross-fleet coordination. LiteLLM has the gateway layer but lacks orchestration. Temporal has durable execution but is rarely applied to agent DAGs.

Slide 3: godotz.ai’s Answer
One sentence per failure mode, mapping to the harness layer that addresses it.


Section 2 — 13-Component Architecture (Slides 4–7)

The 13 core components of godotz.ai, grouped into four columns:

| Column | Components |
|---|---|
| Conversation | Claude Code (Claude, Gemini), godotz.ai TUI |
| Orchestration | OMC multi-agent layer, Swarm executor, Beads DAG scheduler |
| Gateway | LiteLLM proxy, Budget enforcer, Semantic cache |
| Infrastructure | Mnemopi memory, Knowledge Gardener vault, Graphify KG, Temporal durable execution, Redis |

Slide 4: Component map diagram — boxes with arrows showing data flow from user prompt through conversation layer to worker models and back through memory.

Slide 5: Zoom on the Orchestration column — OMC hook lifecycle (SessionStart → UserPromptSubmit → PreToolUse → PostToolUse → Stop), agent roster (32 agents, 8 roles).
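
The hook lifecycle on Slide 5 can be illustrated with a small dispatcher. This is a minimal sketch, not the OMC implementation: `HookBus` and `run_session` are hypothetical names, and the assumption that PreToolUse and PostToolUse fire once per tool call between UserPromptSubmit and Stop is an illustration of the ordering shown on the slide.

```python
from typing import Callable, Dict, List

# Event names in the order listed on Slide 5.
HOOK_ORDER = ["SessionStart", "UserPromptSubmit", "PreToolUse", "PostToolUse", "Stop"]


class HookBus:
    """Hypothetical in-process registry mapping hook events to handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = {
            name: [] for name in HOOK_ORDER
        }

    def on(self, event: str, handler: Callable[[dict], None]) -> None:
        if event not in self._handlers:
            raise ValueError(f"unknown hook event: {event}")
        self._handlers[event].append(handler)

    def fire(self, event: str, payload: dict) -> None:
        # Handlers run in registration order.
        for handler in self._handlers[event]:
            handler(payload)


def run_session(bus: HookBus, prompt: str, tools: List[str]) -> None:
    """Fire the lifecycle for one prompt that triggers the given tool calls."""
    bus.fire("SessionStart", {})
    bus.fire("UserPromptSubmit", {"prompt": prompt})
    for tool in tools:
        bus.fire("PreToolUse", {"tool": tool})
        # The tool itself would execute here.
        bus.fire("PostToolUse", {"tool": tool})
    bus.fire("Stop", {})
```

Registering one handler per event and calling `run_session(bus, "hi", ["Read", "Bash"])` produces a trace with one PreToolUse/PostToolUse pair per tool, which is a convenient way to animate the slide.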

Slide 6: Zoom on the Gateway column — LiteLLM routing with per-key budget enforcement, semantic cache hit/miss path, plan cache bypass.
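
The cache hit/miss path and per-key budget check on Slide 6 can be sketched as follows. This is an illustration under stated assumptions, not LiteLLM's API: `Gateway`, `BudgetExceeded`, and `complete` are hypothetical names, the cache uses normalized exact matching as a stand-in for semantic similarity, and the plan cache bypass is not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


class BudgetExceeded(Exception):
    """Raised when a key cannot cover the estimated cost of a request."""


@dataclass
class Gateway:
    budgets: Dict[str, float]                      # remaining budget per API key (hypothetical units)
    cache: Dict[str, str] = field(default_factory=dict)

    @staticmethod
    def _cache_key(prompt: str) -> str:
        # Stand-in for semantic matching: ignore case and extra whitespace.
        return " ".join(prompt.lower().split())

    def complete(
        self,
        api_key: str,
        prompt: str,
        est_cost: float,
        call_model: Callable[[str], str],
    ) -> Tuple[str, str]:
        key = self._cache_key(prompt)
        if key in self.cache:
            # Hit path: no model call, no budget spent.
            return self.cache[key], "hit"
        if self.budgets.get(api_key, 0.0) < est_cost:
            raise BudgetExceeded(api_key)
        # Miss path: call the model, charge the key, store the response.
        response = call_model(prompt)
        self.budgets[api_key] -= est_cost
        self.cache[key] = response
        return response, "miss"
```

Checking the cache before the budget means a key that has exhausted its budget can still be served cached responses; whether the real gateway orders the checks this way is an assumption.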

Slide 7: Zoom on the Infrastructure column — Mnemopi (SQLite + WAL, per-project-tagged), Graphify (239 nodes, 284 edges, 22 communities), Knowledge Gardener (vault + auto-recap).


Section 3 — L0–L6 Layer Model (Slides 8–13)

godotz.ai’s architecture is formally described as a seven-layer stack (L0–L6), distinct from the 12-layer optimization stack. The layer model describes what each level handles; the optimization stack describes how it is tuned.

| Layer | Name | Responsibility |
|---|---|---|
| L0 | Substrate | OS provisioning, kernel tuning, hardware profile (i9-12900K, RTX 3080, 30Gi RAM) |
| L1 | Gateway | LiteLLM proxy, API key management, budget enforcement, provider routing |
| L2 | Orchestration | OMC hook system, Claude Code plugin layer, skill injection, keyword detection |
| L3 | Execution | Beads DAG scheduler, Temporal durable tasks, swarm worker dispatch |
| L4 | Intelligence | Model routing (cascade + pheromone), semantic cache, plan cache, EMA tracker |
| L5 | Memory | Mnemopi session memory, Graphify knowledge graph, Knowledge Gardener vault |
| L6 | Knowledge | Context hydration (omp-hydrate), context engineering scoring, session recap |

Slide 8: Layer stack diagram — vertical stack with L0 at bottom, L6 at top. Arrows show upward data flow (substrate → knowledge) and downward control flow (knowledge → substrate via routing decisions).

Slides 9–13: Five slides, one each for L0+L1, L2+L3, L4, L5, and L6. Each slide: layer name, one-line responsibility, key components, and one measurable property (e.g., L4: pheromone routing 16.8ms, semantic cache 6.1ms).


Section 4 — Model Matrix (Slides 14–19)

Slide 14: Role-to-Model Matrix

| Role | Model | Provider | Use case |
|---|---|---|---|
| slow | claude-opus-4-6-thinking:high | Antigravity | Architecture, deep analysis |
| default | claude-sonnet-4-6-thinking:high | Antigravity | General conversation |
| smol | gemini-3.1-pro-high:high | Antigravity | Fast lookups |
| plan | claude-opus-4-6-thinking:high | Antigravity | Strategic planning |
| task | glm-5.1:xhigh | z.ai | Subagent execution (10 slots) |
| commit | glm-4.7:xhigh | z.ai | Commit messages (3 slots) |
| designer | glm-4.5-air:xhigh | z.ai | UI/design tasks (1 slot) |

Slide 15: Concurrency Limits
Total concurrent slots: 18. GLM concurrency caps: glm-5.1 (10), glm-4.7 (2), glm-4.5-air (5), glm-5-turbo (1); these sum to 18. Why 18: z.ai account-level concurrent request limit.
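
One way to picture the per-model caps on Slide 15 is a semaphore per model. The sketch below is an assumption about how such caps could be enforced, not the harness's dispatcher; `SlotPool` and `demo` are hypothetical names, and the cap values are the ones listed on the slide.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# Per-model concurrency caps from Slide 15 (10 + 2 + 5 + 1 = 18).
MODEL_CAPS: Dict[str, int] = {
    "glm-5.1": 10,
    "glm-4.7": 2,
    "glm-4.5-air": 5,
    "glm-5-turbo": 1,
}


class SlotPool:
    """Hypothetical limiter: at most MODEL_CAPS[model] requests in flight per model."""

    def __init__(self, caps: Dict[str, int]) -> None:
        # Create this inside a running event loop for older Python versions.
        self._sems = {model: asyncio.Semaphore(n) for model, n in caps.items()}
        self.in_flight = {model: 0 for model in caps}
        self.peak = {model: 0 for model in caps}

    async def run(self, model: str, request: Callable[[], Awaitable[str]]) -> str:
        async with self._sems[model]:
            self.in_flight[model] += 1
            self.peak[model] = max(self.peak[model], self.in_flight[model])
            try:
                return await request()
            finally:
                self.in_flight[model] -= 1


async def demo() -> int:
    """Submit 6 requests to glm-4.7 and report the peak concurrency observed."""
    pool = SlotPool(MODEL_CAPS)

    async def fake_request() -> str:
        await asyncio.sleep(0.01)  # stands in for a network call
        return "ok"

    await asyncio.gather(*(pool.run("glm-4.7", fake_request) for _ in range(6)))
    return pool.peak["glm-4.7"]
```

Running `asyncio.run(demo())` shows that the observed peak never exceeds the glm-4.7 cap even though six requests were submitted at once.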

Slide 16: Cascade Escalation Chain
Diagram: glm-4.5-air → glm-4.7 → glm-5.1 → Antigravity fallback. Threshold: confidence < 0.7 triggers escalation. Cost gradient: cheapest first.
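
The escalation rule on Slide 16 can be sketched as a loop over the tiers. This is a minimal illustration, assuming each tier returns an answer together with a confidence score; the `ask` callback, the `cascade` function, and the `antigravity-fallback` label (a placeholder, not a model ID) are hypothetical.

```python
from typing import Callable, List, Tuple

# Tiers from Slide 16, cheapest first. The last entry is a placeholder label.
CASCADE: List[str] = ["glm-4.5-air", "glm-4.7", "glm-5.1", "antigravity-fallback"]
CONFIDENCE_THRESHOLD = 0.7


def cascade(
    prompt: str,
    ask: Callable[[str, str], Tuple[str, float]],
) -> Tuple[str, str, List[str]]:
    """Return (model used, answer, tiers tried).

    Escalates to the next tier while confidence < CONFIDENCE_THRESHOLD.
    If no tier reaches the threshold, the last tier's answer is returned.
    """
    tried: List[str] = []
    model, answer = CASCADE[0], ""
    for model in CASCADE:
        tried.append(model)
        answer, confidence = ask(model, prompt)
        if confidence >= CONFIDENCE_THRESHOLD:
            break
    return model, answer, tried
```

With confidences of 0.4, 0.69, and 0.7 from the first three tiers, the cascade stops at glm-5.1, since a score of exactly 0.7 is not below the threshold.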

Slides 17–19: Pheromone routing mechanics (ant-colony metaphor, evaporation rate 0.05, slot rebalancing), EMA feedback loop (alpha=0.3, routing signal thresholds), and the haiku/sonnet/opus semantic mapping to GLM tiers.
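
The two constants named for Slides 17–19 correspond to standard update rules, sketched below. The formulas are the textbook ant-colony evaporation step and exponential moving average; how the harness combines them into routing signals is not specified here, so the function names, the deposit amount, and the weight normalization are assumptions.

```python
from typing import Dict

EVAPORATION_RATE = 0.05  # rho, from Slide 17
EMA_ALPHA = 0.3          # alpha, from Slide 18


def evaporate_and_deposit(
    pheromone: Dict[str, float], winner: str, deposit: float = 1.0
) -> Dict[str, float]:
    """Classic ant-colony update: every trail decays by rho, the winner gains a deposit."""
    return {
        model: (1.0 - EVAPORATION_RATE) * level + (deposit if model == winner else 0.0)
        for model, level in pheromone.items()
    }


def routing_weights(pheromone: Dict[str, float]) -> Dict[str, float]:
    """Normalize pheromone levels into selection probabilities (assumed scheme)."""
    total = sum(pheromone.values())
    return {model: level / total for model, level in pheromone.items()}


def ema_update(previous: float, observation: float) -> float:
    """Exponential moving average: alpha weights the newest observation."""
    return EMA_ALPHA * observation + (1.0 - EMA_ALPHA) * previous
```

Starting from equal trails of 1.0, one successful dispatch to a model raises its trail to 1.95 while the others decay to 0.95. With alpha = 0.3, a single observation moves the running estimate 30% of the way toward the new value.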


Section 5 — Fleet Topology (Slides 20–24)

Slide 20: Single-Node Reference Topology
The current deployment is a single Linux workstation acting as a full fleet node. Diagram shows the node with all components co-located: Claude Code TUI, OMC plugins, godotz.ai daemon, LiteLLM proxy, Redis, Temporal worker, Graphify daemon.

Slide 21: Multi-Node Fleet Extension
How the topology scales: each node carries its own HARNESS-SPEC baseline. The daemon roster (~/.claude/daemon/roster.json) and dispatch directory enable cross-node job routing. Herdr integration (herdr-agent-state.sh) reports agent state to a central coordinator via Unix socket.

Slide 22: Swarm Workload Topology
Two workload configurations from swarm.yaml and swarm-advisor-executor.yaml:

  • Parallel audit (3 agents, all concurrent): analyzer + tester + security → reports
  • Advisor-executor DAG (5 agents, phased): advisor → [simple/medium/complex workers] → verifier
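
The two workload shapes above can be expressed as dependency graphs and grouped into phases with the standard-library `graphlib` module. This is a sketch of the topology only: the agent names are illustrative, and nothing here reflects the actual schema of swarm.yaml or swarm-advisor-executor.yaml.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each agent maps to the set of agents it waits for (names are illustrative).
PARALLEL_AUDIT: Dict[str, Set[str]] = {
    "analyzer": set(),
    "tester": set(),
    "security": set(),
}

ADVISOR_EXECUTOR: Dict[str, Set[str]] = {
    "advisor": set(),
    "simple-worker": {"advisor"},
    "medium-worker": {"advisor"},
    "complex-worker": {"advisor"},
    "verifier": {"simple-worker", "medium-worker", "complex-worker"},
}


def phases(dag: Dict[str, Set[str]]) -> List[List[str]]:
    """Group agents into phases; agents within a phase can run concurrently."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    result: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result
```

`phases(PARALLEL_AUDIT)` yields a single phase of three agents, while `phases(ADVISOR_EXECUTOR)` yields three phases: the advisor, the three workers, then the verifier.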

Slide 23: Model Provider Topology
Two provider tiers in modelProviderOrder: [google-antigravity, zai], tried in priority order with fallback chains. The independent swarm CLI is restricted to z.ai because Antigravity requires the TUI IPC auth broker.
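
Provider selection on Slide 23 can be sketched as a filter followed by a first-match search. Only the order `[google-antigravity, zai]` comes from the configuration; the function names, the `has_tui_auth_broker` flag, and the `is_up` health check are hypothetical.

```python
from typing import Callable, List

MODEL_PROVIDER_ORDER: List[str] = ["google-antigravity", "zai"]


def eligible_providers(order: List[str], has_tui_auth_broker: bool) -> List[str]:
    """Drop google-antigravity when no TUI IPC auth broker is available."""
    return [
        provider
        for provider in order
        if provider != "google-antigravity" or has_tui_auth_broker
    ]


def pick_provider(order: List[str], is_up: Callable[[str], bool]) -> str:
    """Return the first provider in priority order that passes the health check."""
    for provider in order:
        if is_up(provider):
            return provider
    raise RuntimeError("no provider available")
```

From the TUI, `eligible_providers(MODEL_PROVIDER_ORDER, True)` keeps both tiers; from the standalone swarm CLI, passing `False` leaves only `zai`.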

Slide 24: GPU Node Topology
RTX 3080 hosts lfm2-700m:gpu — 4-bit quantized ONNX, offline thinking model. Excluded from subagent dispatch (preservation mode). Role in topology: local reasoning fallback when API latency is unacceptable.


Section 6 — Benchmarks and Roadmap (Slides 25–28)

Slide 25: S-grade benchmark table (8 metrics) — hydration token reduction (95.2%), tool start latency (12ms), pheromone dispatch (16.8ms), semantic cache hit (6.1ms), regression suite (9ms), cascade verification, budget enforcement, plan cache (46% reduction).

Slide 26: A-grade metrics and what improves them — context promotion routing (needs more scoring data), EMA accuracy (needs n≥50 for stable convergence).

Slide 27: Known limitations — 8 items from HARNESS-SPEC Section 25. Presented as honest engineering tradeoffs, not blockers.

Slide 28: Roadmap — 7 items with status. Highlighted: Adaptive Router (partial, cascade + pheromone in place, needs task dispatch integration) and Cross-Session Intelligence (partial, Mnemopi + Graphify in place, needs hindsight server).


Customization

Swapping the project example: Slides 20–22 use omp-playground (18-file TypeScript project). Replace with any project that has a graphify-out/ directory — update the node count, edge count, community count, and swarm report filenames.

Adjusting benchmark claims: All values in Section 6 come directly from HARNESS-SPEC Section 23. If running on different hardware or a different codebase scale, re-run the optimization scripts and update the source table. The presentation pulls values from there.

Audience depth tuning:

  • Executive (30 min): Sections 1, 2 (overview only), 6 (benchmarks only)
  • Engineering (60 min): Full deck, pause on Sections 3–4 for Q&A
  • Workshop (90 min): Full deck + live omp-hydrate and omp-ema-tracker suggest demos

Export Formats

| Format | Use case | Notes |
|---|---|---|
| PDF (16:9) | Async sharing, archival | Export at 1920×1080; embed fonts |
| Keynote / PowerPoint | Live delivery | Keep ASCII diagrams as code blocks rendered in monospace |
| HTML slides (reveal.js) | Web publishing | Use dark theme to match godotz.ai HUD aesthetic |
| Markdown (this site) | Documentation integration | Deck sections map 1:1 to wiki sections |

Code block diagrams (the L0–L6 stack, the swarm DAG, the routing flow) are rendered in pre/code blocks and survive export to all formats without SVG dependency. Do not convert them to images unless the presentation tool cannot render monospace correctly.