Orchestration
godotz.ai orchestrates agents through Swarm YAML files. A swarm is a declarative description of agents, their roles, model assignments, dependencies, and resource limits. The OMC runtime reads the YAML and coordinates execution without requiring agents to know about each other.
Swarm YAML Structure
# swarm.yaml
version: "1"
name: code-review-swarm
# Global defaults inherited by all agents unless overridden
defaults:
  maxConcurrency: 4
  timeoutSeconds: 300
  model: omp/worker
  isolation: rcopy
agents:
  - id: planner
    role: orchestrator
    model: omp/orchestrator   # routes to claude-opus-4-6
    prompt: prompts/planner.md
    outputs:
      - plan.json
  - id: implementer
    role: worker
    model: omp/worker         # routes to glm-5.1 / glm-4.7
    prompt: prompts/implementer.md
    deps: [planner]
    inputs:
      - plan.json
    maxConcurrency: 3         # up to 3 parallel implementer instances
  - id: critic
    role: critic
    model: omp/critic         # routes to claude-sonnet-4-6
    prompt: prompts/critic.md
    deps: [implementer]
    inputs:
      - "implementer.outputs.*"
Agent Roles
| Role | Typical model | Responsibility |
|---|---|---|
| orchestrator | claude-opus-4-6 | Decomposes the goal, assigns work, synthesizes results |
| worker | glm-5.1, glm-4.7 | Executes assigned tasks: code, search, transform |
| critic | claude-sonnet-4-6 | Evaluates worker outputs, flags errors, rates quality |
| arbiter | claude-opus-4-6 | Resolves disagreements between actors and critics |
| verifier | claude-haiku-4-5 | Lightweight post-hoc check (format, schema, lint) |
The orchestrator is the only role that can create new tasks. Workers and critics only respond to assigned inputs.
z.ai vs Antigravity Model Separation
The anti-echo-chamber rule is enforced at the swarm level: a critic must not belong to the same model family as the actor it critiques.
# CORRECT: different model families
agents:
  - id: actor
    model: omp/worker   # z.ai / GLM family
  - id: critic
    model: omp/critic   # Antigravity / Anthropic family
    deps: [actor]
# WRONG: both map to the same family (gateway rejects this)
agents:
  - id: actor
    model: glm-5.1
  - id: critic
    model: glm-4.7      # same family as actor → rejected
    deps: [actor]
The gateway validates model family assignments on swarm load. A same-family actor-critic pair causes the swarm to fail at startup with:
SwarmValidationError: critic 'critic' and actor 'actor' share model family 'zhipuai'.
Use a different family for the critic role to prevent echo-chamber bias.
Use the routing tags (omp/worker, omp/critic, etc.) rather than hardcoded model names. This lets the gateway enforce the separation automatically and apply fallback logic without swarm YAML changes.
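The family check described above can be sketched in a few lines of Python. This is an illustration only, not the gateway's implementation: the `MODEL_FAMILY` table, the `anthropic` family name, and the `validate_families` helper are assumptions made for the example (only the `zhipuai` family name and the error wording come from the text above), and the sketch identifies critics by an explicit `role: critic` field.

```python
# Hypothetical model/tag -> family table; the real mapping lives in the gateway.
MODEL_FAMILY = {
    "glm-5.1": "zhipuai",
    "glm-4.7": "zhipuai",
    "omp/worker": "zhipuai",
    "omp/critic": "anthropic",  # family name assumed for illustration
}


class SwarmValidationError(Exception):
    """Raised when a critic shares a model family with an actor it depends on."""


def validate_families(agents):
    """Reject any critic whose deps include an agent from the same family."""
    by_id = {agent["id"]: agent for agent in agents}
    for agent in agents:
        if agent.get("role") != "critic":
            continue
        critic_family = MODEL_FAMILY[agent["model"]]
        for dep_id in agent.get("deps", []):
            if MODEL_FAMILY[by_id[dep_id]["model"]] == critic_family:
                raise SwarmValidationError(
                    f"critic '{agent['id']}' and actor '{dep_id}' "
                    f"share model family '{critic_family}'."
                )


good_swarm = [
    {"id": "actor", "role": "worker", "model": "omp/worker"},
    {"id": "critic", "role": "critic", "model": "omp/critic", "deps": ["actor"]},
]
bad_swarm = [
    {"id": "actor", "role": "worker", "model": "glm-5.1"},
    {"id": "critic", "role": "critic", "model": "glm-4.7", "deps": ["actor"]},
]

validate_families(good_swarm)  # passes silently
try:
    validate_families(bad_swarm)
    bad_error = None
except SwarmValidationError as exc:
    bad_error = str(exc)
```

Because the check compares families rather than model names, two different GLM models still collide, which is why the second swarm is rejected.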
Parallel Execution Patterns
Fan-Out / Fan-In
Run multiple workers in parallel on independent subtasks, then collect results in a single aggregator:
agents:
  - id: planner
    role: orchestrator
    outputs: [subtasks.json]
  - id: worker-a
    role: worker
    deps: [planner]
    inputs: [subtasks.json]
    slice: 0   # process subtasks[0::3]
  - id: worker-b
    role: worker
    deps: [planner]
    inputs: [subtasks.json]
    slice: 1   # process subtasks[1::3]
  - id: worker-c
    role: worker
    deps: [planner]
    inputs: [subtasks.json]
    slice: 2   # process subtasks[2::3]
  - id: aggregator
    role: orchestrator
    deps: [worker-a, worker-b, worker-c]
    inputs: ["worker-*.outputs.*"]
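The `subtasks[i::3]` notation in the comments is a strided slice: worker `i` takes every third subtask starting at index `i`. A minimal Python sketch of that partitioning follows; the `slice_subtasks` helper and the sample task names are hypothetical and only illustrate the arithmetic, not how the runtime reads `subtasks.json`.

```python
def slice_subtasks(subtasks, slice_index, worker_count):
    """Return the subtasks handled by the worker with the given slice index."""
    return subtasks[slice_index::worker_count]


subtasks = ["t0", "t1", "t2", "t3", "t4", "t5", "t6"]
assignments = {i: slice_subtasks(subtasks, i, 3) for i in range(3)}

# Every subtask lands in exactly one slice, so the aggregator sees each once.
covered = sorted(task for chunk in assignments.values() for task in chunk)
```

With seven subtasks, slice 0 receives three items and the other two receive two each, so the load stays balanced to within one item.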
DAG with Shared Intermediate State
agents:
  - id: fetch
    role: worker
    outputs: [raw.json]
  - id: parse
    role: worker
    deps: [fetch]
    inputs: [raw.json]
    outputs: [parsed.json]
  - id: analyze
    role: worker
    deps: [parse]
    inputs: [parsed.json]
    outputs: [analysis.json]
  - id: summarize
    role: worker
    deps: [parse]                # also depends on parse, not analyze
    inputs: [parsed.json]
    outputs: [summary.md]
  - id: report
    role: orchestrator
    deps: [analyze, summarize]   # waits for BOTH branches
    inputs: [analysis.json, summary.md]
analyze and summarize run in parallel after parse completes.
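To see why those two agents can overlap, the `deps` lists can be grouped into "waves" of agents whose dependencies are all satisfied. The sketch below uses Python's standard `graphlib.TopologicalSorter`; it is one way to derive such an ordering and is not a description of the OMC runtime's scheduler. The `DEPS` mapping is copied by hand from the YAML above.

```python
from graphlib import TopologicalSorter

# agent id -> ids it waits for, transcribed from the DAG example
DEPS = {
    "fetch": [],
    "parse": ["fetch"],
    "analyze": ["parse"],
    "summarize": ["parse"],
    "report": ["analyze", "summarize"],
}


def execution_waves(deps):
    """Group agents into waves; every agent in a wave has all deps finished."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular deps
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished
    return waves


waves = execution_waves(DEPS)
```

Here `analyze` and `summarize` fall into the same wave, and `report` only appears once both have been marked done.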
Recursive Swarm (Swarm-of-Swarms)
A swarm agent can invoke a child swarm as its task:
agents:
  - id: dispatcher
    role: orchestrator
    cmd: "omc swarm run child-swarm.yaml --input {{outputs.plan}}"
    deps: [planner]
The parent swarm waits for the child swarm’s terminal state before proceeding.
Agent Isolation with rcopy
By default, each agent runs with isolation: rcopy, which creates a read-only copy of the workspace for that agent:
workspace/
├── agent-planner/      ← read-only copy (rcopy)
│   └── [workspace snapshot]
├── agent-worker-0/     ← read-only copy (rcopy)
│   └── [workspace snapshot]
└── agent-critic/       ← read-only copy (rcopy)
    └── [workspace snapshot]
Properties:
- Agents cannot see each other’s in-progress file edits
- An agent crashing does not corrupt the shared workspace
- Output files are explicitly declared and merged after the agent completes
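The snapshot-then-merge behaviour in these properties can be illustrated with a small Python sketch. It is not how rcopy is implemented: the `run_isolated` helper is hypothetical, and it assumes the agent writes its results into its private copy, from which only the declared output files are copied back.

```python
import shutil
import tempfile
from pathlib import Path


def run_isolated(workspace, agent_id, outputs, task):
    """Run task(snapshot) on a private copy, then merge declared outputs."""
    with tempfile.TemporaryDirectory(prefix=f"agent-{agent_id}-") as tmp:
        snapshot = Path(tmp) / "snapshot"
        shutil.copytree(workspace, snapshot)
        task(snapshot)  # a crash here leaves the shared workspace untouched
        for name in outputs:
            produced = snapshot / name
            if produced.exists():
                shutil.copy2(produced, workspace / name)


with tempfile.TemporaryDirectory() as root:
    workspace = Path(root)
    (workspace / "input.txt").write_text("hello")

    def planner(snapshot):
        (snapshot / "plan.json").write_text('{"steps": []}')
        (snapshot / "scratch.tmp").write_text("undeclared scratch file")

    run_isolated(workspace, "planner", ["plan.json"], planner)
    merged = sorted(path.name for path in workspace.iterdir())
```

Only `plan.json` reaches the shared workspace; the undeclared `scratch.tmp` disappears with the snapshot.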
Isolation modes:
| Mode | Description | Use when |
|---|---|---|
| rcopy | Read-only snapshot per agent | Default; most tasks |
| shared | All agents share the workspace | Agents must coordinate on files in real time |
| none | No isolation, direct workspace access | Single-agent swarms, trusted tools only |
# Per-agent isolation override
agents:
  - id: git-operations
    isolation: shared   # needs live workspace for git commands
  - id: analyzer
    isolation: rcopy    # default, reads a snapshot
maxConcurrency
maxConcurrency limits how many instances of a given agent run simultaneously. It applies when the orchestrator creates multiple work items for the same agent template:
agents:
  - id: file-processor
    role: worker
    maxConcurrency: 5   # at most 5 file-processor instances run at once
The orchestrator queues additional work items until a slot opens. This prevents runaway parallelism from exhausting the model gateway’s concurrency limits.
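The queue-until-a-slot-opens behaviour resembles a counting semaphore. The following Python sketch shows the general pattern with `asyncio.Semaphore`; it is an analogy, not the orchestrator's code, and `run_with_limit` and `fake_file_processor` are hypothetical names.

```python
import asyncio


async def run_with_limit(items, max_concurrency, worker):
    """Run worker(item) for every item, at most max_concurrency at a time."""
    slots = asyncio.Semaphore(max_concurrency)

    async def guarded(item):
        async with slots:  # waits here until a slot opens
            return await worker(item)

    return await asyncio.gather(*(guarded(item) for item in items))


async def demo():
    running = 0
    peak = 0

    async def fake_file_processor(name):
        nonlocal running, peak
        running += 1
        peak = max(peak, running)  # record the highest observed parallelism
        await asyncio.sleep(0.01)  # stand-in for a model call
        running -= 1
        return name.upper()

    files = [f"file{i}" for i in range(12)]
    results = await run_with_limit(files, 5, fake_file_processor)
    return results, peak


results, peak = asyncio.run(demo())
```

Twelve work items are submitted at once, but the recorded peak never exceeds the limit of five; the remaining items wait inside `guarded` until a slot frees up.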
Setting maxConcurrency globally:
defaults:
  maxConcurrency: 4   # applies to all agents unless overridden
Interaction with gateway limits: if maxConcurrency is 10 but the target model is capped at rpm: 5, the gateway queues the overflow. Lower maxConcurrency to avoid queueing delays.
Environment and Secrets
Swarm agents inherit environment variables from the host, plus any declared in the swarm:
env:
  OMP_GATEWAY_URL: "http://gateway:4000"
  OMP_LANGFUSE_HOST: "http://langfuse:3000"
secrets:
  - ANTHROPIC_API_KEY   # injected from host env; never written to files
  - ZHIPUAI_API_KEY
Secrets are injected at agent startup and not stored in the Dolt task graph or Langfuse traces.
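A rough Python sketch of how such an environment might be assembled and kept out of logs is shown below. Everything in it is an assumption for illustration: the `build_agent_env` and `redact` helpers are hypothetical, the precedence of swarm-declared values over host values is a guess, and the sample key values are placeholders.

```python
def build_agent_env(host_env, declared_env, secret_names):
    """Merge host and swarm-declared variables after checking secrets exist."""
    missing = [name for name in secret_names if name not in host_env]
    if missing:
        raise KeyError(f"secrets missing from host env: {missing}")
    env = dict(host_env)      # inherited from the host
    env.update(declared_env)  # assumption: swarm-declared values win
    return env


def redact(env, secret_names):
    """Return a copy of env that is safe to log: secret values are masked."""
    return {k: "***" if k in secret_names else v for k, v in env.items()}


host = {
    "PATH": "/usr/bin",
    "ANTHROPIC_API_KEY": "sk-example",  # placeholder value
    "ZHIPUAI_API_KEY": "zp-example",    # placeholder value
}
declared = {"OMP_GATEWAY_URL": "http://gateway:4000"}
secrets = ["ANTHROPIC_API_KEY", "ZHIPUAI_API_KEY"]

agent_env = build_agent_env(host, declared, secrets)
loggable = redact(agent_env, secrets)
```

The agent process receives `agent_env` with real values, while anything written to a trace or task record would use the masked `loggable` copy.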
Running a Swarm
# Validate YAML without executing
omc swarm validate swarm.yaml
# Dry-run: shows execution plan, no API calls
omc swarm run swarm.yaml --dry-run
# Execute
omc swarm run swarm.yaml
# Execute with input override
omc swarm run swarm.yaml --input '{"goal": "implement password reset"}'
# Watch live status
omc swarm status
# Cancel
omc swarm cancel
Swarm Output
When a swarm completes, outputs from terminal agents are collected into .omc/swarm-outputs/<run-id>/:
.omc/swarm-outputs/run-2026-06-07-1432/
├── planner/plan.json
├── implementer/impl.patch
├── critic/review.md
└── swarm.summary.json # overall status, timings, costs
swarm.summary.json includes per-agent token usage and cost, which is also forwarded to Langfuse.
Related
- Architecture Overview: swarms operate at L0 using L1 (gateway)
- Model Gateway: model routing tags and concurrency limits
- Task Graph: bd integration within swarm agents
- Self-Evolution: swarms that modify their own configuration