Orchestration

godotz.ai orchestrates agents through Swarm YAML files. A swarm is a declarative description of agents, their roles, model assignments, dependencies, and resource limits. The OMC runtime reads the YAML and coordinates execution without requiring agents to know about each other.


Swarm YAML Structure

# swarm.yaml
version: "1"
name: code-review-swarm

# Global defaults inherited by all agents unless overridden
defaults:
  maxConcurrency: 4
  timeoutSeconds: 300
  model: omp/worker
  isolation: rcopy

agents:
  - id: planner
    role: orchestrator
    model: omp/orchestrator        # routes to claude-opus-4-6
    prompt: prompts/planner.md
    outputs:
      - plan.json

  - id: implementer
    role: worker
    model: omp/worker              # routes to glm-5.1 / glm-4.7
    prompt: prompts/implementer.md
    deps: [planner]
    inputs:
      - plan.json
    maxConcurrency: 3              # up to 3 parallel implementer instances

  - id: critic
    role: critic
    model: omp/critic              # routes to claude-sonnet-4-6
    prompt: prompts/critic.md
    deps: [implementer]
    inputs:
      - "implementer.outputs.*"

Agent Roles

| Role | Typical model | Responsibility |
|---|---|---|
| orchestrator | claude-opus-4-6 | Decomposes the goal, assigns work, synthesizes results |
| worker | glm-5.1, glm-4.7 | Executes assigned tasks: code, search, transform |
| critic | claude-sonnet-4-6 | Evaluates worker outputs, flags errors, rates quality |
| arbiter | claude-opus-4-6 | Resolves disagreements between actors and critics |
| verifier | claude-haiku-4-5 | Lightweight post-hoc check (format, schema, lint) |

The orchestrator is the only role that can create new tasks. Workers and critics only respond to assigned inputs.


z.ai vs Antigravity Model Separation

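The anti-echo-chamber rule is enforced at the swarm level: a critic must not use a model from the same family as the actor it critiques. Different versions within one family (for example glm-5.1 and glm-4.7) still count as the same family.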

# CORRECT — different model families
agents:
  - id: actor
    model: omp/worker     # z.ai / GLM family
  - id: critic
    model: omp/critic     # Antigravity / Anthropic family
    deps: [actor]

# WRONG — both map to the same family (gateway rejects this)
agents:
  - id: actor
    model: glm-5.1
  - id: critic
    model: glm-4.7        # same family as actor → rejected
    deps: [actor]

The gateway validates model family assignments on swarm load. A same-family actor-critic pair causes the swarm to fail at startup with:

SwarmValidationError: critic 'critic' and actor 'actor' share model family 'zhipuai'.
Use a different family for the critic role to prevent echo-chamber bias.

Use the routing tags (omp/worker, omp/critic, etc.) rather than hardcoded model names. This lets the gateway enforce the separation automatically and apply fallback logic without swarm YAML changes.
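The family check described above could look roughly like the following. This is a sketch of the idea, not the gateway's code: the `MODEL_FAMILY` table, the family name `anthropic`, the explicit `role` fields, and the function name are all assumptions made for illustration. Only the `zhipuai` family name and the shape of the error message come from the output shown above.

```python
# Assumed model-to-family table; the gateway's real mapping is not shown here.
MODEL_FAMILY = {
    "glm-5.1": "zhipuai",
    "glm-4.7": "zhipuai",
    "claude-opus-4-6": "anthropic",
    "claude-sonnet-4-6": "anthropic",
    "claude-haiku-4-5": "anthropic",
}


class SwarmValidationError(Exception):
    """Raised when a critic shares a model family with an actor it depends on."""


def check_actor_critic_families(agents):
    """Reject any critic whose model family matches one of its deps."""
    by_id = {a["id"]: a for a in agents}
    for agent in agents:
        if agent.get("role") != "critic":
            continue
        critic_family = MODEL_FAMILY[agent["model"]]
        for dep_id in agent.get("deps", []):
            if MODEL_FAMILY[by_id[dep_id]["model"]] == critic_family:
                raise SwarmValidationError(
                    f"critic '{agent['id']}' and actor '{dep_id}' "
                    f"share model family '{critic_family}'."
                )


good = [
    {"id": "actor", "role": "worker", "model": "glm-5.1"},
    {"id": "critic", "role": "critic", "model": "claude-sonnet-4-6", "deps": ["actor"]},
]
bad = [
    {"id": "actor", "role": "worker", "model": "glm-5.1"},
    {"id": "critic", "role": "critic", "model": "glm-4.7", "deps": ["actor"]},
]

check_actor_critic_families(good)  # passes silently
try:
    check_actor_critic_families(bad)
except SwarmValidationError as err:
    print(err)
```

The sketch works on concrete model names. With routing tags, the gateway would first have to resolve each tag to a model before a check like this could run.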


Parallel Execution Patterns

Fan-Out / Fan-In

Run multiple workers in parallel on independent subtasks, then collect results in a single aggregator:

agents:
  - id: planner
    role: orchestrator
    outputs: [subtasks.json]

  - id: worker-a
    role: worker
    deps: [planner]
    inputs: [subtasks.json]
    slice: 0    # process subtasks[0::3]

  - id: worker-b
    role: worker
    deps: [planner]
    inputs: [subtasks.json]
    slice: 1    # process subtasks[1::3]

  - id: worker-c
    role: worker
    deps: [planner]
    inputs: [subtasks.json]
    slice: 2    # process subtasks[2::3]

  - id: aggregator
    role: orchestrator
    deps: [worker-a, worker-b, worker-c]
    inputs: ["worker-*.outputs.*"]
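The slice comments in the YAML above use Python-style stride notation. A short sketch shows how three workers would divide a list under that reading. The subtask names and the `slice_for` helper are made up for the example, and whether OMC interprets slice exactly this way is an assumption based on the comments.

```python
WORKERS = 3  # worker-a, worker-b, worker-c

# Hypothetical contents of subtasks.json.
subtasks = [f"task-{i}" for i in range(7)]


def slice_for(items, slice_index, workers=WORKERS):
    """Return every `workers`-th item, starting at `slice_index`."""
    return items[slice_index::workers]


for index in range(WORKERS):
    print(index, slice_for(subtasks, index))
# 0 ['task-0', 'task-3', 'task-6']
# 1 ['task-1', 'task-4']
# 2 ['task-2', 'task-5']
```

Every subtask lands in exactly one slice, so the aggregator sees each result once.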

DAG with Shared Intermediate State

agents:
  - id: fetch
    role: worker
    outputs: [raw.json]

  - id: parse
    role: worker
    deps: [fetch]
    inputs: [raw.json]
    outputs: [parsed.json]

  - id: analyze
    role: worker
    deps: [parse]
    inputs: [parsed.json]
    outputs: [analysis.json]

  - id: summarize
    role: worker
    deps: [parse]          # also depends on parse, not analyze
    inputs: [parsed.json]
    outputs: [summary.md]

  - id: report
    role: orchestrator
    deps: [analyze, summarize]   # waits for BOTH branches
    inputs: [analysis.json, summary.md]

analyze and summarize run in parallel after parse completes.
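To see why analyze and summarize can overlap, it helps to group the agents into "waves" whose dependencies are all satisfied. The sketch below does this with Python's standard-library graphlib. It only models the deps fields from the YAML above and says nothing about how OMC schedules internally. `execution_waves` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Agent id -> set of agent ids it depends on (copied from the deps fields above).
DEPS = {
    "fetch": set(),
    "parse": {"fetch"},
    "analyze": {"parse"},
    "summarize": {"parse"},
    "report": {"analyze", "summarize"},
}


def execution_waves(deps):
    """Group agents into waves; agents in one wave have no unmet deps."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if deps contain a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


print(execution_waves(DEPS))
# [['fetch'], ['parse'], ['analyze', 'summarize'], ['report']]
```

Agents that share a wave have no ordering constraint between them, which is what allows the two branches to run side by side.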

Recursive Swarm (Swarm-of-Swarms)

A swarm agent can invoke a child swarm as its task:

agents:
  - id: dispatcher
    role: orchestrator
    cmd: "omc swarm run child-swarm.yaml --input {{outputs.plan}}"
    deps: [planner]

The parent swarm waits for the child swarm’s terminal state before proceeding.


Agent Isolation with rcopy

By default, each agent runs with isolation: rcopy, which creates a read-only copy of the workspace for that agent:

workspace/
├── agent-planner/       ← read-only copy (rcopy)
│   └── [workspace snapshot]
├── agent-worker-0/      ← read-only copy (rcopy)
│   └── [workspace snapshot]
└── agent-critic/        ← read-only copy (rcopy)
    └── [workspace snapshot]

Properties:

  • Agents cannot see each other’s in-progress file edits
  • An agent crashing does not corrupt the shared workspace
  • Output files are explicitly declared and merged after the agent completes

Isolation modes:

| Mode | Description | Use when |
|---|---|---|
| rcopy | Read-only snapshot per agent | Default; most tasks |
| shared | All agents share the workspace | Agents must coordinate on files in real time |
| none | No isolation, direct workspace access | Single-agent swarms, trusted tools only |

# Per-agent isolation override
agents:
  - id: git-operations
    isolation: shared    # needs live workspace for git commands
  - id: analyzer
    isolation: rcopy     # default, reads a snapshot
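The snapshot-then-merge idea behind rcopy can be illustrated with a small Python sketch. It is not how rcopy is implemented. It assumes the agent writes its results into its own private copy and that only declared outputs are copied back afterwards. `run_isolated` and the file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def run_isolated(workspace, declared_outputs, agent_fn):
    """Run agent_fn against a private copy, then merge only declared outputs."""
    with tempfile.TemporaryDirectory() as tmp:
        snapshot = Path(tmp) / "snapshot"
        shutil.copytree(workspace, snapshot)  # private copy for this agent
        agent_fn(snapshot)                    # a crash here leaves workspace untouched
        for rel in declared_outputs:
            src = snapshot / rel
            if src.exists():
                dest = Path(workspace) / rel
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dest)


def demo_agent(snapshot):
    """Toy agent: writes one declared output and one undeclared scratch file."""
    text = (snapshot / "input.txt").read_text()
    (snapshot / "out.txt").write_text(text.upper())
    (snapshot / "scratch.tmp").write_text("not declared, never merged")


with tempfile.TemporaryDirectory() as ws:
    workspace = Path(ws)
    (workspace / "input.txt").write_text("hello")
    run_isolated(workspace, ["out.txt"], demo_agent)
    print(sorted(p.name for p in workspace.iterdir()))
    # ['input.txt', 'out.txt']
```

The undeclared scratch file never reaches the shared workspace, which mirrors the "explicitly declared and merged" property listed above.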

maxConcurrency

maxConcurrency limits how many instances of a given agent run simultaneously. It applies when the orchestrator creates multiple work items for the same agent template:

agents:
  - id: file-processor
    role: worker
    maxConcurrency: 5    # at most 5 file-processor instances run at once

The orchestrator queues additional work items until a slot opens. This prevents runaway parallelism from exhausting the model gateway’s concurrency limits.

Setting maxConcurrency globally:

defaults:
  maxConcurrency: 4   # applies to all agents unless overridden

Interaction with gateway limits: if maxConcurrency is set to 10 but the target model is limited to rpm: 5, the gateway queues the overflow. Lower maxConcurrency to avoid queueing delays.
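The queueing behaviour of maxConcurrency resembles a semaphore: work items beyond the limit wait until a slot frees up. The following asyncio sketch shows that pattern under the assumption that each work item is an independent coroutine. It is an analogy, not the OMC scheduler, and `run_with_limit` and `fake_file_processor` are hypothetical names.

```python
import asyncio


async def run_with_limit(items, max_concurrency, work):
    """Run work(item) for every item, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)
    running = 0
    peak = 0  # highest number of simultaneously running items observed

    async def run_one(item):
        nonlocal running, peak
        async with semaphore:  # extra items wait here until a slot opens
            running += 1
            peak = max(peak, running)
            try:
                return await work(item)
            finally:
                running -= 1

    results = await asyncio.gather(*(run_one(i) for i in items))
    return results, peak


async def fake_file_processor(name):
    """Stand-in for a model call; sleeps briefly instead of doing real work."""
    await asyncio.sleep(0.01)
    return name.upper()


files = [f"file-{i}" for i in range(12)]
results, peak = asyncio.run(run_with_limit(files, 5, fake_file_processor))
print(len(results), peak <= 5)  # 12 True
```

All twelve items complete, but no more than five are ever in flight at once.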


Environment and Secrets

Swarm agents inherit environment variables from the host, plus any declared in the swarm:

env:
  OMP_GATEWAY_URL: "http://gateway:4000"
  OMP_LANGFUSE_HOST: "http://langfuse:3000"

secrets:
  - ANTHROPIC_API_KEY    # injected from host env; never written to files
  - ZHIPUAI_API_KEY

Secrets are injected at agent startup and not stored in the Dolt task graph or Langfuse traces.
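One way to picture the split between env and secrets is as two steps: build the agent's environment, then redact secret values before anything is logged or traced. The sketch below assumes that swarm-declared env values override host values and that secrets must already exist in the host environment. Neither detail is confirmed above, and both helper names are hypothetical.

```python
def build_agent_env(host_env, swarm_env, secret_names):
    """Merge host and swarm env; fail early if a declared secret is missing."""
    missing = [name for name in secret_names if name not in host_env]
    if missing:
        raise KeyError(f"secrets not set in host environment: {missing}")
    # Assumption: swarm-level env wins over host values on conflict.
    return {**host_env, **swarm_env}


def redact(env, secret_names):
    """Return a copy that is safe to log: secret values are masked."""
    return {k: ("***" if k in secret_names else v) for k, v in env.items()}


# Fake host environment for the example; real values would come from os.environ.
host_env = {"ANTHROPIC_API_KEY": "sk-example", "ZHIPUAI_API_KEY": "zp-example", "HOME": "/root"}
swarm_env = {"OMP_GATEWAY_URL": "http://gateway:4000"}
secrets = ["ANTHROPIC_API_KEY", "ZHIPUAI_API_KEY"]

agent_env = build_agent_env(host_env, swarm_env, secrets)
print(redact(agent_env, secrets)["ANTHROPIC_API_KEY"])  # ***
print(redact(agent_env, secrets)["OMP_GATEWAY_URL"])    # http://gateway:4000
```

The agent process receives the real values, while anything written to a log or trace would see only the masked copy.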


Running a Swarm

# Validate YAML without executing
omc swarm validate swarm.yaml

# Dry-run: shows execution plan, no API calls
omc swarm run swarm.yaml --dry-run

# Execute
omc swarm run swarm.yaml

# Execute with input override
omc swarm run swarm.yaml --input '{"goal": "implement password reset"}'

# Watch live status
omc swarm status

# Cancel
omc swarm cancel

Swarm Output

When a swarm completes, outputs from terminal agents are collected into .omc/swarm-outputs/<run-id>/:

.omc/swarm-outputs/run-2026-06-07-1432/
├── planner/plan.json
├── implementer/impl.patch
├── critic/review.md
└── swarm.summary.json     # overall status, timings, costs

swarm.summary.json includes per-agent token usage and cost, which is also forwarded to Langfuse.