The OpenClaw + OpenCode pillar makes sub-agents sound easy — "spawn a database agent to optimize queries, a frontend agent to bundle assets, and a deployment agent to handle staging — then merge the results." That's the marketing version. The practitioner version has a few more steps and a few more ways to set fire to your token budget.
This post covers the orchestration patterns we actually use across our agency, and the patterns we learned to stop using.
A sub-agent in OpenClaw is a fresh agent context spawned from a parent agent's instruction. New conversation, new system prompt, no inherited memory unless you pass it explicitly. The parent waits for results, the child works, the parent integrates.
The point isn't parallelism. It's context isolation. A sub-agent doesn't know about the parent's codebase, the parent's prior decisions, or the parent's running token bill. It knows what you give it.
That's the whole framing. Sub-agents are useful when the work has a clean interface (input → output) and the parent context is a liability (too long, too noisy, too biased toward one approach).
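One way to keep that interface honest is to write it down as data. The sketch below is ours, not an OpenClaw API: `SubAgentBrief` and `render_prompt` are hypothetical names, and the only point is that the child's prompt is built from fields you passed explicitly.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SubAgentBrief:
    """Everything a child agent is allowed to know (hypothetical structure)."""
    goal: str                # the input side of the interface
    output_contract: str     # what the parent expects back
    context: dict[str, str] = field(default_factory=dict)  # explicitly passed facts only


def render_prompt(brief: SubAgentBrief) -> str:
    """Build the child's prompt from the brief alone; nothing is inherited from the parent."""
    lines = [f"Goal: {brief.goal}", f"Return: {brief.output_contract}"]
    for key, value in sorted(brief.context.items()):
        lines.append(f"Context[{key}]: {value}")
    return "\n".join(lines)
```

If a fact isn't in the brief, the child doesn't have it. That makes missing context a visible design decision instead of a surprise during integration.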
The most common use. Parent has a multi-step task; spawns one sub-agent per step.
parent: "Migrate the user_profile table to add a deleted_at column,
         write the migration, update the ORM model, update the
         repository tests, and document the change."
├── subagent[migration]: writes the SQL migration
├── subagent[model]: updates the ORM model
├── subagent[tests]: updates repository tests for the new column
└── subagent[docs]: writes a changelog entry
parent: integrates the four outputs, commits.
Why it works: each sub-agent has a tight scope and a clean output contract. The parent doesn't burn tokens on the migration syntax once the sub-agent has written it; it just keeps the result.
Why it sometimes doesn't: the steps aren't actually independent. The migration choices affect the model choices, which in turn affect the test setup. If you run them all in parallel, you get incoherent outputs that don't compose.
Rule we follow: decompose by step only when each step's output is genuinely consumable by the next as input. If steps share unstated assumptions, run them sequentially in the parent or batch them into one larger sub-agent.
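A minimal sketch of that rule, assuming you can state which steps consume which outputs. The dependency map is a hypothetical reading of the migration example, and `plan_waves` is an illustrative helper built on Python's standard `graphlib`, not part of OpenClaw.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map for the migration example: step -> steps it consumes.
DEPENDS_ON = {
    "migration": set(),
    "model": {"migration"},
    "tests": {"model"},
    "docs": {"migration"},
}


def plan_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into waves; steps in the same wave do not depend on each other."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises CycleError if the steps depend on each other circularly
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For this map the plan is migration first, then model and docs together, then tests. If every wave has exactly one step, sub-agents are buying you isolation but not parallelism, which is the case where batching into one larger sub-agent is worth considering.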
Parent needs N variations of the same operation. Spawn N sub-agents in parallel, each on one item.
parent: "Audit each of the 14 service pages for the LocalBusiness
         schema fix from PR-127."
├── subagent[manila-web]: checks /manila-web-development
├── subagent[bulacan]: checks /bulacan-software-development
├── ... (12 more in parallel)
└── subagent[cebu]: checks /cebu-app-development
parent: collates results into a single report.
Why it works: each sub-agent's task is identical in shape but operates on disjoint files. Parallelism is real: wall-clock time drops to the slowest sub-agent's runtime, not the sum.
Why it sometimes doesn't: rate limits. 14 parallel calls to your LLM provider will hit per-minute caps fast. Throttle to your actual concurrency budget — for our agency, that's 4-6 simultaneous sub-agents on the cheap tier, 2-3 on Opus.
Rule we follow: cap concurrency at half your provider's per-minute rate divided by expected per-task duration in seconds. Above that, you're queueing in the provider rather than running parallel.
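Here is a sketch of the throttle, assuming each sub-agent run can be wrapped in an async callable. `run_one` is a hypothetical stand-in for however you invoke a sub-agent, and `concurrency_cap` reads the rule above literally, so treat its output as a starting point to tune rather than a guarantee.

```python
import asyncio
from typing import Awaitable, Callable


def concurrency_cap(per_minute_limit: int, task_seconds: float) -> int:
    """The rule above, read literally: (limit / 2) / task duration, never below 1."""
    return max(1, int((per_minute_limit / 2) / task_seconds))


async def fan_out(items: list[str],
                  run_one: Callable[[str], Awaitable[str]],
                  cap: int) -> list[str]:
    """Run one sub-agent per item, with at most `cap` in flight at a time."""
    gate = asyncio.Semaphore(cap)

    async def guarded(item: str) -> str:
        async with gate:
            return await run_one(item)

    # gather preserves input order, so results line up with items
    return await asyncio.gather(*(guarded(item) for item in items))
```

With a limit of 240 calls per minute and 30-second tasks, the helper gives a cap of 4, which lands in the 4-6 range we run on the cheap tier. The semaphore queues the remaining pages locally instead of letting the provider reject them.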
Parent generates a draft, spawns a sub-agent to review with a different model, merges feedback.
parent (Kimi K2): writes a draft migration
subagent[reviewer] (Opus): reviews the draft for foreign-key
                           issues, transactional safety, and naming consistency
parent: applies reviewer's suggestions, commits.
Why it works: different models have different blind spots. K2 will produce a fast, mostly-right migration; Opus will catch the constraint implications K2 misses. Cost stays low because the reviewer only sees the diff, not the whole context.
We use this pattern as a quality gate on anything tagged schema, migration, auth, or security. The Kimi K2 + OpenClaw routing post explains the cost math in detail.
Rule we follow: the reviewer must be a different model than the writer. Same-model review is theater — it'll confirm the writer's mistakes because it shares them.
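A sketch of the gate, assuming a generic `call_model(model_name, prompt)` callable. That signature is hypothetical; wire it to whatever client you actually use. The two things it enforces come straight from the rule: the reviewer is a different model, and it sees only the diff.

```python
import difflib
from typing import Callable

# Hypothetical signature: (model_name, prompt) -> model output text.
CallModel = Callable[[str, str], str]


def review_draft(original: str, draft: str, writer_model: str,
                 reviewer_model: str, call_model: CallModel) -> str:
    """Send only the diff of a draft to a reviewer that is not the writer's model."""
    if reviewer_model == writer_model:
        raise ValueError("reviewer must be a different model than the writer")
    diff = "".join(difflib.unified_diff(
        original.splitlines(keepends=True),
        draft.splitlines(keepends=True),
        fromfile="before", tofile="after",
    ))
    prompt = ("Review this diff for foreign-key issues, transactional safety, "
              "and naming consistency:\n" + diff)
    return call_model(reviewer_model, prompt)
```

Failing loudly on a same-model review is deliberate. It is cheaper to hit an exception in configuration than to discover later that the quality gate was reviewing its own work.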
Spawn a sub-agent at session start, keep it alive across many parent turns. The specialist accumulates context in its narrow domain; the parent stays focused on its own.
session start:
  parent: orchestrator
  subagent[db]: persistent database specialist
  subagent[deploy]: persistent deployment specialist
throughout session:
  parent ↔ subagent[db]: every DB question routes here
  parent ↔ subagent[deploy]: every deploy question routes here
Why it works: the specialists build up domain context the parent doesn't carry. After 20 turns, the DB specialist knows the schema, the indexes, the slow queries — all without bloating the parent's context window.
Why it's expensive: persistent sub-agents accumulate tokens across the session. Cap them with /openclaw subagent <name> --max-context 20000 so they prune older history.
Rule we follow: persistent specialists earn their keep when (a) the session is long, and (b) the specialist's domain is narrow enough that domain context compounds usefully. For one-off tasks, just ask in the parent.
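To make the pruning idea concrete, here is a sketch of a rolling history with a token budget. It illustrates the concept only and is not how OpenClaw implements its context cap; `SpecialistHistory` is a hypothetical class and the word-count token estimate is a crude assumption standing in for a real tokenizer.

```python
from collections import deque


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


class SpecialistHistory:
    """Rolling history for a persistent specialist, pruned oldest-first to a token budget."""

    def __init__(self, max_context: int) -> None:
        self.max_context = max_context
        self.turns: deque[str] = deque()
        self.tokens = 0

    def add(self, turn: str) -> None:
        self.turns.append(turn)
        self.tokens += estimate_tokens(turn)
        # Drop the oldest turns until we fit, but always keep the newest one.
        while self.tokens > self.max_context and len(self.turns) > 1:
            self.tokens -= estimate_tokens(self.turns.popleft())
```

Oldest-first pruning is the simplest policy. It also means early schema facts fall out first, so a long-lived specialist may need those facts restated or pinned.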
The temptation: parent spawns child A, which spawns grandchild A1, which spawns great-grandchild... You quickly lose track of which context owns what, why a particular decision was made, and where the cost lives. Debugging is painful, and the token bill compounds.
We had a brief period where we let agents spawn arbitrarily. It produced impressive demos and bad code. The deeper-than-2 cases were always one of:
We hard-limit depth to 2 now. Parent + immediate sub-agents. Anything that wants more is rewritten as a sequential plan in the parent.
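The depth limit is easy to enforce if every spawn goes through one choke point. The sketch below assumes you keep your own spawn-tree bookkeeping; `AgentNode` is a hypothetical record, not an OpenClaw type.

```python
from dataclasses import dataclass, field

MAX_DEPTH = 2  # parent (depth 1) plus immediate sub-agents (depth 2)


@dataclass
class AgentNode:
    """Hypothetical bookkeeping record for one agent in the spawn tree."""
    name: str
    depth: int = 1
    children: list["AgentNode"] = field(default_factory=list)

    def spawn(self, name: str) -> "AgentNode":
        """Create a child, refusing anything deeper than MAX_DEPTH."""
        if self.depth + 1 > MAX_DEPTH:
            raise RuntimeError(
                f"{self.name} is at depth {self.depth}; "
                "rewrite this as a sequential plan in the parent"
            )
        child = AgentNode(name=name, depth=self.depth + 1)
        self.children.append(child)
        return child
```

The tree also doubles as an audit trail: when the bill looks wrong, you can see which parent owns which child.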
Tempting use case: "my parent context is filling up, I'll spawn a sub-agent to do the next thing with a clean slate." Don't. The sub-agent doesn't have access to the prior decisions in the parent's context, so it'll make conflicting choices. You end up spending the tokens you saved on context, plus more on reconciling the conflict.
If the parent context is too full, fix that directly — summarize, prune, restart with a session-resume. Don't paper over it with a child agent.
Sub-agents are the easiest place in the OpenClaw stack to torch budget. Three rules we enforce:
--max-tokens-per-call 2000. If the sub-agent can't finish in that, the design is wrong.
--max-duration 5m. Stuck loops happen; cap them.
/opencode cost --by-subagent. If one of them is doing 80% of the spend without obviously earning it, kill that pattern.
The opencode-controller skill gives you the visibility to do this. Without that visibility, you're flying blind on multi-agent runs.
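If you log cost per call yourself, the 80% check is a few lines. This sketch assumes a list of `(subagent_name, cost)` records. That record shape is our assumption and not the output format of any OpenClaw or OpenCode command.

```python
from collections import defaultdict


def spend_by_subagent(calls: list[tuple[str, float]]) -> dict[str, float]:
    """Sum cost per sub-agent from (subagent_name, cost) records."""
    totals: dict[str, float] = defaultdict(float)
    for name, cost in calls:
        totals[name] += cost
    return dict(totals)


def dominant_spenders(totals: dict[str, float], share: float = 0.8) -> list[str]:
    """Names of sub-agents responsible for at least `share` of total spend."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    return [name for name, cost in totals.items() if cost / grand_total >= share]
```

A flagged sub-agent isn't automatically wrong; a reviewer on an expensive model may legitimately dominate spend. The check only tells you where to look first.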
If you're building autonomous workflows with OpenClaw, these patterns plus the model routing strategy are most of what you need. If you're coming from Claude Code's Task tool or one of the native swarm frameworks, the conceptual map is similar — the difference is OpenClaw makes you explicit about cost and context where the SaaS tools obscure both.
Founder & Lead Developer
With 8+ years building software from the Philippines, Jomar has served 50+ US, Australian, and UK clients. He specializes in construction SaaS, enterprise automation, and helping Western companies build high-performing Philippine development teams.