Model Routing
Not every task needs your strongest model. This is the single biggest cost lever.
Route by Task, Not by Default
| Task Type | Model Tier | Examples |
|---|---|---|
| Heartbeats, monitoring, routine checks | Cheap (Haiku, Gemini Flash) | Health pings, calendar scans, email checks |
| Standard work, conversations, tool use | Mid-tier (Sonnet, GPT-5.4) | Chat, research, file operations |
| Complex reasoning, architecture, planning | Premium (Opus) | Design docs, multi-step analysis, code review |
Set your default model to the cheapest one that reliably follows instructions. Override to stronger models per-agent or per-cron where reasoning quality actually matters. Operators have seen prompt costs drop from 20-40k tokens to ~1.5k per request by routing smarter.
Configuration
jsonc
{
"agents": {
"defaults": {
"model": {
"primary": "openai-codex/gpt-5.4",
"fallbacks": ["anthropic/claude-sonnet-4-6"]
}
},
"list": [
{
"id": "main",
"model": { "primary": "anthropic/claude-opus-4-6" }
}
]
}
}The main (interactive) agent gets the premium model. Everything else starts cheap and gets upgraded only where quality suffers.
For model behavioral differences (tool compliance, writing quality, failure modes), see Model Behavioral Differences.
For advanced routing patterns (scout/dispatch, thinking parameters, fallback chains), see Model Optimization.