Multi-Tier Autonomy
An agent without an autonomy model either asks permission for everything (useless) or acts without oversight on everything (dangerous). A tiered system defines what the agent can do alone, what it should do and report, and what requires explicit approval — giving the agent real agency within bounded risk.
The Three-Tier Model
The simplest effective autonomy model uses three tiers:
| Tier | Policy | Examples |
|---|---|---|
| Tier 1 — Just do it | Agent acts without asking or reporting | Read files, search the web, check calendars, organize workspace, routine monitoring |
| Tier 2 — Do it, then tell | Agent acts autonomously, reports to operator afterward | Execute within policy caps, adjust cron schedules, update documentation, run bounded experiments |
| Tier 3 — Ask first | Agent proposes action, waits for approval | Actions above policy caps, new/unfamiliar integrations, irreversible changes, anything uncertain |
The key rule: when in doubt, bump up a tier. The costs are asymmetric: an agent that asks too often is merely annoying, while an agent that acts without permission on the wrong thing destroys trust.
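The "bump up" rule can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the `Tier` enum, the `effective_tier` function, and the `in_doubt` flag are hypothetical names, and how an agent decides it is "in doubt" is left to the caller.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Autonomy tiers, ordered so that a higher value means more oversight."""
    JUST_DO_IT = 1
    DO_THEN_TELL = 2
    ASK_FIRST = 3


def effective_tier(nominal: Tier, in_doubt: bool) -> Tier:
    """Apply the 'when in doubt, bump up a tier' rule, capped at ASK_FIRST."""
    if not in_doubt:
        return nominal
    # Move one tier up, but never past the most restrictive tier.
    return Tier(min(nominal + 1, Tier.ASK_FIRST))
```

Ordering the tiers numerically keeps the rule a one-line comparison, and the cap means an action that is already "ask first" stays there.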
Tier 1 — Just Do It
Low-risk, easily reversible actions the agent can take freely, without notification:
- Reading files, exploring the workspace, running searches
- Memory maintenance (daily notes, MEMORY.md updates)
- Checking email, calendar, API endpoints
- Organizing files, cleaning up workspace
- Routine heartbeat/monitoring tasks
Tier 2 — Do It, Tell the Operator
Actions within established policy bounds. The agent executes, then reports what it did:
- Running operations within pre-set caps (budget limits, rate limits)
- Creating issues, opening PRs, merging code that passes quality gates
- Adjusting cron schedules or configuration within existing patterns
- Making decisions within a delegated domain (the agent owns this area)
- Responding to routine situations using established playbooks
The operator reviews reports and course-corrects. This is where trust builds — successful Tier 2 actions expand the agent's envelope over time.
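One way to make "do it, then tell" concrete is a small wrapper that runs the action immediately and queues a report either way. The sketch below assumes reports are plain strings collected in memory; `Tier2Runner` and its report format are hypothetical, and a real setup would deliver the reports through whatever channel the operator actually reads.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Tier2Runner:
    """Runs an action immediately and queues a report for the operator."""
    reports: list[str] = field(default_factory=list)

    def run(self, description: str, action: Callable[[], str]) -> str:
        try:
            outcome = action()
        except Exception as exc:
            # Failures are reported too, then re-raised for the caller.
            self.reports.append(f"FAILED {description}: {exc}")
            raise
        self.reports.append(f"DONE {description}: {outcome}")
        return outcome
```

Reporting failures as well as successes matters here, because the operator can only course-correct on what the agent actually discloses.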
Tier 3 — Ask First
Actions that exceed established bounds or carry meaningful risk:
- Spending above per-action or per-day policy caps
- Interacting with new/unfamiliar external services or contracts
- Irreversible actions (deleting data, sending public communications)
- Major strategy changes or entering new domains
- Anything the agent is genuinely uncertain about
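The Tier 3 criteria above can be expressed as a check that returns the reasons an action needs approval. This is a sketch under assumptions: the `Action` fields, the cap parameters, and the reason strings are all illustrative, and the flags would be set by whatever logic the agent uses to describe its own proposed action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    cost: float = 0.0               # spend this action would incur
    reversible: bool = True
    familiar_service: bool = True   # False for new external services
    strategy_change: bool = False
    uncertain: bool = False         # the agent's own doubt


def approval_reasons(action: Action, spent_today: float,
                     per_action_cap: float, per_day_cap: float) -> list[str]:
    """Return why the action is Tier 3; an empty list means no approval needed."""
    reasons = []
    if action.cost > per_action_cap:
        reasons.append("exceeds per-action cap")
    if spent_today + action.cost > per_day_cap:
        reasons.append("exceeds per-day cap")
    if not action.familiar_service:
        reasons.append("unfamiliar external service")
    if not action.reversible:
        reasons.append("irreversible")
    if action.strategy_change:
        reasons.append("major strategy change")
    if action.uncertain:
        reasons.append("agent is uncertain")
    return reasons
```

Returning reasons rather than a bare boolean lets the agent include them in its approval request, so the operator sees why it is being asked.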
The Four-Tier Model (Supervisory)
When agents supervise other agents, a fourth tier separates what a subordinate agent escalates to its supervisor from what requires the operator:
| Tier | Policy | Scope |
|---|---|---|
| Tier 1 — Autonomous | Act freely, no notification | Routine operations within established patterns |
| Tier 2 — Act & Report | Execute within bounds, report afterward | Domain-specific actions within policy caps |
| Tier 3 — Escalate to supervisor | Propose to the supervising agent | Cross-domain decisions, novel situations, budget allocation between tasks |
| Tier 4 — Escalate to operator | Requires human approval | Major strategy changes, public communications, new external integrations, anything above supervisor's authority |
Why Four Tiers for Multi-Agent
In a single-agent setup, Tiers 1-3 are sufficient — the agent either acts alone or asks the human. But when Agent A supervises Agent B:
- Agent B handling its own domain tasks → Tiers 1-2 (B acts autonomously within its role)
- Agent B needing cross-domain coordination → Tier 3 (B escalates to Agent A, the supervisor)
- Agent A facing something beyond its own authority → Tier 4 (A escalates to the operator)
Without Tier 3 (supervisor escalation), subordinate agents either over-escalate to the human (bypassing the supervisor) or make cross-domain decisions they shouldn't.
Operator (human)
↑ Tier 4 escalation
Supervisor Agent (e.g., orchestrator)
↑ Tier 3 escalation
Worker Agent (e.g., coder, researcher)
└─ Tiers 1-2: autonomous within role
Example: Product Building Setup
A supervisor agent coordinates multiple specialist roles:
| Agent | Tier 1 (just do it) | Tier 2 (do & report) | Tier 3 (ask supervisor) | Tier 4 (ask operator) |
|---|---|---|---|---|
| Coder | Read code, run tests, lint | Implement issues, open PRs, fix CI | Architecture changes, new dependencies | — |
| Reviewer | Review PRs, check quality | Approve/request changes | Spec divergence, blocked PRs | — |
| Supervisor | Triage issues, dispatch roles | Merge PRs, update roadmap | — | Major spec changes, public comms |
The supervisor handles Tier 3 escalations from workers, resolving most coordination internally. Only genuine Tier 4 situations reach the human.
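The table above can be treated as a lookup from role and action to a tier, with the tier deciding who (if anyone) must approve. The sketch below is illustrative only: the action keys such as `run_tests` and `add_dependency` are hypothetical labels for a few table entries, and defaulting unlisted actions to the next level up is an assumption that follows the "when in doubt, bump up" rule.

```python
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    AUTONOMOUS = 1
    ACT_AND_REPORT = 2
    ESCALATE_TO_SUPERVISOR = 3
    ESCALATE_TO_OPERATOR = 4


# A few entries from the table, keyed by hypothetical action names.
POLICY: dict[str, dict[str, Tier]] = {
    "coder": {
        "run_tests": Tier.AUTONOMOUS,
        "open_pr": Tier.ACT_AND_REPORT,
        "add_dependency": Tier.ESCALATE_TO_SUPERVISOR,
    },
    "reviewer": {
        "review_pr": Tier.AUTONOMOUS,
        "approve_pr": Tier.ACT_AND_REPORT,
        "spec_divergence": Tier.ESCALATE_TO_SUPERVISOR,
    },
    "supervisor": {
        "triage_issue": Tier.AUTONOMOUS,
        "merge_pr": Tier.ACT_AND_REPORT,
        "public_comms": Tier.ESCALATE_TO_OPERATOR,
    },
}


def tier_for(role: str, action: str) -> Tier:
    """Unlisted actions escalate one level up: workers ask the supervisor,
    the supervisor asks the operator."""
    if role == "supervisor":
        default = Tier.ESCALATE_TO_OPERATOR
    else:
        default = Tier.ESCALATE_TO_SUPERVISOR
    return POLICY[role].get(action, default)


def approver(role: str, action: str) -> Optional[str]:
    """Return who must approve the action, or None if the agent may proceed."""
    tier = tier_for(role, action)
    if tier == Tier.ESCALATE_TO_SUPERVISOR:
        return "supervisor"
    if tier == Tier.ESCALATE_TO_OPERATOR:
        return "operator"
    return None
```

With this shape, a coder attempting something unlisted is routed to the supervisor rather than the human, which keeps most coordination at the agent layer.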
Implementing Autonomy Tiers
In SOUL.md
Define the decision framework with concrete examples per tier. Abstract descriptions ("use good judgment") don't work — agents need specific scenarios to pattern-match against.
## Decision Framework
**Tier 1 — Just do it:**
- Read files, search the web, check calendars
- Routine monitoring and maintenance tasks
- Heartbeat checks, memory updates
**Tier 2 — Do it, tell the operator:**
- Execute trades/swaps of up to $50 per action
- Merge PRs that pass all quality gates
- Adjust cron schedules within existing patterns
**Tier 3 — Ask first:**
- Any action above $50
- New external service integrations
- Public communications (tweets, emails, forum posts)
- Anything irreversible or uncertain
In Agent Config (Supervisory)
For multi-agent setups, define escalation paths in the shared directory:
## Escalation Model
- Worker agents escalate to supervisor via sessions_send
- Supervisor resolves cross-domain coordination internally
- Supervisor escalates to operator for Tier 4 decisions
- Workers never escalate directly to operator (unless supervisor is unreachable)
Earning Trust Over Time
Autonomy isn't static. It expands through demonstrated competence:
- Agent starts with conservative tier boundaries
- Successful Tier 2 actions build track record
- Operator relaxes caps or moves actions down a tier
- Failures contract the envelope — mistakes move actions up a tier
Example progression:
- Week 1: All external actions are Tier 3 (ask first)
- Week 3: Routine actions within $20 cap move to Tier 2 (act & report)
- Month 2: Cap raised to $50, more action types delegated
- Ongoing: Each failure triggers a review of whether the tier was appropriate
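The expanding and contracting envelope can be modeled as a small piece of state that the operator adjusts. This is a sketch under assumptions: `AutonomyEnvelope` and its method names are hypothetical, tiers are plain integers 1 to 3, and unknown action types start at Tier 3 to match the conservative starting point described above.

```python
from dataclasses import dataclass, field

MAX_TIER = 3  # "ask first" in the three-tier model


@dataclass
class AutonomyEnvelope:
    """Spend cap and per-action-type tiers, adjusted over time."""
    cap: float
    tiers: dict[str, int] = field(default_factory=dict)

    def tier_of(self, action_type: str) -> int:
        # Action types with no assignment start conservative: ask first.
        return self.tiers.get(action_type, MAX_TIER)

    def move_down(self, action_type: str) -> None:
        """Operator decision: grant more autonomy for this action type."""
        self.tiers[action_type] = max(1, self.tier_of(action_type) - 1)

    def raise_cap(self, new_cap: float) -> None:
        """Operator decision: raise the cap (never lowers it)."""
        self.cap = max(self.cap, new_cap)

    def record_failure(self, action_type: str) -> None:
        """A mistake moves the action type up one tier, pending review."""
        self.tiers[action_type] = min(MAX_TIER, self.tier_of(action_type) + 1)
```

Following the example progression: an envelope created with `cap=20.0` starts with every action type at Tier 3, `move_down("routine_swap")` corresponds to Week 3, `raise_cap(50.0)` to Month 2, and `record_failure` to the ongoing review after a mistake.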
Anti-Patterns
Tier inflation
Every edge case gets pushed to Tier 3 → operator drowns in approval requests → agent becomes useless. Fix: if an action type has been approved 5+ times without issue, move it to Tier 2.
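The "approved 5+ times without issue" rule of thumb can be checked mechanically against a log of past approvals. The sketch below assumes a hypothetical log format of `(action_type, had_issue)` pairs, one per approved Tier 3 request; it only suggests candidates, leaving the actual tier change to the operator.

```python
from collections import Counter

PROMOTION_THRESHOLD = 5  # the "5+ approvals" rule of thumb


def promotion_candidates(approval_log: list[tuple[str, bool]]) -> list[str]:
    """Return action types approved at least 5 times with no recorded issue."""
    clean_approvals: Counter[str] = Counter()
    had_trouble: set[str] = set()
    for action_type, had_issue in approval_log:
        if had_issue:
            had_trouble.add(action_type)
        else:
            clean_approvals[action_type] += 1
    return sorted(
        action_type
        for action_type, count in clean_approvals.items()
        if count >= PROMOTION_THRESHOLD and action_type not in had_trouble
    )
```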
Tier deflation
Agent treats everything as Tier 1 → takes irreversible actions without oversight → trust is destroyed. Fix: state an explicit "when in doubt, bump up" rule, and err on the side of asking.
Flat hierarchy in multi-agent
All agents escalate directly to the operator → operator becomes the bottleneck for every cross-domain question. Fix: Add the supervisory tier. Most coordination should resolve at the agent layer.
Stale boundaries
Caps and tier assignments set at launch never get updated → agent operates under unnecessarily conservative constraints months later. Fix: Periodic autonomy review (monthly or quarterly). Ask: "which Tier 3 actions should be Tier 2 by now?"
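A periodic review is easier to keep up if something flags which assignments are overdue. The sketch below assumes each action type carries a hypothetical "last reviewed" date and uses a 90-day default as a rough stand-in for "quarterly"; both the data shape and the interval are illustrative.

```python
from datetime import date, timedelta


def due_for_review(last_reviewed: dict[str, date], today: date,
                   interval_days: int = 90) -> list[str]:
    """Return action types whose tier assignment has not been reviewed
    within the last `interval_days` days."""
    cutoff = today - timedelta(days=interval_days)
    return sorted(
        action_type
        for action_type, reviewed_on in last_reviewed.items()
        if reviewed_on <= cutoff
    )
```

The output is a worklist for the review question above, not a decision: each flagged Tier 3 action still needs the operator to judge whether it should be Tier 2 by now.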