
Multi-Tier Autonomy

An agent without an autonomy model either asks permission for everything (useless) or acts without oversight on everything (dangerous). A tiered system defines what the agent can do alone, what it should do and report, and what requires explicit approval — giving the agent real agency within bounded risk.

The Three-Tier Model

The simplest effective autonomy model uses three tiers:

| Tier | Policy | Examples |
|---|---|---|
| Tier 1 — Just do it | Agent acts without asking or reporting | Read files, search the web, check calendars, organize workspace, routine monitoring |
| Tier 2 — Do it, then tell | Agent acts autonomously, reports to operator afterward | Execute within policy caps, adjust cron schedules, update documentation, run bounded experiments |
| Tier 3 — Ask first | Agent proposes action, waits for approval | Actions above policy caps, new/unfamiliar integrations, irreversible changes, anything uncertain |

The key rule: when in doubt, bump up a tier. An agent that asks too often is annoying. An agent that acts without permission on the wrong thing destroys trust.
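The bump-up rule is simple enough to state in code. The following is a minimal sketch, not a reference implementation; the `Tier` enum and the `effective_tier` function are hypothetical names introduced here for illustration.

```python
from enum import IntEnum


class Tier(IntEnum):
    JUST_DO_IT = 1      # act without asking or reporting
    DO_THEN_TELL = 2    # act, then report to the operator
    ASK_FIRST = 3       # propose and wait for approval


def effective_tier(base: Tier, in_doubt: bool) -> Tier:
    """Apply the 'when in doubt, bump up a tier' rule, capped at Tier 3."""
    if not in_doubt:
        return base
    return Tier(min(base + 1, Tier.ASK_FIRST))


print(effective_tier(Tier.JUST_DO_IT, in_doubt=True).name)   # DO_THEN_TELL
print(effective_tier(Tier.ASK_FIRST, in_doubt=True).name)    # ASK_FIRST
```

Using an ordered enum keeps the comparison explicit: a higher value always means more oversight, so bumping up can never reduce it.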

Tier 1 — Just Do It

Low-risk, read-only or easily reversible actions the agent can take freely without notification:

  • Reading files, exploring the workspace, running searches
  • Memory maintenance (daily notes, MEMORY.md updates)
  • Checking email, calendar, API endpoints
  • Organizing files, cleaning up workspace
  • Routine heartbeat/monitoring tasks

Tier 2 — Do It, Tell the Operator

Actions within established policy bounds. The agent executes, then reports what it did:

  • Running operations within pre-set caps (budget limits, rate limits)
  • Creating issues, opening PRs, merging code that passes quality gates
  • Adjusting cron schedules or configuration within existing patterns
  • Making decisions within a delegated domain (the agent owns this area)
  • Responding to routine situations using established playbooks

The operator reviews reports and course-corrects. This is where trust builds — successful Tier 2 actions expand the agent's envelope over time.
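One way to make "act, then report" hard to skip is to route every Tier 2 action through a wrapper that always records a report. This is a sketch under assumptions: `Report`, `ReportLog`, and `do_then_tell` are hypothetical names, and the in-memory list stands in for whatever channel the operator actually reads.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Report:
    action: str
    outcome: str


@dataclass
class ReportLog:
    """Hypothetical outbox the operator reviews after the fact."""
    entries: List[Report] = field(default_factory=list)


def do_then_tell(name: str, action: Callable[[], str], log: ReportLog) -> str:
    """Run a Tier 2 action immediately, then record what happened."""
    try:
        outcome = action()
    except Exception as exc:
        # Failures are reported too, so the operator can course-correct.
        log.entries.append(Report(name, f"failed: {exc}"))
        raise
    log.entries.append(Report(name, outcome))
    return outcome


log = ReportLog()
do_then_tell("adjust-cron", lambda: "moved backup job to 03:00", log)
print(log.entries[0])
```

Recording failures as well as successes matters here, because the report log is the track record that later tier changes are based on.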

Tier 3 — Ask First

Actions that exceed established bounds or carry meaningful risk:

  • Spending above per-action or per-day policy caps
  • Interacting with new/unfamiliar external services or contracts
  • Irreversible actions (deleting data, sending public communications)
  • Major strategy changes or entering new domains
  • Anything the agent is genuinely uncertain about
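Several of the criteria above can be checked mechanically before the agent acts. The sketch below assumes hypothetical cap values and a hypothetical `Action` record; judgment calls such as "major strategy change" are not modeled and would still need the agent's own assessment.

```python
from dataclasses import dataclass

# Hypothetical caps; real values come from the operator's policy.
PER_ACTION_CAP = 50.0
PER_DAY_CAP = 200.0


@dataclass
class Action:
    cost: float = 0.0
    irreversible: bool = False
    new_service: bool = False
    uncertain: bool = False


def requires_approval(action: Action, spent_today: float) -> bool:
    """Return True when the action falls into Tier 3 (ask first)."""
    over_action_cap = action.cost > PER_ACTION_CAP
    over_day_cap = spent_today + action.cost > PER_DAY_CAP
    return (
        over_action_cap
        or over_day_cap
        or action.irreversible
        or action.new_service
        or action.uncertain
    )


print(requires_approval(Action(cost=30.0), spent_today=0.0))          # False
print(requires_approval(Action(cost=30.0), spent_today=180.0))        # True
print(requires_approval(Action(irreversible=True), spent_today=0.0))  # True
```

The second call shows why the per-day cap needs the running total: a $30 action is under the per-action cap but would push the day's spending past $200.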

The Four-Tier Model (Supervisory)

When agents supervise other agents, a fourth tier separates what a subordinate agent escalates to its supervisor from what requires the operator:

| Tier | Policy | Scope |
|---|---|---|
| Tier 1 — Autonomous | Act freely, no notification | Routine operations within established patterns |
| Tier 2 — Act & Report | Execute within bounds, report afterward | Domain-specific actions within policy caps |
| Tier 3 — Escalate to supervisor | Propose to the supervising agent | Cross-domain decisions, novel situations, budget allocation between tasks |
| Tier 4 — Escalate to operator | Requires human approval | Major strategy changes, public communications, new external integrations, anything above supervisor's authority |

Why Four Tiers for Multi-Agent

In a single-agent setup, Tiers 1-3 are sufficient — the agent either acts alone or asks the human. But when Agent A supervises Agent B:

  • Agent B handling its own domain tasks → Tier 1-2 (B acts autonomously within its role)
  • Agent B needing cross-domain coordination → Tier 3 (B escalates to Agent A, the supervisor)
  • Agent A facing something beyond its own authority → Tier 4 (A escalates to the operator)

Without Tier 3 (supervisor escalation), subordinate agents either over-escalate to the human (bypassing the supervisor) or make cross-domain decisions they shouldn't.

Operator (human)
  ↑ Tier 4 escalation
Supervisor Agent (e.g., orchestrator)
  ↑ Tier 3 escalation
Worker Agent (e.g., coder, researcher)
  └─ Tiers 1-2: autonomous within role
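The chain in the diagram can be read as a lookup from (role, tier) to the party that must approve. The following sketch uses hypothetical role names and a hypothetical `approver_for` function. Sending a supervisor's own Tier 3 case to the operator is an assumption, since the supervisor has no agent above it.

```python
from typing import Optional


def approver_for(role: str, tier: int) -> Optional[str]:
    """Return who must approve, or None if the role may act on its own."""
    if role not in ("worker", "supervisor"):
        raise ValueError(f"unknown role: {role}")
    if tier in (1, 2):
        return None  # autonomous within role
    if tier == 3:
        # Assumption: a supervisor has no supervising agent, so its
        # Tier 3 cases go to the operator.
        return "supervisor" if role == "worker" else "operator"
    if tier == 4:
        return "operator"
    raise ValueError(f"unknown tier: {tier}")


print(approver_for("worker", 2))      # None
print(approver_for("worker", 3))      # supervisor
print(approver_for("supervisor", 4))  # operator
```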

Example: Product Building Setup

A supervisor agent coordinates multiple specialist roles:

| Agent | Tier 1 (just do it) | Tier 2 (do & report) | Tier 3 (ask supervisor) | Tier 4 (ask operator) |
|---|---|---|---|---|
| Coder | Read code, run tests, lint | Implement issues, open PRs, fix CI | Architecture changes, new dependencies | |
| Reviewer | Review PRs, check quality | Approve/request changes | Spec divergence, blocked PRs | |
| Supervisor | Triage issues, dispatch roles | Merge PRs, update roadmap | | Major spec changes, public comms |

The supervisor handles Tier 3 escalations from workers, resolving most coordination internally. Only genuine Tier 4 situations reach the human.

Implementing Autonomy Tiers

In SOUL.md

Define the decision framework with concrete examples per tier. Abstract descriptions ("use good judgment") don't work — agents need specific scenarios to pattern-match against.

markdown
## Decision Framework

**Tier 1 — Just do it:**
- Read files, search the web, check calendars
- Routine monitoring and maintenance tasks
- Heartbeat checks, memory updates

**Tier 2 — Do it, tell the operator:**
- Execute trades/swaps under $50 per action
- Merge PRs that pass all quality gates
- Adjust cron schedules within existing patterns

**Tier 3 — Ask first:**
- Any action above $50
- New external service integrations
- Public communications (tweets, emails, forum posts)
- Anything irreversible or uncertain

In Agent Config (Supervisory)

For multi-agent setups, define escalation paths in the shared directory:

markdown
## Escalation Model
- Worker agents escalate to supervisor via sessions_send
- Supervisor resolves cross-domain coordination internally
- Supervisor escalates to operator for Tier 4 decisions
- Workers never escalate directly to operator (unless supervisor is unreachable)
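The "unless supervisor is unreachable" exception in the escalation model above is the part most likely to be implemented inconsistently, so it helps to write it down once. This is a sketch only: the two sender callables are hypothetical stand-ins that return `True` on delivery, and they do not reflect the actual signature of `sessions_send`.

```python
from typing import Callable


def escalate_from_worker(
    message: str,
    send_to_supervisor: Callable[[str], bool],
    send_to_operator: Callable[[str], bool],
) -> str:
    """Send a worker escalation and return which recipient received it.

    Each sender is assumed to return True when the message was delivered.
    """
    if send_to_supervisor(message):
        return "supervisor"
    # Only reached when the supervisor is unreachable.
    if send_to_operator(f"[supervisor unreachable] {message}"):
        return "operator"
    raise RuntimeError("escalation could not be delivered")


# Usage with stub senders: the supervisor is down, the operator is reachable.
received = []
target = escalate_from_worker(
    "need a decision on a new dependency",
    send_to_supervisor=lambda m: False,
    send_to_operator=lambda m: received.append(m) is None,
)
print(target)       # operator
print(received[0])  # [supervisor unreachable] need a decision on a new dependency
```

Tagging the fallback message tells the operator why a worker contacted them directly, which keeps the bypass visible rather than silent.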

Earning Trust Over Time

Autonomy isn't static. It expands through demonstrated competence:

  1. Agent starts with conservative tier boundaries
  2. Successful Tier 2 actions build track record
  3. Operator relaxes caps or moves actions down a tier
  4. Failures contract the envelope — mistakes move actions up a tier

Example progression:

  • Week 1: All external actions are Tier 3 (ask first)
  • Week 3: Routine actions within $20 cap move to Tier 2 (act & report)
  • Month 2: Cap raised to $50, more action types delegated
  • Ongoing: Each failure triggers a review of whether the tier was appropriate
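The progression above amounts to two operations on a per-action policy: the operator relaxes it, and a failure contracts it. The sketch below replays the example timeline. `ActionPolicy`, `relax`, and `contract` are hypothetical names, and moving up exactly one tier per failure is an assumption about how contraction is applied.

```python
from dataclasses import dataclass


@dataclass
class ActionPolicy:
    tier: int    # 1, 2, or 3
    cap: float   # spending cap in dollars while at Tier 2


def relax(policy: ActionPolicy, new_cap: float) -> ActionPolicy:
    """Operator decision: move a Tier 3 action to Tier 2, or raise its cap."""
    return ActionPolicy(tier=min(policy.tier, 2), cap=max(policy.cap, new_cap))


def contract(policy: ActionPolicy) -> ActionPolicy:
    """After a failure, move the action up one tier (at most Tier 3)."""
    return ActionPolicy(tier=min(policy.tier + 1, 3), cap=policy.cap)


policy = ActionPolicy(tier=3, cap=0.0)   # Week 1: ask first
policy = relax(policy, new_cap=20.0)     # Week 3: Tier 2 within a $20 cap
policy = relax(policy, new_cap=50.0)     # Month 2: cap raised to $50
print(policy)                            # ActionPolicy(tier=2, cap=50.0)
policy = contract(policy)                # a failure moves it back up
print(policy.tier)                       # 3
```

Note that `relax` is something the operator calls, not the agent: the agent's track record informs the decision, but the envelope only widens with human sign-off.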

Anti-Patterns

Tier inflation

Every edge case gets pushed to Tier 3 → operator drowns in approval requests → agent becomes useless. Fix: If an action type has been approved 5+ times without issue, move it to Tier 2.
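The "5+ approvals without issue" fix can be checked against an approval history. The sketch below assumes a hypothetical record format of `(action_type, approved, had_issue)`. Treating any denial or issue as disqualifying is an assumption that errs on the conservative side.

```python
from collections import Counter
from typing import Iterable, List, Tuple

PROMOTION_THRESHOLD = 5  # "approved 5+ times without issue"


def promotion_candidates(history: Iterable[Tuple[str, bool, bool]]) -> List[str]:
    """Return Tier 3 action types that look ready to move to Tier 2.

    history holds (action_type, approved, had_issue) for past requests.
    """
    clean: Counter = Counter()
    disqualified = set()
    for action_type, approved, had_issue in history:
        if approved and not had_issue:
            clean[action_type] += 1
        else:
            # Assumption: one denial or issue blocks promotion.
            disqualified.add(action_type)
    return sorted(
        t for t, n in clean.items()
        if n >= PROMOTION_THRESHOLD and t not in disqualified
    )


history = [("send-invoice", True, False)] * 5 + [
    ("post-tweet", True, False),
    ("post-tweet", True, True),
]
print(promotion_candidates(history))  # ['send-invoice']
```

The function only proposes candidates. Moving an action to Tier 2 remains the operator's call.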

Tier deflation

Agent treats everything as Tier 1 → takes irreversible actions without oversight → trust destruction. Fix: Explicit "when in doubt, bump up" rule. Err on the side of asking.

Flat hierarchy in multi-agent

All agents escalate directly to the operator → operator becomes the bottleneck for every cross-domain question. Fix: Add the supervisory tier. Most coordination should resolve at the agent layer.

Stale boundaries

Caps and tier assignments set at launch never get updated → agent operates under unnecessarily conservative constraints months later. Fix: Periodic autonomy review (monthly or quarterly). Ask: "which Tier 3 actions should be Tier 2 by now?"
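A periodic review is easier to keep up if something flags the assignments that are overdue. This is a minimal sketch, assuming each action type carries a hypothetical last-reviewed date; the 90-day interval corresponds to the quarterly option.

```python
from datetime import date, timedelta
from typing import Dict, List

REVIEW_INTERVAL = timedelta(days=90)  # quarterly; use 30 days for monthly


def overdue_for_review(last_reviewed: Dict[str, date], today: date) -> List[str]:
    """Return action types whose tier assignment is older than the interval."""
    return sorted(
        name for name, reviewed in last_reviewed.items()
        if today - reviewed > REVIEW_INTERVAL
    )


assignments = {
    "send-newsletter": date(2025, 1, 10),
    "merge-pr": date(2025, 5, 1),
}
print(overdue_for_review(assignments, today=date(2025, 6, 1)))  # ['send-newsletter']
```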
