Getting Started
This guide distills lessons from running OpenClaw agents in production — daily crons, multi-agent coordination, monitoring, automation, and everything in between. These aren't hypotheticals; they're patterns that survived contact with reality.
The Foundation
1. Give Your Agent a Brain Before Giving It Tasks
An unconfigured agent is a chatbot with tool access. It has no memory of who you are, no operational rules, and no judgment about what's appropriate. The workspace files are what turn it into something useful.
Start with these four files:
- AGENTS.md — operational rules, standing orders, safety constraints. This is where you encode "never do X," "always check Y before Z," and "ask me before doing anything irreversible." Be specific. Vague rules get vague compliance.
- SOUL.md — personality, decision framework, communication style. Defines how the agent thinks, not just what it does. Include error recovery patterns and autonomy tiers (what it can do alone vs. what needs approval).
- USER.md — context about you. Timezone, communication preferences, key accounts. The agent reads this every session so it doesn't have to re-learn basics.
- TOOLS.md — environment-specific notes. Credential locations, API endpoints, device names, infrastructure details. Keeps operational knowledge separate from behavioral rules.
The most important guardrails to add early:
- Anti-looping: "if you've attempted the same action twice with the same approach, stop and report"
- Verification: "check the result before reporting success"
- Escalation: "3 consecutive failures on the same task → alert me"
- File-first memory: "if you want to remember something, write it to a file — mental notes don't survive sessions"
You'll iterate on these files constantly. That's expected. Every failure teaches you a new rule to add.
2. Route Models by Task, Not by Default
Not every task needs your strongest model. This is the single biggest cost lever.
| Task Type | Model Tier | Examples |
|---|---|---|
| Heartbeats, monitoring, routine checks | Cheap (Haiku, Gemini Flash) | Health pings, calendar scans, email checks |
| Standard work, conversations, tool use | Mid-tier (Sonnet, GPT-5.4) | Chat, research, file operations |
| Complex reasoning, architecture, planning | Premium (Opus) | Design docs, multi-step analysis, code review |
Set your default model to the cheapest one that reliably follows instructions. Override to stronger models per-agent or per-cron where reasoning quality actually matters. Operators have seen prompt costs drop from 20-40k tokens to ~1.5k per request by routing smarter.
```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "openai-codex/gpt-5.4",
        "fallbacks": ["anthropic/claude-sonnet-4-6"]
      }
    },
    "list": [
      {
        "id": "main",
        "model": { "primary": "anthropic/claude-opus-4-6" }
      }
    ]
  }
}
```
3. Set Up Memory Search Early
Without memory search, your agent wakes up every session with amnesia. It can read files it knows about, but it can't recall — it can't find the lesson it wrote last Tuesday, the decision you made about API design, or the mistake it already learned from.
Memory search gives your agent semantic recall over its own workspace. It's the difference between an agent that repeats the same mistakes and one that genuinely learns across sessions.
Why it compounds:
- Every lesson, decision, and preference becomes findable, not just written down
- Agents self-correct faster because they can check "have I seen this before?"
- The more your agent writes to daily notes and MEMORY.md, the more valuable search becomes
- Daily notes become a searchable knowledge base, not just a log
Setup is ~10 minutes:
- Install Ollama and pull an embedding model (`ollama pull bge-m3`)
- Add `memorySearch` config to `openclaw.json`
- Restart the gateway
Zero API cost, no rate limits, works offline. See the full Memory Search guide for setup, model selection, and tuning.
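As an illustrative sketch only — the key names below are assumptions, and the Memory Search guide documents the real schema — the `openclaw.json` addition might look something like:

```json
{
  "memorySearch": {
    "provider": "ollama",
    "model": "bge-m3"
  }
}
```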
TIP
Use bge-m3 (1.2 GB) for quality, or nomic-embed-text (274 MB) if RAM is tight. Cloud providers (OpenAI, Gemini, Voyage) are also supported if you'd rather not run Ollama.
4. Automation Means Crons, Not Long-Running Sessions
Sessions are stateful only while active. If you ask your agent to work on something and close the chat, it doesn't keep going. There's no background daemon thinking about your problem.
For scheduled work: Create cron jobs with `sessionTarget: "isolated"`. These spin up independent sessions on a schedule, do their work, and optionally report results.
```json
// Morning briefing: runs at 8 AM, sends summary to your chat
{
  "schedule": { "kind": "cron", "expr": "0 8 * * *", "tz": "America/New_York" },
  "sessionTarget": "isolated",
  "payload": {
    "kind": "agentTurn",
    "message": "Check email, calendar, and pending tasks. Send a morning summary."
  },
  "delivery": { "mode": "announce" }
}
```
Key insight learned the hard way: a script without a cron is a tool, not automation. In one production setup, an entire task operator was built and never scheduled — hours of work that sat idle until it was wired to a cron in the same session.
5. One Integration at a Time
Every integration (email, calendar, messaging, web scraping) is a separate failure mode with its own configuration, credentials, and edge cases. Trying to set up everything at once leads to debugging five things simultaneously with no working baseline.
The pattern that works:
- Pick one workflow (e.g., a morning briefing that checks email)
- Get it working end-to-end — config, credentials, test run, verify output
- Add a cron so it runs automatically
- Add error handling (what happens when the email server is down?)
- Move to the next integration
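Applied to the email-briefing example, steps 3 and 4 together might look like this — the schedule and message wording are illustrative, following the shape of the cron example in the previous section:

```json
// Email briefing cron with explicit failure behavior baked into the prompt
{
  "schedule": { "kind": "cron", "expr": "0 8 * * 1-5", "tz": "America/New_York" },
  "sessionTarget": "isolated",
  "payload": {
    "kind": "agentTurn",
    "message": "Check email and send a briefing. If the mail server is unreachable, report the failure once instead of retrying."
  },
  "delivery": { "mode": "announce" }
}
```

Encoding the failure path in the message itself means the agent degrades gracefully even before you build dedicated error handling.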
Run `openclaw doctor --fix` when things break. It catches most config issues.
6. Persist What Matters, Delete What Doesn't
Compaction loses detail over time. Long conversations get summarized. Files are how your agent maintains continuity across sessions.
What to persist:
- Decisions and rationale — why you chose X over Y (AGENTS.md or MEMORY.md)
- Credentials and endpoints — in TOOLS.md, secrets in OS keychain
- Operational state — JSON for machine-readable, markdown for human-readable
- Lessons learned — daily notes, promoted to AGENTS.md when proven durable
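For the operational-state bullet, a machine-readable state file might look like this — the file name and fields are illustrative, not a prescribed format:

```json
// backup-state.json (hypothetical): small, rebuildable, machine-readable
{
  "lastRun": "2026-02-10T08:00:00Z",
  "lastResult": "ok",
  "consecutiveFailures": 0
}
```

Keeping it this small makes the "rebuild from source, don't patch" rule below cheap to follow.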
What to cut:
- Blank template files with no content (they waste tokens every session)
- Duplicate rules across multiple files (pick one canonical location)
- Transient data that changes every run
- Stale state files that no longer reflect reality (rebuild from source, don't patch)
A common pattern is a workspace backup cron that syncs critical files hourly. The backup is silent when nothing changes and sends an alert on failure. That's the model — infrastructure should be invisible until it breaks.
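A sketch of such a backup cron, reusing the shape of the morning-briefing example — the schedule, file list, and message are illustrative:

```json
// Hourly workspace backup: silent on success, loud on failure
{
  "schedule": { "kind": "cron", "expr": "0 * * * *", "tz": "America/New_York" },
  "sessionTarget": "isolated",
  "payload": {
    "kind": "agentTurn",
    "message": "Sync AGENTS.md, SOUL.md, USER.md, TOOLS.md, and MEMORY.md to the backup location. Say nothing on success; alert me on failure."
  }
}
```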
7. Test Models on Real Agent Tasks
Chat benchmarks don't predict agent performance. Tool-calling reliability, instruction following, and structured output matter more than raw reasoning scores.
What to test:
- Can it make well-formed tool calls consistently?
- Does it follow multi-step instructions without skipping steps?
- Does it verify results before claiming success?
- Does it handle errors gracefully or spiral?
Operators have seen models that benchmark beautifully but hallucinate tool execution — claiming to have run commands and produced files that don't exist. Test on your actual workflows before committing.
8. Expect Iteration
The gap between a first conversation and reliable daily autonomous operation is real. It closes with each release, but it's still measured in weeks, not hours.
Realistic trajectory:
- Week 1: Basic setup, first working conversation, initial workspace files, first "why did it do that?" moment
- Week 2: First cron job, one integration working end-to-end, initial guardrails from mistakes
- Week 3-4: Multiple integrations, reliable crons, agent personality solidifying, learning from daily notes
- Month 2+: Autonomous operation within bounded domains, self-improvement loops, the agent starts catching things before you do
Every failure is a rule you didn't write yet. Every repeated mistake is a guardrail gap. The workspace files are a living document — they get better because things go wrong.
What's Next
Once the basics are working:
- Workspace Files & Bootstrap — structure your files and optimize what gets injected
- Memory Search — semantic recall setup and tuning
- Model Optimization — advanced routing and cost strategies
- Scheduling & Automation — crons, sentinels, and native schedulers
- Agent Architecture — multi-agent identity, routing, and coordination
- Agentic Patterns — scout/dispatch, research fleets, pipelines