
Lesson Promotion Pipeline

Stages 3 and 4 of the learning pipeline — where recurring patterns get proposed as permanent rules and the human operator decides which ones stick.

Stage 3: Agent Proposes

Lessons flow upward through a review queue:

  1. Agent writes lesson candidate in the daily note with a severity tag ([RULE], [PATTERN], [HEURISTIC])
  2. Agent scans recent candidates during memory maintenance (heartbeat reviews). If a candidate has appeared 2+ times across different days, or is a high-severity [RULE], the agent writes a proposed change to memory/pending-rules.md
  3. Each pending-rules entry includes the proposed text, evidence (which daily notes triggered it), the target file (AGENTS.md or SOUL.md), and its current status
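The heartbeat scan in step 2 can be sketched roughly as follows. This is a minimal sketch, not the actual implementation: the notes directory layout, the candidate-line format, and the function names are all assumptions based on the examples in this page.

```python
import re
from collections import defaultdict
from pathlib import Path

# Candidate lines carry a severity tag, e.g. "- [RULE] Never force-push main"
CANDIDATE_RE = re.compile(r"\[(RULE|PATTERN|HEURISTIC)\]\s*(.+)")

def scan_daily_notes(notes_dir):
    """Collect lesson candidates: candidate text -> set of days it appeared on."""
    sightings = defaultdict(set)
    tags = {}
    for note in sorted(Path(notes_dir).glob("*.md")):
        day = note.stem  # assumes notes are named like "2026-03-07.md"
        for line in note.read_text().splitlines():
            m = CANDIDATE_RE.search(line)
            if m:
                # Strip any trailing markdown bold markers from the text
                tag, text = m.group(1), m.group(2).strip().rstrip("*").strip()
                sightings[text].add(day)
                tags[text] = tag
    return sightings, tags

def promotable(sightings, tags):
    """Frequency filter: 2+ distinct days, or any [RULE] (treated as high severity)."""
    out = []
    for text, days in sightings.items():
        if len(days) >= 2 or tags[text] == "RULE":
            out.append((tags[text], text, sorted(days)))
    return out
```

Candidates that pass the filter would then be appended to memory/pending-rules.md with their evidence and a pending status.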

Example from production

```markdown
## 2026-03-07

1. **[RULE] Mechanical tool-call ordering for mistake acknowledgments.**
   - Evidence: 2026-03-07 — 3 violations in one session
   - Proposed for AGENTS.md Self-Learning section
   - Status: ⏳ Pending review

2. **[PATTERN] Design-first for new projects: design doc → implementation.**
   - Evidence: 2026-03-07 — skipped design, went straight to code with wrong model
   - Proposed for SOUL.md Lessons Learned
   - Status: ⏳ Pending review
```
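Entries in this format can be read back programmatically. The sketch below parses the fields shown in the example above; the regexes are assumptions about the exact format, since only this one sample is available.

```python
import re

# "**[RULE] Some text.**" -> tag + text
ENTRY_RE = re.compile(r"\*\*\[(RULE|PATTERN|HEURISTIC)\]\s*(?P<text>.+?)\*\*")
# "- Evidence: ...", "- Proposed for ...", "- Status: ..."
FIELD_RE = re.compile(r"-\s*(Evidence:|Proposed for|Status:)\s*(.+)")

def parse_pending(markdown):
    """Parse pending-rules entries into dicts (format assumed from the example)."""
    entries = []
    for line in markdown.splitlines():
        m = ENTRY_RE.search(line)
        if m:
            entries.append({"tag": m.group(1), "text": m.group("text").strip()})
            continue
        f = FIELD_RE.search(line)
        if f and entries:
            key = f.group(1).rstrip(":").lower().replace(" ", "_")
            entries[-1][key] = f.group(2).strip()
    return entries
```

Feeding it the example above would yield one dict per numbered entry, with `tag`, `text`, `evidence`, `proposed_for`, and `status` keys.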

Promotion thresholds

Not every mistake deserves a permanent rule. Two filters control what gets promoted:

  • Frequency threshold: A candidate needs 2+ appearances across different days before promotion (unless it's a high-severity [RULE] with immediate impact)
  • Severity routing:
    • [RULE] (hard constraint) → AGENTS.md standing orders
    • [PATTERN] (methodology) → SOUL.md lessons or memory/playbook/
    • [HEURISTIC] (soft guidance) → usually stays in daily notes; only promoted if it recurs enough to become a pattern
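Both filters can be combined into a single routing function. A minimal sketch, with one stated assumption: the page only says a heuristic is promoted if it "recurs enough," so the three-day bar used here is illustrative, not a documented threshold.

```python
ROUTES = {
    "RULE": "AGENTS.md",      # hard constraint -> standing orders
    "PATTERN": "SOUL.md",     # methodology -> lessons (or memory/playbook/)
    "HEURISTIC": None,        # soft guidance -> stays in daily notes
}

def route(tag, distinct_days, high_severity=False):
    """Return the target file for a candidate, or None if it stays in daily notes."""
    if tag == "RULE" and high_severity:
        return ROUTES["RULE"]   # immediate-impact rules skip the frequency bar
    if distinct_days < 2:
        return None             # frequency threshold not met
    if tag == "HEURISTIC":
        # Promoted only once it recurs enough to be treated as a pattern
        # (>= 3 distinct days is an assumed bar for illustration).
        return ROUTES["PATTERN"] if distinct_days >= 3 else None
    return ROUTES[tag]
```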

Stage 4: Human Review Gate

The operator reviews pending rules and approves, rejects, or modifies them. This is the critical human-in-the-loop checkpoint.

  • Approved rules get added to AGENTS.md (standing orders) or SOUL.md (behavioral patterns)
  • Rejected rules stay in pending-rules.md with reasoning — this prevents the agent from re-proposing the same rule
  • Modified rules get the operator's refined version, which is often more nuanced than the agent's original proposal
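The three review outcomes above can be sketched as a small state transition. This is an illustrative model only; the `PendingRule` type and `review` function are hypothetical names, not part of the pipeline's actual code.

```python
from dataclasses import dataclass

@dataclass
class PendingRule:
    text: str
    target: str            # "AGENTS.md" or "SOUL.md"
    status: str = "pending"
    note: str = ""         # operator reasoning or provenance

def review(rule, decision, note=""):
    """Apply an operator decision. Rejected rules keep their reasoning so the
    agent will not re-propose the same rule later."""
    if decision == "approve":
        rule.status = "approved"          # caller appends rule.text to rule.target
    elif decision == "reject":
        rule.status, rule.note = "rejected", note
    elif decision == "modify":
        # Operator's refined wording replaces the agent's original proposal
        rule.status, rule.text, rule.note = "approved", note, "operator-refined"
    else:
        raise ValueError(f"unknown decision: {decision}")
    return rule
```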

Why human review is non-negotiable

Without it, the agent's instinct to over-systematize produces rule bloat. Every edge case becomes a standing order. Every failure spawns a new guardrail. The operator's job is to distinguish between "this needs a rule" and "this was a one-off."

The agent is good at identifying that something went wrong. The human is better at judging whether the fix should be a permanent rule or a contextual note.
