Lesson Promotion Pipeline
Stages 3 and 4 of the learning pipeline — where recurring patterns get proposed as permanent rules and the human operator decides which ones stick.
Stage 3: Agent Proposes
Lessons flow upward through a review queue:
- Agent writes a lesson candidate in the daily note with a severity tag (`[RULE]`, `[PATTERN]`, `[HEURISTIC]`)
- Agent scans recent candidates during memory maintenance (heartbeat reviews). If a candidate has appeared 2+ times across different days, or is a `[RULE]` with high severity, it writes a proposed change to `memory/pending-rules.md`
- Pending rules include: the proposed text, evidence (which daily notes triggered it), the target file (AGENTS.md or SOUL.md), and current status
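The fields of a pending rule can be modeled as a small record. The following is a minimal sketch, not the pipeline's actual implementation: the `PendingRule` class, its field names, and the `render` method are hypothetical, and the output layout only approximates the production example in this section.

```python
from dataclasses import dataclass, field


@dataclass
class PendingRule:
    """Hypothetical record for one entry in memory/pending-rules.md."""
    tag: str                                            # "RULE", "PATTERN", or "HEURISTIC"
    text: str                                           # the proposed rule text
    evidence: list[str] = field(default_factory=list)   # daily notes that triggered it
    target: str = "AGENTS.md"                           # AGENTS.md or SOUL.md
    status: str = "Pending review"

    def render(self, index: int) -> str:
        """Format the rule as a numbered markdown entry."""
        lines = [
            f"{index}. **[{self.tag}] {self.text}**",
            f"   - Evidence: {'; '.join(self.evidence)}",
            f"   - Proposed for {self.target}",
            f"   - Status: {self.status}",
        ]
        return "\n".join(lines)


rule = PendingRule(
    tag="PATTERN",
    text="Design-first for new projects.",
    evidence=["2026-03-07"],
    target="SOUL.md",
)
print(rule.render(2))
```

Keeping evidence as a list of daily-note references makes it straightforward to check later how many distinct days a candidate appeared on.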
Example from production
## 2026-03-07
1. **[RULE] Mechanical tool-call ordering for mistake acknowledgments.**
- Evidence: 2026-03-07 — 3 violations in one session
- Proposed for AGENTS.md Self-Learning section
- Status: ⏳ Pending review
2. **[PATTERN] Design-first for new projects: design doc → implementation.**
- Evidence: 2026-03-07 — skipped design, went straight to code with wrong model
- Proposed for SOUL.md Lessons Learned
- Status: ⏳ Pending review
Promotion thresholds
Not every mistake deserves a permanent rule. Two filters control what gets promoted:
- Frequency threshold: A candidate needs 2+ appearances across different days before promotion (unless it's a high-severity `[RULE]` with immediate impact)
- Severity routing:
  - `[RULE]` (hard constraint) → AGENTS.md standing orders
  - `[PATTERN]` (methodology) → SOUL.md lessons or `memory/playbook/`
  - `[HEURISTIC]` (soft guidance) → usually stays in daily notes; only promoted if it recurs enough to become a pattern
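The two filters above can be expressed as a pair of small functions. This is a sketch under stated assumptions: the names `should_propose`, `target_for`, `ROUTES`, and `MIN_DISTINCT_DAYS` are hypothetical, and because the text does not say how often a heuristic must recur to count as a pattern, that judgment is left to the caller through the `recurs_as_pattern` flag.

```python
from datetime import date

# Routing table taken from the severity routing list; names are illustrative.
ROUTES = {
    "RULE": "AGENTS.md",     # hard constraint -> standing orders
    "PATTERN": "SOUL.md",    # methodology -> lessons (or memory/playbook/)
}

MIN_DISTINCT_DAYS = 2  # the "2+ appearances across different days" threshold


def should_propose(tag: str, seen_on: list[date], high_severity: bool = False) -> bool:
    """Apply the frequency threshold, with the high-severity [RULE] bypass."""
    if tag == "RULE" and high_severity:
        return True
    # Repeats within a single day count once, so compare distinct dates.
    return len(set(seen_on)) >= MIN_DISTINCT_DAYS


def target_for(tag: str, recurs_as_pattern: bool = False) -> str:
    """Pick the destination for a candidate based on its severity tag."""
    if tag == "HEURISTIC":
        # Heuristics stay in daily notes unless the caller judges
        # that they have recurred enough to be treated as a pattern.
        return ROUTES["PATTERN"] if recurs_as_pattern else "daily notes"
    return ROUTES[tag]


same_day = [date(2026, 3, 7), date(2026, 3, 7)]
two_days = [date(2026, 3, 7), date(2026, 3, 9)]
print(should_propose("PATTERN", same_day))   # False: only one distinct day
print(should_propose("PATTERN", two_days))   # True
print(target_for("RULE"))                    # AGENTS.md
```

Counting distinct dates rather than raw occurrences matters here: three violations in one session, as in the production example, still count as a single day unless the high-severity bypass applies.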
Stage 4: Human Review Gate
The operator reviews pending rules and approves, rejects, or modifies them. This is the critical human-in-the-loop checkpoint.
- Approved rules get added to AGENTS.md (standing orders) or SOUL.md (behavioral patterns)
- Rejected rules stay in pending-rules.md with reasoning — this prevents the agent from re-proposing the same rule
- Modified rules get the operator's refined version, which is often more nuanced than the agent's original proposal
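The three review outcomes, and the way retained rejections block re-proposals, might look like the following. This is an illustrative sketch only: `Proposal`, `review`, and `already_rejected` are hypothetical names, the status strings are assumptions, and matching rejected rules by exact text is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    """Hypothetical in-memory form of one pending rule."""
    text: str
    target: str            # AGENTS.md or SOUL.md
    status: str = "pending"
    reason: str = ""       # operator's reasoning, kept for rejected rules


def review(p: Proposal, decision: str, *, reason: str = "", revised_text: str = "") -> Proposal:
    """Record the operator's decision on a pending proposal."""
    if decision == "approve":
        p.status = "approved"
    elif decision == "reject":
        if not reason:
            raise ValueError("a rejection needs reasoning so it can be kept on file")
        p.status = "rejected"
        p.reason = reason
    elif decision == "modify":
        if not revised_text:
            raise ValueError("a modification needs the operator's revised text")
        p.text = revised_text   # the operator's version replaces the agent's
        p.status = "modified"
    else:
        raise ValueError(f"unknown decision: {decision!r}")
    return p


def already_rejected(text: str, queue: list[Proposal]) -> bool:
    """True if the same proposal text was rejected before (exact match only)."""
    return any(p.status == "rejected" and p.text == text for p in queue)


queue = [Proposal("Always write a design doc first.", "SOUL.md")]
review(queue[0], "reject", reason="One-off; the project was a throwaway prototype.")
print(already_rejected("Always write a design doc first.", queue))  # True
```

Because the rejected entry stays in the queue with its reasoning, the agent can check it before writing a new proposal instead of raising the same rule again.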
Why human review is non-negotiable
Without it, the agent's instinct to over-systematize produces rule bloat. Every edge case becomes a standing order. Every failure spawns a new guardrail. The operator's job is to distinguish between "this needs a rule" and "this was a one-off."
The agent is good at identifying that something went wrong. The human is better at judging whether the fix should be a permanent rule or a contextual note.