Self-Improvement & Learning Loops
An agent that can learn from its own mistakes is powerful. An agent that learns without guardrails ends up with automation managing automation: dozens of scripts that nobody uses, self-referential improvement loops, and workspace bloat. This section covers how to build learning loops that actually improve the agent, with human-in-the-loop promotion to prevent runaway self-modification.
Key Problems
Unsupervised improvement produces churn
Without human guidance on what to improve, the agent optimizes for volume of changes rather than quality of outcomes. In practice, unchecked self-improvement loops have produced 30+ scripts that needed to be gutted in a single remediation session. The lesson: iteration needs direction.
The lesson promotion pipeline needs measurement
The pipeline exists (daily note → lesson candidate → pending rule → approved rule), but there is no data on how well it works. How many candidates get promoted? How many promoted rules actually change behavior? Is the pipeline just bureaucracy, or does it produce real improvement?
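One way to start answering these questions is a funnel count over the four stages. The sketch below is a minimal illustration, not an existing tool: the `Lesson` record, its field names, and the `changed_behavior` flag (assumed to be set by a human reviewer) are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

# Stage names mirror the pipeline described above; the identifiers are assumptions.
STAGES = ["daily_note", "lesson_candidate", "pending_rule", "approved_rule"]


@dataclass
class Lesson:
    lesson_id: str
    furthest_stage: str             # last pipeline stage this lesson reached
    changed_behavior: bool = False  # hypothetical flag, judged by a human reviewer


def funnel(lessons):
    """Return (count of lessons that reached each stage, stage-to-stage conversion rates)."""
    reached = Counter()
    for lesson in lessons:
        last = STAGES.index(lesson.furthest_stage)
        # A lesson that reached stage N also passed through every earlier stage.
        for stage in STAGES[: last + 1]:
            reached[stage] += 1
    rates = {}
    for prev, nxt in zip(STAGES, STAGES[1:]):
        # None means "no data yet" rather than a 0% conversion rate.
        rates[f"{prev}->{nxt}"] = reached[nxt] / reached[prev] if reached[prev] else None
    return dict(reached), rates


def behavior_change_rate(lessons):
    """Share of approved rules that a reviewer marked as actually changing behavior."""
    approved = [l for l in lessons if l.furthest_stage == "approved_rule"]
    if not approved:
        return None
    return sum(l.changed_behavior for l in approved) / len(approved)
```

A low `lesson_candidate->pending_rule` rate would suggest noisy extraction, while a high approval rate paired with a low `behavior_change_rate` would point toward the "just bureaucracy" reading.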
Document-before-acknowledge is the hardest rule
"Write the lesson FIRST, then reply to the conversation" is the standing order. In practice, it's the most frequently violated rule. The conversational pull to respond immediately is strong, and the documentation step feels like friction. Measuring and improving compliance on this one rule would validate the entire learning loop approach.
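Compliance with this rule can be measured from timestamps alone, assuming each correction is logged with the time the lesson was written and the time the reply was sent. The sketch below assumes such a log exists; the `Incident` record and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    incident_id: str
    lesson_written_at: Optional[float]  # seconds since epoch; None if no lesson was written
    reply_sent_at: float                # seconds since epoch


def is_compliant(incident):
    """True only if a lesson exists and was written no later than the reply."""
    return (
        incident.lesson_written_at is not None
        and incident.lesson_written_at <= incident.reply_sent_at
    )


def compliance_rate(incidents):
    """Fraction of incidents where documentation came first; None if there is no data."""
    if not incidents:
        return None
    return sum(is_compliant(i) for i in incidents) / len(incidents)
```

Tracking this rate per week would show whether compliance is improving, and splitting non-compliant incidents into "written late" and "never written" would show which failure mode dominates.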
Auto-evolve vs. propose-and-review
Some changes are safe to auto-apply (updating a daily note, adding a tag). Others need human review (modifying AGENTS.md standing orders, changing cron schedules). The boundary between "safe to self-modify" and "needs approval" is the core design decision of any self-improvement system.
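The boundary can be made explicit as a small policy table that defaults to review. The sketch below is one possible shape, not a prescribed layout: apart from `AGENTS.md`, the path patterns are hypothetical placeholders for wherever daily notes, tags, and cron schedules actually live.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Decision(Enum):
    AUTO_APPLY = "auto_apply"
    NEEDS_REVIEW = "needs_review"


# Hypothetical path patterns; adjust to the real workspace layout.
REVIEW_PATTERNS = ["AGENTS.md", "cron/*"]
AUTO_APPLY_PATTERNS = ["notes/daily/*.md", "tags/*"]


def classify(path):
    """Decide whether a proposed change to `path` may be applied without a human."""
    # Review rules are checked first so that a gated path can never be auto-applied.
    if any(fnmatchcase(path, pattern) for pattern in REVIEW_PATTERNS):
        return Decision.NEEDS_REVIEW
    if any(fnmatchcase(path, pattern) for pattern in AUTO_APPLY_PATTERNS):
        return Decision.AUTO_APPLY
    # Anything unlisted is gated by default.
    return Decision.NEEDS_REVIEW
```

Defaulting to `NEEDS_REVIEW` means a new kind of change, such as a freshly generated script, has to be explicitly allowed before the agent can apply it on its own.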
Tracks
- Learning loop architectures — pipeline designs, feedback mechanisms, promotion criteria
- Lesson extraction patterns — how to identify actionable lessons from operational noise
- Guardrails against runaway self-modification — what to auto-apply, what to gate, how to detect churn
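For the churn-detection track, one simple signal is the share of agent-created artifacts that nobody has used recently. The sketch below assumes creation and last-use days are recorded somewhere; the `Artifact` record and the 14-day idle threshold are illustrative assumptions, not measured values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Artifact:
    path: str
    created_day: int              # day number the agent created the artifact
    last_used_day: Optional[int]  # day number of the most recent use; None if never used


def unused_artifacts(artifacts, today, max_idle_days=14):
    """Return paths of artifacts idle for more than `max_idle_days`."""
    flagged = []
    for artifact in artifacts:
        # A never-used artifact is measured from its creation day,
        # so brand-new scripts get a grace period before being flagged.
        reference = (
            artifact.last_used_day
            if artifact.last_used_day is not None
            else artifact.created_day
        )
        if today - reference > max_idle_days:
            flagged.append(artifact.path)
    return flagged


def churn_ratio(artifacts, today, max_idle_days=14):
    """Fraction of artifacts flagged as unused; None if there are no artifacts."""
    if not artifacts:
        return None
    return len(unused_artifacts(artifacts, today, max_idle_days)) / len(artifacts)
```

A rising `churn_ratio` would be a prompt for human review of what the agent is producing, not a trigger for automatic deletion.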
Status
Framework needed. The lesson pipeline exists as a concept but hasn't been measured or validated. This section needs a measurement framework before optimization work can begin.