Streaming Responsiveness & Queue Mode

Status: Applied to production
Area: Behavioral Quality / UX
Config paths: messages.queue.mode, channels.discord.blockStreamingCoalesce


Problem

Two distinct UX issues in interactive Discord sessions:

1. Slow, batchy streaming updates
With default coalesce settings, Discord responses felt like:

  • Brief initial message appears
  • One or two partial edits
  • Long pause
  • Massive final response arrives all at once

The blockStreamingCoalesce defaults were holding chunks too long (high idle timeout) and letting them accumulate into large batches before pushing to Discord. This created the illusion of a slow, unresponsive agent even when the model was generating output continuously.

2. No mid-run interruption
Queue mode defaulted to collect. If you sent a message while the agent was mid-run (e.g. executing 8 sequential tool calls), your message was queued and only started a new turn after the full run completed. There was no way to say "stop, do this instead" and have it actually stop.


Root Cause

Coalesce behavior: blockStreamingCoalesce merges streamed text chunks before sending Discord edits. The default idle threshold was high enough (~1500–2000ms estimated) that multiple model output bursts would coalesce into one large chunk, reducing visible progress updates.

Queue mode: collect (default) holds all inbound messages until the current agent turn finishes. The agent processes all queued messages as a batch on the next turn — no mid-turn steering possible.
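The difference between the two modes can be sketched as a toy dispatch loop. This is a simplified model, not OpenClaw's actual scheduler; run_tools and its signature are invented for illustration:

```python
# Minimal model of queue-mode dispatch (illustrative sketch only).
# `tools` is the run's pending tool calls; `inbox` holds messages
# that arrived while the run was in flight.

def run_tools(tools, inbox, mode):
    """Return (completed, skipped) tool names for one agent run."""
    completed, skipped = [], []
    for i, tool in enumerate(tools):
        completed.append(tool)  # the in-flight tool always finishes
        if mode == "steer" and inbox:
            # steer: the queued message is noticed at the next
            # tool-call boundary; the rest of the run is skipped
            skipped = tools[i + 1:]
            break
    # collect: the inbox is only drained after the full run completes
    return completed, skipped
```

In this model, a queued "stop" under collect still lets all tools run; under steer the run pivots after the current tool completes.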


Fix

1. Queue mode → steer

```json
{
  "messages": {
    "queue": {
      "mode": "steer"
    }
  }
}
```

What changes: Inbound messages during an active run are injected at the next tool-call boundary. Remaining tool calls are skipped (marked "Skipped due to queued user message.") and the new message takes over immediately.

Practical effect: You can interrupt a multi-tool run by sending a new message. The agent stops after the current tool completes and pivots.

Tradeoff: If a long autonomous task is running and a message arrives (e.g. from a cron notification to the same session), the task gets abandoned mid-flight. Mitigation: all cron jobs run in sessionTarget: "isolated" sessions with their own lanes — they're unaffected by main session queue mode. Verify before enabling on shared sessions.

Prerequisites for steer to work:

  • The channel must be streaming (steer falls back to followup on non-streaming channels)
  • All background/cron jobs must be in isolated sessions, not sessionTarget: "main"

2. Tighter coalesce settings

```json
{
  "channels": {
    "discord": {
      "blockStreamingCoalesce": {
        "idleMs": 500,
        "minChars": 200,
        "maxChars": 600
      }
    }
  }
}
```

What changes:

  • idleMs: 500 — flushes a pending chunk after 500ms of model idle, down from the high default. More frequent Discord edits during generation.
  • maxChars: 600 — caps chunk size at 600 chars before forcing a flush, even if idle timer hasn't fired. Was ~1200 default.
  • minChars: 200 — won't flush until at least 200 chars accumulated (prevents single-word updates that feel spammy).
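Taken together, the three knobs imply a flush decision roughly like the following. This is a sketch of the semantics described above; should_flush is a hypothetical name, not OpenClaw code:

```python
# Flush decision implied by the coalesce settings (sketch only).

def should_flush(pending_chars: int, idle_ms: int, cfg: dict) -> bool:
    if pending_chars >= cfg["maxChars"]:
        return True   # size cap: force a flush even mid-generation
    if idle_ms >= cfg["idleMs"] and pending_chars >= cfg["minChars"]:
        return True   # model went quiet with enough text to justify an edit
    return False      # keep buffering (avoids spammy single-word edits)

cfg = {"idleMs": 500, "minChars": 200, "maxChars": 600}
```

Note that minChars gates the idle path only: a buffer under 200 chars never flushes on idle, but the size cap is unconditional.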

Practical effect: Streaming responses feel more like watching the agent think in real time, rather than wait → dump. Especially noticeable on long tool-heavy responses.

Tradeoff: More Discord API edits per message means higher rate-limit exposure on very long responses. The 200/600-char window is a reasonable middle ground: not so aggressive that it spams the API, but frequent enough to feel live.
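As a back-of-envelope bound on that exposure (simple arithmetic, not measured behavior): each flush carries at most maxChars characters, so a response of N streamed characters implies at least ceil(N / maxChars) edits, with idle-timer flushes adding more.

```python
import math

def size_cap_edits(total_chars: int, max_chars: int = 600) -> int:
    """Minimum edit count forced by the size cap alone: each flush
    carries at most max_chars, so a response needs at least this many."""
    return math.ceil(total_chars / max_chars)
```

For example, a 3,000-char answer implies at least 5 edits at maxChars: 600, versus 3 at the old ~1200 default.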


Combined Config Patch

Applied in a single config.patch:

```json
{
  "messages": {
    "queue": {
      "mode": "steer"
    }
  },
  "channels": {
    "discord": {
      "blockStreamingCoalesce": {
        "idleMs": 500,
        "minChars": 200,
        "maxChars": 600
      }
    }
  }
}
```

Requires soft restart (SIGUSR1) to take effect.
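Assuming the gateway runs as a process named openclaw-gateway (an assumed name; substitute your deployment's actual process name), the soft restart can be triggered like so:

```shell
# Send SIGUSR1 for a soft restart; the process name is an assumption.
pid="$(pgrep -x openclaw-gateway || true)"
if [ -n "$pid" ]; then
  kill -USR1 "$pid"
else
  echo "gateway process not found" >&2
fi
```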


Pre-Conditions Checklist

Before applying these settings to any OpenClaw deployment:

  • [ ] All cron jobs use sessionTarget: "isolated" (verify with cron list --include-disabled)
  • [ ] No cron jobs use sessionTarget: "main" (steer mode would interrupt them)
  • [ ] Discord streaming is enabled (channels.discord.streaming: "partial" or "block")
  • [ ] Discord rate limits not already close to threshold (check for 429s in recent logs)

Observations

To be updated after a few days of operation.

  • [ ] Streaming feels more responsive (subjective)
  • [ ] Any rate limit issues observed?
  • [ ] Steer mode used in practice? Helpful or disruptive?
  • [ ] Optimal idleMs — could go lower (300ms?) or needs to be raised?

Config Reference

| Path | Type | Notes |
| --- | --- | --- |
| messages.queue.mode | string | "steer" \| "followup" \| "collect" |
| channels.discord.blockStreamingCoalesce.idleMs | integer | ms of model idle before flush |
| channels.discord.blockStreamingCoalesce.minChars | integer | min chars before flush |
| channels.discord.blockStreamingCoalesce.maxChars | integer | max chars before forced flush |
| channels.discord.streaming | string | "off" \| "partial" \| "block" \| "progress" |

Per-account overrides: same coalesce fields available under channels.discord.accounts.<id>.blockStreamingCoalesce.
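For example, a per-account override could give one account a slower edit cadence than the global settings. The account id "work" below is purely illustrative; substitute your own:

```json
{
  "channels": {
    "discord": {
      "accounts": {
        "work": {
          "blockStreamingCoalesce": {
            "idleMs": 800,
            "minChars": 300,
            "maxChars": 800
          }
        }
      }
    }
  }
}
```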
