Skip to content

feat(engine): loop-brake — trip the supervisor on repeated non-advancing results#363

Merged
mgoldsborough merged 2 commits into
mainfrom
feat/loop-brake-non-advancing
Jun 2, 2026
Merged

feat(engine): loop-brake — trip the supervisor on repeated non-advancing results#363
mgoldsborough merged 2 commits into
mainfrom
feat/loop-brake-non-advancing

Conversation

@mgoldsborough
Copy link
Copy Markdown
Contributor

Follow-up to #361. That PR fixed the cache bug that made a bundle's tools invisible; this adds the behavioral brake for the other half of the same incident — an agent that re-searches for a capability that genuinely isn't there instead of stopping.

Problem

The loop supervisor fingerprints (toolName, isError, content[, input]) and trips after N identical fingerprints. A discovery loop slips through every axis: the model calls nb__search with a fresh query each turn and gets a distinct No tools matched "<query>" back, so content and input vary and the fingerprint never repeats — the run burns iterations to the cap. Input-aware success fingerprinting (#324) makes this structural: a varied input reads as progress.

Fix

A third "stuck" axis: a tool may mark a result non-advancing. The supervisor collapses consecutive non-advancing results from one tool to a single canonical fingerprint, so N in a row trip regardless of how input/content varied; the tool is then disabled for the run and the model is nudged to surface the gap.

Why _meta (and not structuredContent or a top-level field)

The signal is metadata about a result, not the tool's data — so it belongs in MCP's _meta, the blessed out-of-band channel, under a reverse-DNS key (ai.nimblebrain/non-advancing, matching the spec's own io.modelcontextprotocol/… convention). A top-level ToolResult field is dropped at the MCP boundary every tool result crosses (in-process system tools included); structuredContent is the tool's data payload and overloading it is a smell. _meta is the correct container and round-trips by construction.

To make it reachable end-to-end:

  • ToolResult gains a _meta slot mirroring CallToolResult._meta.
  • The in-process tool boundary forwards result _meta onto the wire; McpSource reads result-level _meta back on both the inline and task paths. _meta is a looseObject per the MCP schema, so arbitrary reverse-DNS keys survive — verified by a round-trip test through a real in-process MCP transport (not just schema-reading).
  • nb__search's no-match branch sets the key. Any tool, in-process or bundle, opts in the same way.

The supervisor reads the flag by key, not by folding _meta into the hash — so the input-aware success path is untouched.

Composition

Tests

  • Round-trip (non-advancing-meta.test.ts): a tool's _meta set in its handler survives the in-process MCP boundary and lands on the engine-side ToolResult._meta. Empirical proof the channel works.
  • Supervisor: 3 non-advancing results trip even with input+content varying every call; an advancing result resets the streak; varied-input real work (patch_source-style) still never trips. Verified red without the _meta fingerprint branch.
  • nb__search: no-match sets the _meta key; a match does not.

Full verify green locally: static checks, 3316 unit + 666 integration tests, 0 failures.

When a tool keeps reporting no progress while the model varies its input — the
canonical case being nb__search called with a fresh query each turn, each
returning a distinct "No tools matched <query>" — the supervisor's
(toolName, isError, content[, input]) fingerprint never repeats, so it never
trips and the run burns iterations re-searching for a capability that isn't
there. Input-aware success fingerprinting makes this structural: a varied
input reads as progress.

Add a third "stuck" axis. A tool may mark a result non-advancing; the
supervisor collapses consecutive non-advancing results from one tool to a
single canonical fingerprint, tripping after N regardless of how input or
content varied. The tool is then disabled for the run and the model is nudged
to surface the gap instead of searching on.

The signal rides in MCP `_meta` (the blessed channel for metadata-about-a-
result) under a reverse-DNS key (ai.nimblebrain/non-advancing) — not in
structuredContent (the tool's data) or a bespoke top-level field (dropped at
the boundary). To make it reachable:

- ToolResult gains a `_meta` slot mirroring CallToolResult._meta.
- The in-process tool boundary forwards result `_meta` onto the wire; McpSource
  reads result-level `_meta` back on both the inline and task paths. `_meta` is
  a loose object per the MCP schema, so arbitrary reverse-DNS keys survive the
  round-trip — verified by a test through a real in-process MCP transport.
- nb__search's no-match branch sets the key. Any tool, in-process or bundle,
  opts in the same way.

Read by key, not folded into the hash, so the input-aware success path is
untouched (varied-input real work still never trips).

Composes with the input-aware success fingerprint (#324): that fixed false
trips on progress; this fixes missed trips on no-progress. Follow-up to #361,
which removed the cache bug that triggered the original incident.
…ng doc

Review polish on PR #363:
- Supervisor synth directive now says "made no progress N times in a row"
  instead of "returned the same result N times" — accurate across all three
  trip modes (the non-advancing path varies result text, so "same result"
  was imprecise for it).
- Document where _meta forwarding lives: the two serialization boundaries
  (defineInProcessApp, McpSource); a direct ToolSource carries _meta natively.
  Answers "does delegate/upjack need a forwarding one-liner?" in the contract
  — no, they don't own a boundary, so the channel is already general.
@mgoldsborough mgoldsborough added the qa-reviewed QA review completed with no critical issues label Jun 2, 2026
@mgoldsborough mgoldsborough merged commit d08f1c3 into main Jun 2, 2026
5 checks passed
@mgoldsborough mgoldsborough deleted the feat/loop-brake-non-advancing branch June 2, 2026 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

qa-reviewed QA review completed with no critical issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant