Brain: meta-decisioning layer for session lifecycle & routing #258

@mercurialsolo

Summary

Promote the brain from "supervisor that warns" to "controller that decides" along four meta-axes: model routing (latency), compaction timing, session lifecycle (when to start fresh), and execution locus (local vs hosted). The unifying claim: every Claude Code session loses time and money to bad meta-decisions made (or not made) by the human, when the brain already has the signals to decide better.

This is an epic: shared controller surface, shared risk framing, decomposable into 4 sub-issues once design lands.

The four axes

1. Latency-aware model routing

  • Today: brain rules can route actions (#240, "Brain: tiered model routing (Haiku for obvious calls, escalate on ambiguity)"), but the routing dimension is cost, not speed.
  • Proposed: classify each turn's likely difficulty (file read, simple grep, "is X true" → trivial; multi-file refactor, novel design → hard). Trivial turns → Haiku or a local model; escalate to Sonnet/Opus on uncertainty or post-hoc verification mismatch.
  • Inputs already available: diff_digest.rs risk score, decisions.rs few-shot retrieval, transcript turn type, tool call type from monitor.rs.
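
A minimal sketch of the classification step described above, under stated assumptions: `TurnKind`, `Difficulty`, `Tier`, `classify`, and `pick_tier` are hypothetical names, the risk score is assumed to be a value in [0, 1] supplied by diff_digest.rs, and the threshold is an illustrative parameter rather than a tuned value.

```rust
/// Hypothetical turn kinds; the real signal would come from monitor.rs
/// (tool call type) and the transcript turn type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TurnKind {
    FileRead,
    Grep,
    FactCheck,
    MultiFileRefactor,
    NovelDesign,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Difficulty {
    Trivial,
    Hard,
}

/// Hypothetical model tiers: Fast stands for Haiku or a local model,
/// Strong for Sonnet/Opus.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Fast,
    Strong,
}

/// Map a turn kind to a coarse difficulty class.
pub fn classify(kind: TurnKind) -> Difficulty {
    match kind {
        TurnKind::FileRead | TurnKind::Grep | TurnKind::FactCheck => Difficulty::Trivial,
        TurnKind::MultiFileRefactor | TurnKind::NovelDesign => Difficulty::Hard,
    }
}

/// Route to the fast tier only when the turn is trivial AND the risk score
/// is below the threshold. Anything else (including a NaN score, which
/// fails the `<` comparison) goes to the strong tier.
pub fn pick_tier(kind: TurnKind, risk_score: f32, risk_threshold: f32) -> Tier {
    if classify(kind) == Difficulty::Trivial && risk_score < risk_threshold {
        Tier::Fast
    } else {
        Tier::Strong
    }
}
```

With these assumptions, `pick_tier(TurnKind::Grep, 0.1, 0.5)` picks the fast tier, while a trivial turn with a high risk score still goes to the strong tier. Post-hoc verification is not modeled here.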

2. Proactive compaction triggers

  • Today: health.rs nudges the user at 50% cognitive decay; #248 ("Brain: PreCompact intervention to preserve context across compaction") preserves state when compaction happens.
  • Proposed: brain decides "compact at THIS turn boundary": when context is at 60% AND we just shipped a logical unit AND the next turn looks independent. Compaction at a clean boundary preserves far more useful state than compaction in the middle of a multi-step refactor.
  • Surfaces as an active action, not a passive nudge.
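
The three-way condition above can be sketched as a single predicate. `CompactionSignals` and `should_compact_now` are hypothetical names, and "at 60%" is read here as "at or above 60% of the context window"; how the two boolean signals are actually derived is left open.

```rust
/// Hypothetical inputs for the compaction decision.
pub struct CompactionSignals {
    /// Fraction of the context window in use, in [0, 1].
    pub context_used: f32,
    /// True when the previous turn closed a logical unit of work.
    pub just_shipped_unit: bool,
    /// True when the upcoming turn does not depend on in-flight state.
    pub next_turn_independent: bool,
}

/// Threshold taken from the proposal (context at 60%).
pub const COMPACT_THRESHOLD: f32 = 0.60;

/// Compact only at a clean boundary: all three conditions must hold.
pub fn should_compact_now(s: &CompactionSignals) -> bool {
    s.context_used >= COMPACT_THRESHOLD && s.just_shipped_unit && s.next_turn_independent
}
```

Requiring all three signals means a nearly full context in the middle of a refactor does not trigger compaction from this predicate alone; that case would stay with the existing nudge path.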

3. Auto session lifecycle (new-session on task boundary)

4. Execution locus (local vs hosted)

  • A new axis that #240 ("Brain: tiered model routing (Haiku for obvious calls, escalate on ambiguity)") doesn't cover. Privacy-sensitive turns (touching .env, secrets, internal docs, customer data) stay on a local model; everything else goes remote.
  • Market signal: "Hugging Face co-founder says Qwen 3.6 27B running on airplane mode is close to latest Opus" — 2.3K + 2.0K cross-post upvotes. Local-model appetite is real and growing.
  • Implementation: extend the brain's model-selection action with a locus dimension; configure default-local globs (.env*, secrets/**, etc.) in the same shape as deny rules.
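
A rough sketch of the default-local check, assuming a hypothetical `Locus` enum and `choose_locus` function. The matcher below is deliberately tiny and handles only the two pattern shapes named above (`prefix*` on the file name and a root-anchored `dir/**` subtree); a real implementation would presumably reuse whatever glob matching the deny rules already use.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Locus {
    Local,
    Remote,
}

/// Illustrative matcher, NOT a full glob engine. Supports:
///   "dir/**"  -> any path under the root-level directory `dir`
///   "prefix*" -> any path whose file name starts with `prefix`
///   otherwise -> exact path match
fn matches(pattern: &str, path: &str) -> bool {
    if let Some(dir) = pattern.strip_suffix("/**") {
        // Require a '/' right after the directory name so that
        // "secrets/**" does not match "secrets_old/x".
        return path
            .strip_prefix(dir)
            .map_or(false, |rest| rest.starts_with('/'));
    }
    if let Some(prefix) = pattern.strip_suffix('*') {
        let file_name = path.rsplit('/').next().unwrap_or(path);
        return file_name.starts_with(prefix);
    }
    path == pattern
}

/// A turn stays local if ANY touched path matches ANY default-local glob.
pub fn choose_locus(touched_paths: &[&str], default_local: &[&str]) -> Locus {
    let sensitive = touched_paths
        .iter()
        .any(|path| default_local.iter().any(|glob| matches(glob, path)));
    if sensitive {
        Locus::Local
    } else {
        Locus::Remote
    }
}
```

The check is intentionally one-directional: a single sensitive path pins the whole turn to the local model, regardless of cost or latency.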

Why one epic (not 4 issues)

  • They share a single controller surface — a new brain::router module that takes (turn context, available models) and returns (model, locus, plus optional pre-actions like compact-now).
  • They share the same risk framing (see below) — solving it once is much better than re-litigating per axis.
  • They share the same telemetry: every meta-decision should be logged so we can measure whether the brain actually saves time/cost vs human baseline.
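
To make the shared surface concrete, here is one possible shape for the router's input and output, offered as a sketch rather than the intended design. Every type and field name (`TurnContext`, `ModelInfo`, `Decision`, `PreAction`, `route`) is hypothetical, as are the numeric capability scale and the "fastest adequate model" selection rule.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Locus {
    Local,
    Remote,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PreAction {
    CompactNow,
    NewSession,
}

/// Hypothetical description of an available model.
pub struct ModelInfo {
    pub name: &'static str,
    pub local: bool,
    pub capability: u8,
    pub latency_ms: u32,
}

/// Hypothetical per-turn context, already reduced to decision inputs.
pub struct TurnContext {
    pub required_capability: u8,
    pub privacy_sensitive: bool,
    pub compact_at_boundary: bool,
}

/// One record per meta-decision; `why` doubles as the telemetry/log line.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Decision {
    pub model: String,
    pub locus: Locus,
    pub pre_actions: Vec<PreAction>,
    pub why: String,
}

/// Pick the lowest-latency model that is capable enough and, for
/// privacy-sensitive turns, local. Returns None when nothing qualifies,
/// in which case the caller would keep the human's choice.
pub fn route(ctx: &TurnContext, models: &[ModelInfo]) -> Option<Decision> {
    let chosen = models
        .iter()
        .filter(|m| m.capability >= ctx.required_capability)
        .filter(|m| !ctx.privacy_sensitive || m.local)
        .min_by_key(|m| m.latency_ms)?;

    let mut pre_actions = Vec::new();
    if ctx.compact_at_boundary {
        pre_actions.push(PreAction::CompactNow);
    }

    Some(Decision {
        model: chosen.name.to_string(),
        locus: if chosen.local { Locus::Local } else { Locus::Remote },
        pre_actions,
        why: format!(
            "fastest model with capability >= {} (privacy_sensitive={})",
            ctx.required_capability, ctx.privacy_sensitive
        ),
    })
}
```

Returning a single `Decision` value keeps all four axes on one code path, so logging, overrides, and shadow mode can be applied once at the call site instead of per axis.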

Risk framing (load-bearing — must solve before any axis ships on-by-default)

Promoting brain from advisor to controller introduces a new failure mode: a wrong meta-decision (Haiku on a hard task, compaction at the wrong boundary, new-session that lost a load-bearing thread) produces a worse outcome and the user blames us.

Required mitigations before any axis lands:

  1. Escalation on uncertainty. Routed turn's output triggers a verification pass; mismatch → re-run on the stronger model with the cheap-run output as context. Cost ceiling: at most 2× a baseline run on uncertainty, never higher.
  2. Always overridable. Single keystroke in TUI to disable any axis for the current session. Persistent meta.disabled = ["routing", "compaction"] in ~/.claudectl/brain/.
  3. Transparent surface. Every meta-decision shows a "why this model / why compact now / why new session" line. Pairs with the /why command in #243 ("Brain: /why command — explain decision lineage (preferences, few-shots, hive units, context)").
  4. Shadow mode first. Each axis ships in shadow mode (brain logs what it would have done, doesn't act) for a release cycle. Action mode only after shadow-mode metrics show net improvement.
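
The interaction between shadow mode, escalation, and the 2× cost ceiling can be sketched as follows. This is an illustration under simplifying assumptions: `Mode`, `Outcome`, and `run_routed` are hypothetical names, costs are abstract units, the verification pass itself is treated as free, and the verification result is passed in as a boolean rather than computed.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    /// Log what the router would have done; do not act.
    Shadow,
    /// Act on the router's decision.
    Active,
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Shadow mode: the decision is only logged.
    LoggedOnly,
    /// Cheap run passed verification.
    CheapAccepted { cost: f64 },
    /// Cheap run failed verification; re-run on the stronger model.
    Escalated { cost: f64 },
    /// Routing skipped because escalation could exceed the ceiling.
    BaselineOnly { cost: f64 },
}

/// Ceiling from the mitigation list: at most 2x a baseline run.
pub const COST_CEILING_FACTOR: f64 = 2.0;

pub fn run_routed(mode: Mode, cheap_cost: f64, baseline_cost: f64, verified: bool) -> Outcome {
    if mode == Mode::Shadow {
        return Outcome::LoggedOnly;
    }
    // Worst case is the cheap run plus a full re-run on the strong model.
    // If that could exceed the ceiling, do not route at all.
    if cheap_cost + baseline_cost > COST_CEILING_FACTOR * baseline_cost {
        return Outcome::BaselineOnly { cost: baseline_cost };
    }
    if verified {
        Outcome::CheapAccepted { cost: cheap_cost }
    } else {
        Outcome::Escalated { cost: cheap_cost + baseline_cost }
    }
}
```

Checking the ceiling before the cheap run, rather than after, is what guarantees "never higher": the worst-case spend is known up front, so a cheap model that costs more than the baseline is never tried.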

Cross-references

Decomposition (once design lands)

Test plan (epic-level)

  • Shadow-mode metrics: % of turns where router agrees with human's actual choice (baseline); cost/latency delta if router had acted.
  • Escalation correctness: synthetic hard-task → Haiku → verification fails → Opus re-run produces correct output.
  • Override path: single keystroke disable persists across restart.
  • Privacy locus: synthetic .env-touching turn routes local even if remote is "cheaper."
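
The two shadow-mode metrics in the first bullet could be computed along these lines. `ShadowRecord` and its fields are hypothetical; the sketch assumes each logged turn records both the router's would-be choice and the human's actual choice, with a cost estimate for each.

```rust
/// Hypothetical shadow-mode log entry for one turn.
pub struct ShadowRecord {
    pub router_model: &'static str,
    pub human_model: &'static str,
    /// Estimated cost if the router had acted.
    pub router_cost: f64,
    /// Actual cost of the human's choice.
    pub human_cost: f64,
}

/// Fraction of turns where the router agrees with the human's choice.
/// Returns None for an empty log to avoid dividing by zero.
pub fn agreement_rate(records: &[ShadowRecord]) -> Option<f64> {
    if records.is_empty() {
        return None;
    }
    let agree = records
        .iter()
        .filter(|r| r.router_model == r.human_model)
        .count();
    Some(agree as f64 / records.len() as f64)
}

/// Total cost delta if the router had acted; negative means a saving.
pub fn cost_delta(records: &[ShadowRecord]) -> f64 {
    records.iter().map(|r| r.router_cost - r.human_cost).sum()
}
```

The same shape would work for latency by swapping the cost fields for durations; whether a negative delta counts as "net improvement" still depends on the escalation rate, which this sketch does not capture.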

Priority

P1 — high market intensity (cost + latency are the universal pains), strong fit with existing brain primitives, but gated on the risk-framing work (shadow mode, escalation, overrides). Lands E1.0 first; rest decomposes once design proves out.

Metadata

    Labels

    P1 (High priority), architecture (Core architecture decisions), brain (Local LLM brain subsystem), enhancement (New feature or request), performance (Performance critical path)
