Brain: meta-decisioning layer for session lifecycle & routing #258

## Summary

Promote the brain from "supervisor that warns" to "controller that decides" along four meta-axes: model routing (latency), compaction timing, session lifecycle (when to start fresh), and execution locus (local vs hosted). The unifying claim: every Claude Code session loses time and money to bad meta-decisions made (or not made) by the human, when the brain already has the signals to decide better.

This is an **epic**: shared controller surface, shared risk framing, decomposable into 4 sub-issues once design lands.

## The four axes

### 1. Latency-aware model routing
- Today: brain rules can `route` actions (#240) but the dimension is cost, not speed.
- Proposed: classify each turn's likely difficulty (file read, simple grep, "is X true" → trivial; multi-file refactor, novel design → hard). Trivial turns → Haiku or a local model; escalate to Sonnet/Opus on uncertainty or post-hoc verification mismatch.
- Inputs already available: `diff_digest.rs` risk score, `decisions.rs` few-shot retrieval, transcript turn type, tool call type from `monitor.rs`.
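
The classification and routing step above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the brain's actual code: `TurnKind`, `Difficulty`, `Tier`, `classify`, and `route` are hypothetical names, the 0.7 confidence threshold is an arbitrary placeholder, and a real classifier would draw on the signals listed above rather than a fixed enum.

```rust
/// Hypothetical coarse turn categories, taken from the examples in this issue.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TurnKind {
    FileRead,
    SimpleGrep,
    FactCheck, // "is X true"
    MultiFileRefactor,
    NovelDesign,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Difficulty {
    Trivial,
    Hard,
}

/// Hypothetical model tiers: `Fast` stands for Haiku or a local model,
/// `Strong` for Sonnet/Opus.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tier {
    Fast,
    Strong,
}

fn classify(kind: TurnKind) -> Difficulty {
    match kind {
        TurnKind::FileRead | TurnKind::SimpleGrep | TurnKind::FactCheck => Difficulty::Trivial,
        TurnKind::MultiFileRefactor | TurnKind::NovelDesign => Difficulty::Hard,
    }
}

/// Route to the fast tier only when the turn is trivial AND the classifier
/// is confident; anything uncertain goes to the strong tier.
fn route(kind: TurnKind, confidence: f64) -> Tier {
    const MIN_CONFIDENCE: f64 = 0.7; // assumption: placeholder threshold
    if classify(kind) == Difficulty::Trivial && confidence >= MIN_CONFIDENCE {
        Tier::Fast
    } else {
        Tier::Strong
    }
}
```

Defaulting to the strong tier whenever confidence is low keeps the failure mode on the expensive side rather than the wrong-answer side.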

### 2. Proactive compaction triggers
- Today: `health.rs` nudges the user at 50% cognitive decay; #248 preserves state *when* compaction happens.
- Proposed: brain decides "compact at THIS turn boundary": when context is at 60% AND we just shipped a logical unit AND the next turn looks independent. Compaction at a clean boundary preserves far more useful state than compaction in the middle of a multi-step refactor.
- Surfaces as an active action, not a passive nudge.
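
The three-part trigger condition above might look like this as a predicate. It is only a sketch: `BoundarySignals` and `should_compact_now` are hypothetical names, and how "shipped a logical unit" and "next turn looks independent" get detected is left open here.

```rust
/// Hypothetical snapshot of the signals named in the proposal.
struct BoundarySignals {
    /// Fraction of the context window in use, 0.0..=1.0.
    context_used: f64,
    /// True when the last turn completed a logical unit of work.
    shipped_logical_unit: bool,
    /// True when the next turn does not appear to depend on in-flight state.
    next_turn_independent: bool,
}

/// The 60% threshold comes from the proposal text.
const COMPACT_THRESHOLD: f64 = 0.60;

/// Compact only when all three conditions hold at this turn boundary.
fn should_compact_now(s: &BoundarySignals) -> bool {
    s.context_used >= COMPACT_THRESHOLD && s.shipped_logical_unit && s.next_turn_independent
}
```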

### 3. Auto session lifecycle (new-session on task boundary)
- Today: #245 is about *forking away from a bad path*. This is different: detect "we finished feature A, the next ask is feature B" and seed a fresh session with a brain-built briefing.
- Reuses `briefing.rs` (#246) — the briefing primitive already exists; the missing piece is the *trigger*.
- Market signal: r/ClaudeCode 396-upvote *"built our entire product, nobody understands what we built"*; the top reply names the every-session reset as the core pain. Users know they should `/clear`, but the timing is a judgment call. The brain can take that judgment off their plate.

### 4. Execution locus (local vs hosted)
- A new axis that #240 doesn't cover. Privacy-sensitive turns (touching `.env`, secrets, internal docs, customer data) stay on a local model; everything else goes remote.
- Market signal: *"Hugging Face co-founder says Qwen 3.6 27B running on airplane mode is close to latest Opus"* — 2.3K + 2.0K cross-post upvotes. Local-model appetite is real and growing.
- Implementation: extend the brain's model-selection action with a locus dimension; configure default-local globs (`.env*`, `secrets/**`, etc.) in the same shape as deny rules.
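
A rough sketch of the locus decision with default-local globs follows. The names `Locus`, `matches`, and `choose_locus` are hypothetical, and the matcher is a deliberately tiny stand-in that handles only the two pattern shapes quoted above (`dir/**` and `prefix*`); a real implementation would presumably reuse whatever glob engine the deny rules already use.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Locus {
    Local,
    Remote,
}

/// Simplified matcher (assumption: not a full glob implementation).
/// - `dir/**` matches any path under the top-level directory `dir`.
/// - `prefix*` matches when the final path component starts with `prefix`.
/// - Anything else must equal the final path component exactly.
fn matches(pattern: &str, path: &str) -> bool {
    if let Some(dir) = pattern.strip_suffix("/**") {
        return path
            .strip_prefix(dir)
            .map_or(false, |rest| rest.starts_with('/'));
    }
    let file_name = path.rsplit('/').next().unwrap_or(path);
    match pattern.strip_suffix('*') {
        Some(prefix) => file_name.starts_with(prefix),
        None => file_name == pattern,
    }
}

/// A turn stays local if ANY touched path matches ANY default-local glob.
fn choose_locus(touched_paths: &[&str], default_local: &[&str]) -> Locus {
    let sensitive = touched_paths
        .iter()
        .any(|path| default_local.iter().any(|glob| matches(glob, path)));
    if sensitive {
        Locus::Local
    } else {
        Locus::Remote
    }
}
```

Note that the privacy check is independent of cost: a single sensitive path pins the whole turn to the local model.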

## Why one epic (not 4 issues)

- They share a single controller surface — a new `brain::router` module that takes (turn context, available models) and returns (model, locus, plus optional pre-actions like compact-now).
- They share the same risk framing (see below) — solving it once is much better than re-litigating per axis.
- They share the same telemetry: every meta-decision should be logged so we can measure whether the brain actually saves time/cost vs human baseline.
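
To make the shared surface concrete, here is one possible shape for the router's input and output. Everything in it is an assumption for discussion: the type and field names are hypothetical, the boolean context fields stand in for richer signals, and the "models are ordered fastest to strongest" convention is invented for the sketch.

```rust
#[derive(Debug, Clone, PartialEq)]
enum PreAction {
    CompactNow,
    StartFreshSession,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Locus {
    Local,
    Remote,
}

/// Hypothetical per-turn context; real signals would be richer than booleans.
struct TurnContext {
    looks_trivial: bool,
    touches_sensitive_paths: bool,
    at_clean_boundary: bool,
    task_changed: bool,
}

/// One decision covering all four axes.
struct Decision {
    model: String,
    locus: Locus,
    pre_actions: Vec<PreAction>,
}

/// Takes (turn context, available models), returns (model, locus, pre-actions).
/// Assumption: `available_models` is ordered from fastest to strongest.
/// Returns `None` when no model is available.
fn decide(ctx: &TurnContext, available_models: &[&str]) -> Option<Decision> {
    let model = if ctx.looks_trivial {
        available_models.first()?
    } else {
        available_models.last()?
    };
    let locus = if ctx.touches_sensitive_paths {
        Locus::Local
    } else {
        Locus::Remote
    };
    let mut pre_actions = Vec::new();
    if ctx.task_changed {
        pre_actions.push(PreAction::StartFreshSession);
    } else if ctx.at_clean_boundary {
        pre_actions.push(PreAction::CompactNow);
    }
    Some(Decision {
        model: (*model).to_string(),
        locus,
        pre_actions,
    })
}
```

Returning a single `Decision` value also gives telemetry one record type to log per turn.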

## Risk framing (load-bearing — must solve before any axis ships on-by-default)

Promoting brain from advisor to controller introduces a new failure mode: a wrong meta-decision (Haiku on a hard task, compaction at the wrong boundary, new-session that lost a load-bearing thread) produces a worse outcome and the user blames us.

Required mitigations before any axis ships on by default:

1. **Escalation on uncertainty.** A routed turn's output triggers a verification pass; on mismatch, the turn is re-run on the stronger model with the cheap-run output as context. Cost ceiling: an escalated turn costs at most 2× a baseline run, never more.
2. **Always overridable.** Single keystroke in TUI to disable any axis for the current session. Persistent `meta.disabled = ["routing", "compaction"]` in `~/.claudectl/brain/`.
3. **Transparent surface.** Every meta-decision shows a "why this model / why compact now / why new session" line. Pairs with #243 (`/why`).
4. **Shadow mode first.** Each axis ships in shadow mode (brain logs what it *would* have done, doesn't act) for a release cycle. Action mode only after shadow-mode metrics show net improvement.
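
The escalation flow in mitigation 1 can be sketched as below. This is an illustration only: `Run`, `Escalated`, `run_with_escalation`, and `within_cost_ceiling` are hypothetical names, the models and verifier are passed in as closures so the sketch stays self-contained, and "baseline" is taken here to mean a direct run on the stronger model, which is an assumption the issue does not spell out.

```rust
/// Result of one model run. `cost` is in arbitrary units.
struct Run {
    output: String,
    cost: f64,
}

/// Final outcome after optional escalation.
struct Escalated {
    output: String,
    total_cost: f64,
    escalated: bool,
}

/// Run the cheap model first; if verification fails, re-run on the strong
/// model with the cheap output passed along as context.
fn run_with_escalation<C, S, V>(prompt: &str, cheap: C, strong: S, verify: V) -> Escalated
where
    C: Fn(&str) -> Run,
    S: Fn(&str, Option<&str>) -> Run,
    V: Fn(&str) -> bool,
{
    let first = cheap(prompt);
    if verify(&first.output) {
        return Escalated {
            output: first.output,
            total_cost: first.cost,
            escalated: false,
        };
    }
    let second = strong(prompt, Some(first.output.as_str()));
    Escalated {
        output: second.output,
        total_cost: first.cost + second.cost,
        escalated: true,
    }
}

/// Cost-ceiling check from the mitigation: total spend must stay within
/// 2x the baseline run's cost.
fn within_cost_ceiling(total_cost: f64, baseline_cost: f64) -> bool {
    total_cost <= 2.0 * baseline_cost
}
```

Because the flow escalates at most once, the worst case is one cheap run plus one strong run, which stays under the 2× ceiling as long as the cheap run costs no more than the baseline.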

## Cross-references

- Builds on / does not duplicate: #240 (cost routing), #248 (PreCompact state preservation), #245 (session forking), #246 (briefing primitive — shipped), #237 (diff-aware risk — shipped), #243 (/why surface).
- Pairs with: #253 (rate-limit forecast — triggers routing tier-down when wall is near), #252 (spend forensics — turns meta-decisions into auditable cost-savings).
- Adjacent: #140 (incident post-mortem framework) — wrong meta-decisions need a structured way to learn from.

## Decomposition (once design lands)

- **E1.0** Shared `brain::router` module + shadow-mode telemetry + override surface (lands first; blocks E1.1–E1.4).
- **E1.1** Latency-aware routing (axis 1): depends on shared router surface.
- **E1.2** Proactive compaction triggers (axis 2): depends on shared router surface + #248.
- **E1.3** Auto session lifecycle (axis 3): depends on shared router surface + #246 (done).
- **E1.4** Local-vs-hosted routing (axis 4): depends on shared router surface + locus config schema.

## Test plan (epic-level)

- [ ] Shadow-mode metrics: % of turns where router agrees with human's actual choice (baseline); cost/latency delta if router had acted.
- [ ] Escalation correctness: synthetic hard-task → Haiku → verification fails → Opus re-run produces correct output.
- [ ] Override path: single-keystroke disable takes effect for the current session; `meta.disabled` config persists across restart.
- [ ] Privacy locus: synthetic `.env`-touching turn routes local even if remote is "cheaper."
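
The shadow-mode metrics in the first checklist item could be computed along these lines. This is a sketch with hypothetical names (`ShadowRecord`, `agreement_rate`, `cost_delta`); the record fields are assumptions about what shadow-mode telemetry would log, not an existing schema.

```rust
/// One shadow-mode log entry: what the router would have done vs. what
/// actually happened. Costs are in arbitrary units.
struct ShadowRecord {
    router_choice: String,
    human_choice: String,
    /// Estimated cost had the router acted.
    router_cost: f64,
    /// Observed cost of the human's actual choice.
    human_cost: f64,
}

/// Fraction of turns where the router agrees with the human's choice.
/// Returns `None` for an empty log to avoid dividing by zero.
fn agreement_rate(records: &[ShadowRecord]) -> Option<f64> {
    if records.is_empty() {
        return None;
    }
    let agree = records
        .iter()
        .filter(|r| r.router_choice == r.human_choice)
        .count();
    Some(agree as f64 / records.len() as f64)
}

/// Total cost difference if the router had acted; negative means savings.
fn cost_delta(records: &[ShadowRecord]) -> f64 {
    records.iter().map(|r| r.router_cost - r.human_cost).sum()
}
```

The same shape would work for latency by swapping the cost fields for durations.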

## Priority

**P1** — high market intensity (cost + latency are the universal pains), strong fit with existing brain primitives, but gated on the risk-framing work (shadow mode, escalation, overrides). Lands E1.0 first; rest decomposes once design proves out.
