fix(scope-coherence): isolate implementation depth counter per session#133
fix(scope-coherence): isolate implementation depth counter per session#133brianjones-v4n wants to merge 1 commit into
Conversation
The scope-coherence hooks tracked implementationDepth as a single global counter shared across all sessions. This caused false-positive checkpoint triggers: a previous session's accumulated counter of 16 would push a new session past the 20-action threshold after only 4 actions, blocking the agent with a spurious "SCOPE COHERENCE CHECK" interruption. Key the state by session ID (INSTAR_SESSION_ID, with a pid-based fallback for manual terminal sessions). Each session starts at depth 0 and counts independently. Cooldown and dismissal tracking remain global since they prevent excessive nagging across sessions. Old sessions are pruned after 10 entries to prevent unbounded state file growth. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
@brianjones-v4n is attempting to deploy a commit to the sagemind Team on Vercel. A member of the Team first needs to authorize it. |
JKHeadley
left a comment
There was a problem hiding this comment.
Echo's Review — PR #133
Recommendation: merge (pending @JKHeadley approval — touches .instar/hooks/)
Value: moderate | Strategy: comment-only
Summary
This is a well-scoped fix for a real bug: a global implementationDepth counter in scope-coherence.json was bleeding across sessions, causing fresh sessions to trip the 20-action checkpoint after only ~4 tool calls. The PR moves per-session state under a new sessions map keyed by INSTAR_SESSION_ID, with manual-{pid} fallback for terminal sessions. Cooldown and dismissal counts are intentionally kept global (preventing nag-storms across sessions), with explanatory comments on the asymmetry.
Strengths
- Real bug, real repro — the scope-coherence false-trip is something we've hit too; the inline state inspection (
implementationDepth: 16carryover) matches the failure mode. - Touches both source-of-truth files — runtime hooks (
.instar/hooks/instar/*) AND thePostUpdateMigrator.tstemplate that generates them on update. Skipping the template would silently regress the fix on the next install. - Backward-compatible state shape — old flat-key state is ignored cleanly;
loadStatereturns{ sessions: {} }and the next write overwrites with the new shape. No migration needed. - Bounded growth —
MAX_SESSION_ENTRIES = 10with newest-first pruning prevents thesessionsmap from accumulating indefinitely. - Headless-job carve-out — the new
INSTAR_SESSION_ID && !TERM_PROGRAMearly-approve in the checkpoint hook avoids blocking unattended sessions where there's no human to dismiss the prompt. - Comments explain intent — the global-vs-per-session asymmetry is annotated so a future reader doesn't "fix" it.
Concerns
- No regression test for the new behavior (severity: moderate). The repo has integration/e2e tests for the scope-coherence API routes but not the hook scripts themselves, so this PR doesn't break any existing test, and it follows the prevailing pattern for hook coverage. Still, a small unit/integration test that drives the collector with two synthetic
INSTAR_SESSION_IDs and asserts depth isolation would lock in the fix and catch future regressions. - Concurrent file writes are not atomic (severity: minor).
loadState→ mutate →saveStateis racy if two sessions' PostToolUse hooks fire at the same instant. The prior global counter had the same race, but per-session isolation slightly increases the surface (a session can lose its first depth increment if another session writes between load and save). Worst case is the threshold takes longer to hit — not a correctness disaster, but worth noting. TERM_PROGRAMheuristic for headless detection is fragile (severity: minor). Custom terminals or some shell setups may not setTERM_PROGRAM, in which case interactive sessions would be treated as headless and skip the checkpoint entirely. A more durable signal (e.g., presence of stdin TTY, or an explicitINSTAR_HEADLESSflag set by the spawner) would be more robust. Acceptable for v1 if the intent is "best-effort silencing for jobs."
Test Coverage
No new tests. Existing API-route tests still pass (and remain unaffected — they don't exercise the hook scripts). The hook scripts themselves have always been untested in this repo, so this PR doesn't regress coverage but doesn't improve it either. Recommend a follow-up to add a small harness that pipes JSON to the collector hook and verifies the per-session state shape.
Verdict
The fix is correct, scoped, and well-explained. The only structural blocker is policy: this PR modifies .instar/hooks/, which our review policy classifies as security-critical and requires explicit @JKHeadley approval — Echo cannot approve. Once Justin signs off, this is safe to merge as-is. The follow-up I'd want to see (regression test for per-session isolation) can land separately.
Next Steps
- @JKHeadley to review and approve (security-path policy requires human approval)
- Optional follow-up PR: a small test that drives the collector with two
INSTAR_SESSION_IDvalues and asserts depth isolation + pruning atMAX_SESSION_ENTRIES
Thanks @brianjones-v4n — clean diff, clear repro, correct dual-update across the runtime hook and the migrator template. Appreciated.
Automated review by Echo — instar's developer agent. This review was generated by an AI system. For questions or concerns, please tag @JKHeadley.
JKHeadley
left a comment
There was a problem hiding this comment.
Stage 2 Deep Review — Recommend Merge
Reviewer: Echo (AI) — automated deep review, not an approval. Human maintainer decision required.
Summary
Tight, well-targeted fix for the false-positive scope-coherence checkpoint. The diagnosis (one global implementationDepth accumulating across sessions) matches the symptom (fresh sessions tripping the 20-action threshold after ~4 actions). Per-session keying is the right structural fix.
What I verified
- Live hook ↔ template parity. The new logic lands in three places —
scope-coherence-collector.js,scope-coherence-checkpoint.js, and the corresponding string templates insidePostUpdateMigrator.ts. Spot-checked the migrator output: both the headless-session bypass (INSTAR_SESSION_ID && !TERM_PROGRAM → approve) and thesessions: {}shape are present in the template, so fresh installs will get the same behavior as the live hooks. No drift. - Session ID source.
INSTAR_SESSION_IDis set by the server for spawned sessions;manual-{pid}is a reasonable fallback for terminal sessions. PID reuse is a theoretical concern but bounded byMAX_SESSION_ENTRIES = 10pruning plus the per-session 5-min settle window, so a stale reused entry can at worst delay one nag — not block. - Global vs per-session split. Cooldown (
lastCheckpointPrompt) and dismissal counter remain global, intentionally — preventing nag flood across sessions. The inline comment annotates this so a future reader doesn't "fix" it. Good. - Headless bypass. New: server-spawned/job sessions skip the block entirely (no TTY to show the message to). Correct — previously they'd silently lose tool actions to a UX prompt nobody could see.
- Pruning.
pruneSessions()sorts bysessionStartdescending and drops the tail past 10. Stable, no unbounded growth. - CI. All shards green on Node 20/22, Integration, E2E, Build. Vercel "failure" is an auth state, not a build failure.
Security / injection scan
No prompt-injection or instruction-override patterns in the diff. State-file I/O is wrapped in try/catch and writes a plain JSON object; no untrusted-content interpolation.
Non-blocking observation
The PR ships hook-script logic with no automated regression test. The change is simple enough that CI passing is a meaningful signal, but a unit test that simulates two concurrent sessions hitting the collector (and asserting they don't trip each other's counters) would prevent re-regression. Suggest as a follow-up, not a blocker.
Recommendation
Safe to merge. Internally consistent (live hook + template), behaviorally minimal for the happy path, fixes a real false-positive that's been observed in the wild. Thanks @brianjones-v4n.
🤖 Stage 2 review by Echo. AI agents do not approve PRs — final merge decision belongs to a human maintainer.
Summary
implementationDepth,sessionStart,sessionDocsRead, and related fields move from top-level keys to per-session entries under a newsessionsmap, keyed byINSTAR_SESSION_ID(with amanual-{pid}fallback for terminal sessions).MAX_SESSION_ENTRIES = 10).Problem
The scope-coherence hooks used a single global
implementationDepthcounter in.instar/state/scope-coherence.json. The counter was set once (if (!state.sessionStart)) and never reset between sessions. A prior session that accumulated depth 16 would leave the counter at 16, and the next session would trip the 20-action threshold after only 4 Edit/Write/Bash calls — producing a spurious "SCOPE COHERENCE CHECK" block.Additionally, concurrent sessions in the same project would share the same counter, causing cross-session interference: session A's writes could trip session B's checkpoint.
Files changed
.instar/hooks/instar/scope-coherence-collector.js— PostToolUse hook, tracks tool actions.instar/hooks/instar/scope-coherence-checkpoint.js— Stop hook, triggers scope zoom-outsrc/core/PostUpdateMigrator.ts— Template methods that generate the hook files on installDesign Decisions
INSTAR_SESSION_ID— set by the Instar server for spawned sessions. Falls back tomanual-{pid}for terminal sessions.MAX_SESSION_ENTRIES = 10— named constant with inline comment, not a magic number.Repro
A real-world hit: a fresh agent session got blocked by SCOPE COHERENCE CHECK after only ~4 tool actions. Inspection of
.instar/state/scope-coherence.jsonshowedimplementationDepth: 16left over from a prior session's work earlier in the day.🤖 Generated with Claude Code