feat(conversations): server-authoritative resumable chat turns + tab rewrite#302
Open
Ovaculos wants to merge 21 commits into
Open
feat(conversations): server-authoritative resumable chat turns + tab rewrite#302Ovaculos wants to merge 21 commits into
Ovaculos wants to merge 21 commits into
Conversation
Decouple a chat turn's lifecycle from the originating HTTP request. A turn
now runs to completion on the server regardless of the client connection;
the client is a viewer that attaches to a replayable per-conversation event
log (RunBus), replays the in-flight turn, then tails live events. Refresh /
conversation-switch / disconnect never lose or duplicate work — they detach
and re-attach. Only the Stop button (RunBus.cancel) aborts generation.
- RunBus: in-memory, replayable, monotonic-seq turn log with grace-window
retention for late re-attach; threads its own AbortSignal into the engine.
- runtime.startTurn: detached turn driver; seeds user.message into the log;
reports isTurnActive / activeConversationIds / turnSeq / getTurnReplay.
- Per-conversation SSE: ConversationEventManager + /v1/conversations/:id/events
route with subscribed{isActive} frame + buffered replay; /v1/chat/start and
cancel endpoints.
- Broadcast conversation.title + data.changed on the global SSE when a title
generates, so viewers update without a turn-stream connection.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…imbleBrainInc#253) The single-string transcript prompt made the fast model pattern-match a conversation to continue and emit the start of the assistant's response as the title (worst on creative/long answers). Use real role turns (user → assistant → user-instruction); the trailing user turn is an unambiguous instruction, not text to continue. Closes NimbleBrainInc#253 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the single shared useChat with a per-conversation slice store. Each conversation owns its own state; a turn's stream writes only into its origin slice, so switching conversations mid-turn never bleeds the response into the destination chat (NimbleBrainInc#254) and switching back shows it still arriving. - chat-store: slice map (LRU-capped) backing useChat via useSyncExternalStore; sendTurn → /v1/chat/start then subscribe via conversation-stream (replay + live tail). Resume reflects server isActive (indicator/Stop survive reload), trims the stale in-flight turn from disk, and drops grace-buffer replay of a finished turn (no duplicate). AbortSignal threaded for cleanup only. - Streaming dots: store streamingIds → hostContext (SlotRenderer) so the list shows live per-row activity; persisted per-tab in sessionStorage and re-probed on reload (dots survive refresh, self-heal when finished). - Reload restore: last-viewed conversation reopened from sessionStorage. - Live title: consume conversation.title SSE → slice title; ChatPanel header prefers the generated title. MessageList lands the view at the bottom. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Per-row streaming indicator: read host-pushed streamingConversationIds (useHostContext) and render a pulsing dot on conversations with an in-flight turn. - No more list flicker on updates: data.changed refreshes run in the background (no skeleton swap) and only for conversation changes — other apps' data.changed are ignored. Skeleton stays for initial load + view switches. - New conversations appear immediately: ConversationIndex.flushPending() picks up a just-created file on the read path instead of racing the fs.watch debounce. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The per-conversation viewer replaced the old streaming path, leaving the client-side pieces unreferenced: - client.ts: streamChat / streamChatMultipart / consumeSSEStream (callers now use startChatTurn + connectConversationStream). - conversation-sse.ts and useConversationEvents.ts (the old cross-tab SSE hook) — zero importers. The server `/v1/chat/stream` + handleChatStream stay (still a valid REST surface, exercised by integration tests). Also comment the one ported `as unknown as` cast for the done-payload `files` field. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…prompt change
The capturing-model test helper skipped auto-title model calls by matching
the old prompt string ("Generate a 3-6 word title"). The NimbleBrainInc#253 rework changed
the title system prompt, so the guard stopped recognizing title calls and they
polluted the captured chat system prompt. Match the new prompt instead.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Detached turns emit events after the originating request ends; if the workdir is gone (test teardown) or a write otherwise fails (disk full, perms), the unguarded appendFileSync threw out of emit() as an unhandled error. Wrap the write best-effort — a failed log line must never crash the caller. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
RunBus.cancel() flips the run to terminal synchronously, after which publish() is a no-op. The engine's post-abort `cancelled` publish (startTurn's catch) therefore never reached the SSE feed (onTurnEvent), so the Stop button aborted generation but left the UI stuck streaming until a reload. done/error work because they publish before ending while the run is still active. Publish `cancelled` in cancelTurn BEFORE RunBus.cancel ends the run. The engine's later cancelled publish becomes a harmless no-op. Adds a regression test on the live onTurnEvent path (the attach/onEnd path the old tests used did not cover this). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
startTurn → store.load throws ConversationCorruptedError for a pre-migration (ownerless) conversation on the resume path, but handleChatStart let it fall through to a raw 500. handleChat and handleChatCancel both already map it to 422 via conversationCorruptedResponse — match them. Adds an HTTP test asserting /v1/chat/start on an ownerless conversation returns 422. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two concurrent starts with the same not-yet-existing conversationId could both
load()->null then both create({id}) — and create is a truncating writeFile with
no exists-guard, so the loser would clobber the winner's in-progress file.
runBus.begin serializes the run, but it ran AFTER create, too late to help.
For a provided id, reserve the run (begin) BEFORE touching storage: the loser
now throws RunInProgressError before it can create. On any load/create failure
we evict to release the reservation so a failed start can't leave the id stuck
"running". Adds a regression test (delayed load forces the window; asserts
create runs exactly once).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
connectConversationStream reconnects with backoff up to 30s; RunBus GCs a
terminal turn 30s after it ends. If a viewer is disconnected past that window
while the turn finishes, the reconnect's subscribed{isActive:false} carries no
replayed terminal frame — and the client only reconciled on the first (resume)
subscribed, so the slice stayed isStreaming=true until a manual reload.
Reconcile on any subscribed frame: if the server reports no active turn while
the slice still thinks it's streaming, clear the streaming state (the server is
authoritative). A terminal frame still within grace replays right after and
finalizes the content; if GC'd, the partial is kept and a reload fetches the
final — either way the spinner no longer hangs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…icker) loadConversation reads disk then subscribes. A short turn ending in that window left a partial snapshot, and onSubscribed hit the drop branch — discarding the grace-buffer replay that held the full `done`, so the final response only showed after a reload. Mark the gap precisely instead of always recombining (which would flicker every recently-finished conversation on open): the reader flags an assistant turn with no run.done/run.error as `pending`. On resume the client now branches on it: - live turn, or pending tail + retained run (activeSeq>0) → trim + apply replay (rebuilds the full turn, no duplicate); - pending tail + run GC'd → refetch the now-complete transcript; - complete tail → ignore the grace replay (no dup, no flicker — the common open-a-recent-conversation path is untouched). A trailing user message (no assistant yet) also counts as pending. probe slices have no messages, so they're unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
conversation-stream viewers track the RunBus seq (id: line) for replay/resume. The legacy /v1/chat and /v1/chat/stream endpoints fan out via broadcastToConversation, which is seq-less — those frames apply live but don't advance lastSeq and have no resume. The web shell is RunBus-only; only an external caller hitting the legacy endpoints while a web tab watches the same conversation would mix the two. Document the boundary on both sides. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The earlier hardening (commit 9ddfd2e) silenced log-sink throws to keep emit() safe, but a real disk-full / perms incident produced zero operator signal. Surface the first write failure of an episode with console.warn, then suppress until a subsequent success re-arms — first failure visible, sustained outage doesn't spam, intermittent recovery re-warns when it breaks again. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A viewer landing on a different pod than the one running the turn sees isActive:false and silently drops resume — RunBus is single-process and isn't yet clustered. Sticky routing on Mcp-Session-Id (prereq NimbleBrainInc#2) mitigates for the active tab; a pod restart or any cross-pod viewer still loses the in-flight turn. Two changes so this isn't a silent regression: - AGENTS.md: add a "Known limitation under replicas > 1" note under the four prereqs, pointing at run-bus.ts and the deferred clustered RunBus. - serve.ts: emit a loud startup warn when sessionStore.type === "redis" (the one strong signal of multi-replica intent) so operators see the gap at boot. Not a hard error — sticky routing is sufficient for many cases, and operators may accept the limit until the clustered RunBus lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…owth A runaway or adversarial event producer — a model stream looping pathologically, a chatty tool spamming progress events — could push log.events past safe memory limits before the grace-window GC fired. Tens of thousands of token deltas held in-process for the grace window on a multi-tenant deployment is real heap pressure. Cap at 500k events per run with discard-as-error semantics: on overflow, abort the turn's signal, append a synthetic terminal `error` event with `error: "buffer_overflow"` so viewers see the cause, end the run as `error`, and warn once. Subsequent publishes drop via the standard `status !== "running"` guard. 500k sits ~10x over the worst legit run we can measure (extended-thinking turn with chatty tool progress: ~30-50k events). Hitting it requires the kind of pathological producer that is exactly what this cap exists to terminate cleanly. Configurable via the RunBus constructor for tests; production leaves it at the default. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`Runtime.start({})` defaulted `workDir` to `~/.nimblebrain`, so any
integration test that forgot the field silently wrote echo-model
conversations, test workspaces, and bundle data straight into the
developer's real workdir. They then showed up in the conversations tab
on `bun run dev` — confusing diagnosis that looked like a title-pipeline
bug but was really 268 of 282 convs being test fixtures with
`workspaceId: "ws_test"`.
`resolveWorkDir` now throws under `NODE_ENV=test` (bun:test sets this
automatically) when `workDir` is missing, with a clear example fix.
Catches future regressions of this category at construction time.
Eight test files were leaking:
- chat-stream-concurrent
- appcontext-wiring (3 Runtime.starts)
- dependency-checking (6 starts; share file-scope testDir)
- conversation-integration
- remote-integration (3 starts)
- runtime (7 starts; one needed its own workDir to avoid skill bleed
from earlier same-file tests writing to the shared global skill dir)
- security/workspace-isolation
- usage-integration
New helper `test/helpers/test-workdir.ts` exposes `makeTestWorkDir(label)`
that returns `{workDir, cleanup}` for tests that want per-test isolation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The runtime fired `emitConversationsChanged()` three times on a first-turn chat: when isNew (sidebar row appears), at .finally (terminal state), and again when the title resolved. Each broadcast triggered the conversations-list iframe to refetch the whole list. The title-resolve refetch is now redundant. The `conversation.title` SSE event already exists (the chat panel header consumes it for live title display); the list iframe just didn't have a channel for it. Add one: the web shell forwards `conversation.title` via postMessage to matching iframes, and the conversations bundle listens with a raw `window.addEventListener` (the synapse SDK doesn't know this envelope, which is fine — its own listener ignores unrecognized methods, no double-handling). The iframe patches the matching row's title in-place, no list refetch. Drop the now-redundant `emitConversationsChanged()` from the title-resolve handler. Broadcast count drops from 3 to 2 on first turn; subsequent turns unchanged at 1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dle src Two unrelated cleanups: - `web/src/api/client.ts`: comments still referenced the deleted `streamChat` / `streamChatMultipart` helpers (uploadResource header pattern note + startChatTurn docblock). Rot-flagged only; no behavior change. - `scripts/check-code-style.ts`: the doc comment says "bundles are out of scope" but the glob walked `src/**/*.ts`, which after `bun run build:bundles` pulls in `src/bundles/*/ui/node_modules/` and surfaces 1625 false-positive inline-type-import violations in vendored `.d.ts` files. Skip `node_modules/` and `src/bundles/` per the documented scope. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tab-rework # Conflicts: # src/api/routes/chat.ts # src/conversation/auto-title.ts # web/src/api/client.ts # web/src/api/conversation-sse.ts # web/src/components/AppWithChat.tsx # web/src/components/ChatPanel.tsx # web/src/context/ChatContext.tsx # web/src/hooks/useChat.ts # web/test/inlineError.test.tsx # web/test/streamingState.test.tsx
…tab-rework
Conflicts resolved in:
- web/test/inlineError.test.tsx
- web/test/streamingState.test.tsx
Adopted upstream's realClient snapshot pattern from web/test/setup.ts
(safer than `await import("../src/api/client")` per the setup.ts comment)
while preserving the branch's server-authoritative mocks
(startChatTurn / startChatTurnMultipart / cancelChatTurn +
connectConversationStream).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Rewrite the conversation tab around a server-authoritative resumable turn model (issue #254). A chat turn now runs to completion on the server regardless of the client's connection; the browser is a viewer that attaches to a per-conversation event log, replays the in-flight turn, and tails live events.
Closes #254. Includes the #253 follow-up (auto-title prompt) — upstream landed an XML-containment + sanitizer approach in #261 that is preserved here.
What this delivers
src/runtime/run-bus.ts) — per-conversation in-memory log of buffered events with monotonic seq, ownership of the turn'sAbortController, and a grace window for late re-attach. 500k per-run event cap with terminalerroron overflow.startTurn—POST /v1/chat/startreturns{conversationId}immediately; the turn keeps running across disconnect / refresh / tab switch. Stop button →POST /v1/conversations/:id/cancel.GET /v1/conversations/:id/eventsreplays from a client-supplied?afterSeq=Nthen tails. Reconnect/backoff handled inconversation-stream.ts.web/src/hooks/chat-store.ts) —useSyncExternalStore-backed module singleton, LRU cap,hasPendingTailfor resume reconcile,probeConversationfor sessionStorage-restored streaming dots.{workDir}/conversations/{id}.jsonl, user-scoped ownership,check:conversation-pathsenforces it.synapse/conversation-titlepostMessage channel (no full list refetch on title resolve).Known limitation: RunBus is single-process
Under
platform.replicas > 1, a viewer landing on a different pod than the one that started the turn seesisActive:falseand the live frames don't fan out cross-pod. Sticky routing onMcp-Session-Id(existing prereq) mitigates the active-tab case; a pod restart or cross-pod viewer still drops resume mid-turn. A clustered Redis-backed RunBus is deferred — tracked insrc/runtime/run-bus.ts.servewarns at boot whensessionStore.type === "redis"so the gap is visible.Other fixes folded into this PR
cancelledframe to live viewers (was: just aborted, viewer stuckisStreaming:true).startTurnserializes create to prevent double-create race when client supplies an id.ConversationCorruptedError→ 422 inhandleChatStart(was: 500).structured-log-sink,workspace-log-sink) wrap writes in try/catch + warn-once-per-episode (was: throw into the event-emit path and crash detached turns).conversation.titleSSE event (withwsIdscoping). List iframe receives via dedicated postMessage; chat panel header updates via existing chat-store hook.isActive/activeSeq>0/ GC'd).isActive:falseonsubscribedframe drives chat-store reset.Tests & hygiene
Runtime.start({})now hard-errors underNODE_ENV=testwhenworkDiris missing. Stops integration tests from silently writing echo-model conversations into the developer's real~/.nimblebrain. Fixed 8 leakers + addedtest/helpers/test-workdir.ts.scripts/check-code-style.tsskipsnode_modules/andsrc/bundles/per its doc comment (was: scanned vendored.d.tsafterbuild:bundles, surfacing 1625 false-positive violations).Test plan
bun run verify— full CI parity, all greensessionStore.type: "redis"startup warning fires when configured🤖 Generated with Claude Code