fix(omni): use --resume for respawn with JSONL-missing fallback#2489
fix(omni): use --resume for respawn with JSONL-missing fallback#2489rodriguess-caio wants to merge 12 commits into
Conversation
Claude Code fails when asked to --resume a session whose JSONL file is missing (e.g. after a cleanup or on a fresh machine). By rewriting the flag to --session-id we keep the same identifier but force a fresh session, which always succeeds.
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request refactors how complex tmux commands are spawned by moving the logic to write a temporary launch script (writeTmuxLaunchScript) into a shared utility file (src/lib/tmux-launch-script.ts). It also updates the Claude Code executor to use this script-based launch mechanism to prevent command corruption, and adds unit and smoke tests to validate the fix. The review feedback highlights critical issues where unquoted script paths containing spaces (such as in user home directories) can cause the source command to fail in tmux, and suggests a more robust regular expression for replacing the --resume flag with --session-id.
| const cmd = envPrefix ? `${envPrefix} ${launch.command}` : launch.command; | ||
| await executeTmux(`send-keys -t '${paneId}' ${shellQuote(cmd)} Enter`); | ||
| const scriptPath = writeTmuxLaunchScript(`omni-${chatId}`, cmd); | ||
| await executeTmux(`send-keys -t '${paneId}' "source ${scriptPath}" Enter`); |
There was a problem hiding this comment.
If the user's home directory path contains spaces (e.g., /Users/John Doe), the unquoted scriptPath will cause the source command to fail in the tmux pane shell, preventing Claude Code from launching. Wrapping scriptPath in shellQuote ensures the path is safely escaped.
| await executeTmux(`send-keys -t '${paneId}' "source ${scriptPath}" Enter`); | |
| await executeTmux("send-keys -t '" + paneId + "' \"source " + shellQuote(scriptPath) + "\" Enter"); |
|
|
||
| // Force --session-id instead of --resume so Claude Code creates a fresh | ||
| // session rather than failing when the resumed session JSONL is missing. | ||
| const safeCommand = fullCommand.replace(/--resume\s+'([^']+)'/, "--session-id '$1'"); |
There was a problem hiding this comment.
The current regular expression /--resume\s+'([^']+)'/ assumes that the --resume flag is always wrapped in single quotes. If the command quoting style changes or if it is invoked with double quotes or no quotes (e.g., in manual/test runs), the replacement will silently fail, and the dead pane issue will persist. Using a more robust regex with a replacer function handles all quoting styles (single, double, or none) safely.
| const safeCommand = fullCommand.replace(/--resume\s+'([^']+)'/, "--session-id '$1'"); | |
| const safeCommand = fullCommand.replace(/--resume\\s+(?:'([^']+)'|\"([^\"]+)\"|(\\S+))/, (_, g1, g2, g3) => "--session-id '" + shellQuote(g1 || g2 || g3) + "'"); |
| const scriptPath = writeTmuxLaunchScript('smoke-2486', nastyCommand); | ||
| // Replicate the NEW Genie path: | ||
| // executeTmux(`send-keys -t '${paneId}' "source ${scriptPath}" Enter`) | ||
| execSync(`tmux -L ${socket2} send-keys -t '${pane2}' "source ${scriptPath}" Enter`); |
There was a problem hiding this comment.
If the test is run on a machine where the home directory path contains spaces, the unquoted scriptPath will cause the source command to fail in the tmux pane. Wrapping it in single quotes ensures the path is handled correctly.
| execSync(`tmux -L ${socket2} send-keys -t '${pane2}' "source ${scriptPath}" Enter`); | |
| execSync("tmux -L " + shellQuote(socket2) + " send-keys -t '" + shellQuote(pane2) + "' \"source '" + shellQuote(scriptPath) + "'\" Enter"); |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 771e1c3c4f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
|
||
| // Force --session-id instead of --resume so Claude Code creates a fresh | ||
| // session rather than failing when the resumed session JSONL is missing. | ||
| const safeCommand = fullCommand.replace(/--resume\s+'([^']+)'/, "--session-id '$1'"); |
There was a problem hiding this comment.
Preserve valid resume launches instead of forcing fresh sessions
When an Omni chat already has existingExecutor.claudeSessionId, buildOmniSpawnParams intentionally emits --resume so the respawned pane reattaches to the existing transcript, but this unconditional replacement changes every such launch into --session-id. In the normal crash/restart path where the JSONL still exists, the pane no longer resumes prior context even though the executor row keeps returning the old session id for future spawns, so permanent Omni chats lose conversation history rather than reattaching.
Useful? React with 👍 / 👎.
…cripts" This reverts commit 771e1c3.
buildOmniSpawnParams now always emits sessionId (never resume), so buildLaunchCommand produces --session-id <id>. Unlike --resume, this flag attaches to an existing JSONL transcript when present but gracefully starts a fresh session with the same id when the transcript is missing (e.g. after cleanup or on a fresh machine) — preventing hard failures on respawn. Fix is applied at the source (where the command is built), not in the transport layer (tmux-launch-script), which is now provider-agnostic. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When respawning a per-chat agent with a prior Claude session id, emit --resume (not --session-id) so Claude reattaches to the existing JSONL. Add a 3s liveness check after launch: if --resume silently fails (JSONL missing), fall back to a fresh --session-id so the inbound message is not lost. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Problem
When the Omni bridge respawns a per-chat agent and a prior Claude session id exists, the bridge was emitting
--session-idinstead of--resume, so Claude never reattached to the existing JSONL transcript. Operator invariant broken: omni-bridged chats should be permanent until explicitly closed.Additionally,
--resumesilently fails when the JSONL is missing (cleanup, fresh machine) — Claude exits immediately and the pane returns to the shell with no error output.Solution
buildOmniSpawnParamsto emitresume(notsessionId) when a prior Claude session id exists, sobuildLaunchCommandgenerates--resume <id>--resumelaunch: if Claude exits immediately (JSONL missing), detect viaisPaneProcessRunningand fall back to a fresh--session-idso the inbound message is not lostsendToPanehelper to avoid duplicating the env-prefix + script-write + send-keys flowTest plan
bun test src/services/executors/claude-code.test.ts— 45 tests passCloses #2486
🤖 Generated with Claude Code