
fix(omni): use --resume for respawn with JSONL-missing fallback#2490

Closed
rodriguess-caio wants to merge 6 commits into
automagik-dev:mainfrom
rodriguess-caio:fix/omni-spawn-script-path

Contributor

@rodriguess-caio rodriguess-caio commented May 27, 2026

Summary

  • Reverts buildOmniSpawnParams to emit resume (not sessionId) when a prior Claude session id exists, so buildLaunchCommand generates --resume <id> and Claude reattaches to the existing JSONL transcript
  • Extracts sendToPane helper inside launchOmniProcessInPane to avoid duplicating the env-prefix + script-write + send-keys flow
  • Adds a 3-second liveness check after a --resume launch: if Claude exits immediately (JSONL missing — e.g. after cleanup or on a fresh machine), detects the silent failure via isPaneProcessRunning and falls back to a fresh --session-id so the inbound message is not lost
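The resume-then-fallback flow described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `LaunchDeps` interface, `launchWithResumeFallback`, and the `newSessionId` callback are hypothetical names, and whether the fallback reuses the prior id or generates a new one is an assumption left to the caller.

```typescript
// Hypothetical dependency shape; the real helpers live in the PR's executor and tmux modules.
interface LaunchDeps {
  /** Sends a launch command carrying the given session flag to the pane. */
  sendToPane: (flag: "--resume" | "--session-id", id: string) => Promise<void>;
  /** Reports whether the provider process is alive in the pane. */
  isPaneProcessRunning: () => Promise<boolean>;
  /** Supplies the id for a fresh session (assumption: the PR does not show this). */
  newSessionId: () => string;
  /** Settle time before the liveness check; the PR description uses 3 seconds. */
  settleMs?: number;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function launchWithResumeFallback(
  priorId: string | undefined,
  deps: LaunchDeps,
): Promise<{ sessionId: string; resumed: boolean }> {
  // No prior session: start fresh, no liveness check needed for this sketch.
  if (priorId === undefined) {
    const id = deps.newSessionId();
    await deps.sendToPane("--session-id", id);
    return { sessionId: id, resumed: false };
  }

  // Try to reattach to the existing transcript first.
  await deps.sendToPane("--resume", priorId);
  await sleep(deps.settleMs ?? 3000);
  if (await deps.isPaneProcessRunning()) {
    return { sessionId: priorId, resumed: true };
  }

  // The process exited during the settle window (for example, the JSONL is missing):
  // relaunch as a fresh session so the inbound message is not lost.
  const id = deps.newSessionId();
  await deps.sendToPane("--session-id", id);
  return { sessionId: id, resumed: false };
}
```

Injecting the pane helpers as callbacks keeps the sketch testable without tmux; a healthy resume sends exactly one launch command, which matches the "no double-spawn" item in the test plan.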

Test plan

  • bun test src/services/executors/claude-code.test.ts — 45 tests pass
  • Respawn an existing chat: confirm Claude reattaches to prior conversation
  • Delete the JSONL manually and respawn: confirm fallback to fresh session, message not lost
  • Verify no double-spawn on healthy resume (process running after 3s settle)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Improved process spawning reliability and robustness when handling complex commands with special characters.
  • Bug Fixes

    • Enhanced session resume behavior with automatic fallback to fresh session creation when resume encounters issues.
  • Tests

    • Added smoke tests verifying spawn command stability.
    • Added unit tests for launch script generation.
  • Chores

    • Development release version updated to 4.260525.2.


release-bot and others added 6 commits May 26, 2026 18:34
Claude Code fails when asked to --resume a session whose JSONL file is
missing (e.g. after a cleanup or on a fresh machine). By rewriting the
flag to --session-id we keep the same identifier but force a fresh
session, which always succeeds.
buildOmniSpawnParams now always emits sessionId (never resume), so
buildLaunchCommand produces --session-id <id>. Unlike --resume, this
flag attaches to an existing JSONL transcript when present but gracefully
starts a fresh session with the same id when the transcript is missing
(e.g. after cleanup or on a fresh machine) — preventing hard failures on
respawn. Fix is applied at the source (where the command is built), not
in the transport layer (tmux-launch-script), which is now provider-agnostic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When respawning a per-chat agent with a prior Claude session id, emit
--resume (not --session-id) so Claude reattaches to the existing JSONL.
Add a 3s liveness check after launch: if --resume silently fails (JSONL
missing), fall back to a fresh --session-id so the inbound message is
not lost.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
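The difference between the two commits above comes down to which field the spawn parameters carry and which CLI flag that field maps to. A simplified sketch, assuming a parameter object with optional `resume` and `sessionId` fields; `spawnParamsFor` and `sessionFlags` are hypothetical stand-ins, not the PR's `buildOmniSpawnParams` or `buildLaunchCommand`:

```typescript
// Assumed shape of the spawn parameters; the real type is not shown in the PR page.
interface OmniSpawnParams {
  resume?: string;
  sessionId?: string;
}

// Emit `resume` when a prior Claude session id exists, otherwise a fresh `sessionId`.
function spawnParamsFor(priorSessionId: string | undefined, freshId: string): OmniSpawnParams {
  return priorSessionId ? { resume: priorSessionId } : { sessionId: freshId };
}

// Map the parameters to the CLI flags named in the commit messages.
function sessionFlags(params: OmniSpawnParams): string[] {
  if (params.resume) return ["--resume", params.resume];
  if (params.sessionId) return ["--session-id", params.sessionId];
  return [];
}
```

With this split, the decision between resuming and starting fresh is made where the command is built, and the transport layer only sees the final argument list.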
@coderabbitai

coderabbitai Bot commented May 27, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6fe1798b-4e0e-4dbb-aad8-bccaa4086b04

📥 Commits

Reviewing files that changed from the base of the PR and between d315037 and dc6177b.

📒 Files selected for processing (8)
  • .well-known/dev.json
  • scripts/tests/omni-spawn-smoke.ts
  • src/lib/__tests__/tmux-launch-script.test.ts
  • src/lib/tmux-launch-script.ts
  • src/lib/tmux.ts
  • src/services/executors/claude-code.test.ts
  • src/services/executors/claude-code.ts
  • src/term-commands/agents.ts

Walkthrough

This PR refactors tmux command spawning to generate and source temporary shell scripts instead of injecting commands directly via tmux send-keys. This solves command escaping failures on complex payloads (backticks, emoji, parentheses, nested quotes). The solution includes a centralized utility, executor and agents integration with resume verification logic, and comprehensive verification tests.

Changes

Script-based tmux spawning and executor integration

| Layer | File(s) | Summary |
|---|---|---|
| Core utility module and unit tests | src/lib/tmux-launch-script.ts, src/lib/__tests__/tmux-launch-script.test.ts | writeTmuxLaunchScript creates ~/.genie/spawn-scripts, sanitizes workerId filenames, writes executable shell scripts with restrictive permissions, and preserves complex command strings. Unit tests verify script generation, permissions, directory structure, and preservation of backticks and nested quotes. |
| Executor integration with resume verification | src/services/executors/claude-code.ts, src/services/executors/claude-code.test.ts, src/lib/tmux.ts | launchOmniProcessInPane now generates and sources tmux launch scripts instead of injecting raw commands. After resume-flagged spawn, it verifies the provider process is running; on silent failure (missing JSONL), logs warning and falls back to fresh session spawn. isPaneProcessRunning enhanced to include ps command in descendant probing. Test docs clarified for resume and sessionId behavior. |
| Agents module integration | src/term-commands/agents.ts | Imports writeTmuxLaunchScript from the new shared module instead of defining it locally. Local 22-line implementation removed; spawn logic unchanged. |
| Smoke test: old vs. new spawn behavior | scripts/tests/omni-spawn-smoke.ts | Bun test comparing inline tmux send-keys (fails on complex commands) against script sourcing (handles complex payloads). Constructs deliberately problematic payload with backticks, emoji, parentheses, nested quotes, runs both paths, captures pane output and filesystem markers, exits with status codes indicating success/failure combinations. |
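To illustrate why writing the command to a file sidesteps the escaping problem, here is a self-contained sketch modeled on the writeTmuxLaunchScript excerpt quoted later in the review. The function name `writeLaunchScriptSketch`, the explicit `dir` parameter, and the use of a temporary directory in place of ~/.genie/spawn-scripts are assumptions made so the example can run anywhere.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Writes `fullCommand` verbatim into an owner-only executable script and returns its path.
function writeLaunchScriptSketch(dir: string, workerId: string, fullCommand: string): string {
  mkdirSync(dir, { recursive: true });
  // Replace anything outside a conservative filename alphabet with '-'.
  const safeId = workerId.replace(/[^a-zA-Z0-9._-]/g, "-");
  const scriptPath = join(dir, `${safeId}-${Date.now().toString(36)}.sh`);
  writeFileSync(scriptPath, `#!/bin/sh\n${fullCommand}\n`, { mode: 0o700 });
  return scriptPath;
}

// A payload with backticks, parentheses, emoji, and nested quotes.
const demoDir = mkdtempSync(join(tmpdir(), "spawn-scripts-"));
const payload = "echo \"it's `date` (ok) 🚀\"";
const demoPath = writeLaunchScriptSketch(demoDir, "omni/chat:42", payload);
const written = readFileSync(demoPath, "utf8");
```

Because the payload never passes through tmux's key parser or an extra layer of shell quoting, the file content is byte-for-byte the command that was built; only the short script path has to travel through send-keys.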

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.



@rodriguess-caio
Contributor Author

Closing — superseded by #2489.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request refactors the tmux launch script logic by extracting writeTmuxLaunchScript into a shared module (src/lib/tmux-launch-script.ts) to prevent command corruption during complex tmux spawns. It also introduces unit and smoke tests, updates ClaudeCodeOmniExecutor to use this script-based path with a fallback mechanism for failed session resumes, and refines process tracking in isPaneProcessRunning. The review feedback points out two important issues: first, isPaneProcessRunning should also check the panePid itself to support directly-executed processes; second, the exec prefix should be restored in the shared launch script to prevent wrapper shell processes from lingering in the background.

Comment thread src/lib/tmux.ts
// Check direct children and grandchildren for the target process name
const output = exec(
- `pgrep -la -P ${panePid} 2>/dev/null; for cpid in $(pgrep -P ${panePid} 2>/dev/null); do pgrep -la -P "$cpid" 2>/dev/null; done; true`,
+ `pgrep -la -P ${panePid} 2>/dev/null; for cpid in $(pgrep -P ${panePid} 2>/dev/null); do pgrep -la -P "$cpid" 2>/dev/null; ps -p "$cpid" -o comm= 2>/dev/null; done; true`,

high

The isPaneProcessRunning function currently only checks the direct children and grandchildren of panePid to see if the target process is running. It does not check the panePid process itself.

If the target process is executed directly (e.g., using exec in a launch script or if the pane was created directly with the command), panePid itself will be the target process. In this case, isPaneProcessRunning will return false because it only queries descendants.

To make this check robust for both shell-wrapped and directly-executed processes, we should also check the command name of panePid itself using ps -p ${panePid} -o comm=.

Suggested change
- `pgrep -la -P ${panePid} 2>/dev/null; for cpid in $(pgrep -P ${panePid} 2>/dev/null); do pgrep -la -P "$cpid" 2>/dev/null; ps -p "$cpid" -o comm= 2>/dev/null; done; true`,
+ `ps -p ${panePid} -o comm= 2>/dev/null; pgrep -la -P ${panePid} 2>/dev/null; for cpid in $(pgrep -P ${panePid} 2>/dev/null); do pgrep -la -P "$cpid" 2>/dev/null; ps -p "$cpid" -o comm= 2>/dev/null; done; true`,
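The probe in the suggested change mixes two output shapes: `ps -o comm=` prints a bare command name, while `pgrep -la` on Linux prints a PID followed by the command line. How the real isPaneProcessRunning matches the target name is not shown in this thread, so the following is only a sketch of one way to scan such combined output; `outputNamesProcess` is a hypothetical helper.

```typescript
// Returns true if any line of the combined ps/pgrep output names `processName`
// as its executable. Accepts both "<pid> <command line>" and bare "<name>" lines.
function outputNamesProcess(probeOutput: string, processName: string): boolean {
  return probeOutput
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .some((line) => {
      // Drop a leading PID column if present (pgrep -la style).
      const withoutPid = line.replace(/^\d+\s+/, "");
      // First word is the executable; compare by basename so full paths match too.
      const executable = withoutPid.split(/\s+/)[0] ?? "";
      const base = executable.split("/").pop() ?? "";
      return base === processName;
    });
}
```

With the pane PID itself included in the probe, a pane whose root process is the agent (for example, after an `exec`) produces a bare name line, which this matcher accepts alongside descendant lines.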

mkdirSync(dir, { recursive: true });
const safeId = workerId.replace(/[^a-zA-Z0-9._-]/g, '-');
const scriptPath = join(dir, `${safeId}-${Date.now().toString(36)}.sh`);
writeFileSync(scriptPath, `#!/bin/sh\n${fullCommand}\n`, { mode: 0o700 });

medium

The extracted writeTmuxLaunchScript function is missing the exec prefix before ${fullCommand} that was present in the original implementation in src/term-commands/agents.ts.

Without exec, the shell wrapper process (/bin/sh) remains alive in the background waiting for the child process to exit. This wastes system resources (an extra process per spawned agent) and can interfere with signal propagation and PID tracking (since #{pane_pid} might refer to the wrapper shell instead of the actual agent process).

Please restore the exec prefix.

Suggested change
- writeFileSync(scriptPath, `#!/bin/sh\n${fullCommand}\n`, { mode: 0o700 });
+ writeFileSync(scriptPath, `#!/bin/sh\nexec ${fullCommand}\n`, { mode: 0o700 });
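One caveat when restoring `exec`: the shell treats the first word after `exec` as the command name, so a command that begins with `KEY=value` assignments (as the env-prefixed command in the executor appears to) would fail with "not found" if `exec` were simply prepended. A hedged sketch of one workaround, routing assignments through `env`; `buildLaunchScript`, `shQuote`, and the `envPairs` parameter are hypothetical and not part of the PR:

```typescript
// Wraps a value in single quotes for POSIX sh, escaping embedded single quotes.
function shQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Builds script content that replaces the wrapper shell with the target command.
// Assumption: `command` is a single simple command, not a pipeline or `a && b` list,
// since `exec` can only replace the shell with one program.
function buildLaunchScript(command: string, envPairs: Record<string, string> = {}): string {
  const assignments = Object.entries(envPairs).map(([key, value]) => `${key}=${shQuote(value)}`);
  const execLine =
    assignments.length > 0
      ? `exec env ${assignments.join(" ")} ${command}` // env applies the assignments, then runs the command
      : `exec ${command}`;
  return `#!/bin/sh\n${execLine}\n`;
}
```

Note also that if the script is sourced rather than executed, `exec` replaces the pane's interactive shell itself, so the pane would close when the agent exits. Whether that is desirable depends on how the surrounding code manages pane lifetimes.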


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dc6177bfde


.join(' ');
const cmd = envPrefix ? `${envPrefix} ${launch.command}` : launch.command;
const scriptPath = writeTmuxLaunchScript(`omni-${chatId}`, cmd);
await executeTmux(`send-keys -t '${paneId}' "source ${scriptPath}" Enter`);

P1: Avoid sourcing the launch script with a bash-only builtin

In tmux panes whose default shell is a POSIX sh such as dash, this sends source ... to the pane, but source is not available there, so omni spawns never start. Non-resume spawns return a new session id without any liveness check, and the resume fallback repeats the same failing command. ensureTeamWindow creates a plain tmux window without forcing bash, so the outcome depends on the user's tmux default shell. Execute the script path directly or use a POSIX-compatible invocation with proper quoting.
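One POSIX-compatible option is to run the script with `sh` and a single-quoted path instead of `source`. The sketch below only builds the tmux argument lists and does not call tmux; `quoteForSh`, `paneInputFor`, and `sendKeysArgv` are hypothetical helpers, and passing an argv array (rather than one command string, as executeTmux appears to take) is an assumption chosen to avoid a second quoting layer.

```typescript
// Single-quote a value for POSIX sh, escaping embedded single quotes.
function quoteForSh(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Text typed into the pane: works in dash, bash, zsh, and other sh-compatible shells.
function paneInputFor(scriptPath: string): string {
  return `sh ${quoteForSh(scriptPath)}`;
}

// Two tmux invocations: literal text first (-l disables key-name lookup), then Enter.
function sendKeysArgv(paneId: string, scriptPath: string): string[][] {
  return [
    ["send-keys", "-t", paneId, "-l", paneInputFor(scriptPath)],
    ["send-keys", "-t", paneId, "Enter"],
  ];
}
```

Running the script in a child `sh` also means an `exec` inside it replaces only that child, leaving the pane's own shell in place. The POSIX dot command (`. path`) would be the closer equivalent of `source` if the script must run in the pane's shell.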
