Bug: set -e kills script on non-zero return from execute_claude_code — loop never retries #200

@chickenbeast

Description

When execute_claude_code() returns a non-zero exit code (for example, 1 on timeout, or any other failure code), the entire Ralph script terminates silently instead of retrying. The retry logic in the main loop is never reached.

This affects every failure mode, including timeouts and execution errors. The loop should wait 30 seconds and retry, but Ralph exits instead.

Root Cause

ralph_loop.sh starts with set -e (line 6), which makes bash exit as soon as a command returns non-zero outside a conditional context (such as an if or while test, a && or || list, or a ! negation).

In main(), execute_claude_code is called bare — not in a conditional context:

# Line ~1702
execute_claude_code "$loop_count"
local exec_result=$?

With set -e active, when execute_claude_code returns 1 (timeout) or any non-zero code, bash terminates the script immediately. The local exec_result=$? on the next line is never reached, and the retry logic in the if/elif/else block below it never fires.
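The failure can be reproduced in isolation. The sketch below is not taken from ralph_loop.sh: fake_execute is a hypothetical stand-in for execute_claude_code that always returns 1, and the inner script is run through bash -c so the outer script can inspect how it died.

```shell
#!/usr/bin/env bash
# Minimal reproduction sketch. fake_execute is a hypothetical stand-in
# for execute_claude_code; nothing here is copied from ralph_loop.sh.

status=0
output=$(bash -c '
set -e
fake_execute() { return 1; }   # simulate a timeout

main() {
    fake_execute               # bare call: errexit fires here
    local exec_result=$?       # never reached
    echo "exec_result=$exec_result"
}

main
echo "retry logic reached"     # never reached either
' 2>&1) || status=$?

echo "inner script exit status: $status"
echo "inner script output: '${output}'"
```

The inner script exits with status 1 and prints nothing, which matches the abrupt end of the log described in this issue.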

Expected Behavior

On timeout or execution failure, Ralph should:

  1. Log "Execution failed, waiting 30 seconds before retry..."
  2. Sleep 30 seconds
  3. Continue to the next loop iteration

This is exactly what the existing code in the else branch (line ~1755) is designed to do, but that branch is unreachable.
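The three steps above could be sketched roughly as follows. This is an illustration under assumptions, not Ralph's actual loop: retry_loop, fake_execute, the attempts counter, and the max_loops cap are all hypothetical names, and the demo passes a delay of 0 so it finishes immediately (Ralph's delay is 30 seconds).

```shell
#!/usr/bin/env bash
set -e

# Hypothetical stand-in for execute_claude_code: fails twice, then succeeds.
attempts=0
fake_execute() {
    attempts=$((attempts + 1))
    if [ "$attempts" -lt 3 ]; then
        return 1
    fi
    return 0
}

# retry_loop DELAY MAX_LOOPS (hypothetical helper, capped so the sketch terminates)
retry_loop() {
    local delay="$1" max_loops="$2"
    local loop_count=0
    while [ "$loop_count" -lt "$max_loops" ]; do
        loop_count=$((loop_count + 1))
        local exec_result=0
        # The || keeps set -e from killing the script on failure.
        fake_execute "$loop_count" || exec_result=$?
        if [ "$exec_result" -eq 0 ]; then
            echo "Loop #$loop_count succeeded"
            return 0
        fi
        echo "Execution failed, waiting $delay seconds before retry..."
        sleep "$delay"
    done
    return 1
}

retry_loop 0 5   # the issue describes a 30-second delay; 0 keeps the demo fast
```

The important detail is that the call sits on the left side of ||, so a failure reaches the retry branch instead of ending the script.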

Actual Behavior

Ralph exits silently after the first non-zero return from execute_claude_code. The log ends abruptly with no retry message and no "Loop #2" entry.

Example log output:

[WARN] Claude Code execution timed out after N minutes
[WARN] ⏱️ Claude Code execution timed out (not an API limit)
<EOF — script died>

Fix

Use the standard bash idiom for safely capturing exit codes under set -e:

         # Execute Claude Code
-        execute_claude_code "$loop_count"
-        local exec_result=$?
+        # Use || to capture exit code safely under set -e
+        local exec_result=0
+        execute_claude_code "$loop_count" || exec_result=$?

The || puts the function call in a conditional context, preventing set -e from firing. exec_result captures the actual exit code (0, 1, 2, or 3) so the existing if/elif/else dispatch works correctly.
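To check that the idiom preserves each distinct exit code, here is a small standalone sketch. fake_execute is a hypothetical stand-in that simply returns whatever code it is given; the codes 0 through 3 are the ones named above.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical stand-in for execute_claude_code: returns the code it is given.
fake_execute() { return "$1"; }

captured=""
for code in 0 1 2 3; do
    exec_result=0                               # stays 0 when the call succeeds
    fake_execute "$code" || exec_result=$?      # $? is the call's exit code
    captured="$captured$exec_result "
done

echo "captured codes: $captured"
echo "script still running"
```

Initializing exec_result to 0 first matters, because the right side of || only runs on failure. One caveat from bash's documented behavior: while a function runs on the left side of ||, set -e is also ignored inside that function's body, so failures within execute_claude_code will not abort it early unless the function checks them itself.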

Affected Versions

Tested on v0.11.5 (commit 622acb5). Likely affects all versions since set -e was added.

Related

Possibly related to #194 — some users reporting "ralph ends early" after one loop may be hitting this bug rather than the stale exit signals issue.

Metadata

Labels

bug (Something isn't working)