refactor(loop): remove set -e in favor of explicit error handling #208

Open
timothy-20 wants to merge 1 commit into frankbria:main from timothy-20:refactor/remove-set-e

@timothy-20 timothy-20 commented Feb 28, 2026

Summary

  • Remove set -e from ralph_loop.sh and replace with explicit error handling
  • Per BashFAQ/105: "don't use set -e. Add your own error checking instead." set -e is designed for linear configure && make && install scripts, not complex long-running loops with functions, conditionals, and pipelines

Changes

ralph_loop.sh:
  • Remove set -e (L6)
  • Add || { echo "FATAL: ..."; exit 1; } guards to 5 source statements
  • Remove the set +e/set -e/set -o pipefail/set +o pipefail toggle block in the live mode pipeline
  • Separate stderr to a dedicated file to prevent jq pipeline corruption (Issue #190)
  • Improve cleanup() with trap_exit_code capture and a reentrancy guard
  • Add conditional analysis failure handling with stale file cleanup
  • Change 3 grep -c || true → grep -c || echo "0"
  • Remove || true from reset_session and the integrity check

tests/unit/test_cli_modern.bats:
  • Replace 3 errexit structural tests with 1 no-toggle verification
  • Add 7 new tests (no set -e, source guards, cleanup trap, analysis failure, stderr separation)

CLAUDE.md:
  • Update test counts (568→573)
  • Update the test_cli_modern.bats description

Why

set -e in ralph_loop.sh has accumulated significant defensive overhead:

  • || true guards: Added to prevent non-zero returns from killing the script (e.g., log_session_transition || true)
  • set +e/set -e toggle: Required around the live-mode pipeline because portable_timeout returns exit code 124 on timeout, which set -e interprets as a fatal error (Issue #175: "set -e + set -o pipefail causes silent script death on Claude timeout")
  • Structural tests: Tests existed solely to verify set -e workarounds were in place
  • ((expr)) risk: Arithmetic expressions return exit code 1 when the value is 0, a subtle set -e trap
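
The arithmetic trap in the last bullet is easy to reproduce. The sketch below is illustrative only: `count` is a made-up variable and nothing here is taken from ralph_loop.sh. It runs the same snippet in two child bash processes, once with set -e and once without, and records what each one prints and returns.

```shell
#!/usr/bin/env bash
# Illustrative only: 'count' is a hypothetical variable, not one from ralph_loop.sh.

# With set -e: ((count)) evaluates to 0, so it returns status 1 and the child
# shell exits before it reaches the echo.
with_errexit=$(bash -c 'set -e; count=0; ((count)); echo "reached"' 2>/dev/null)
with_errexit_status=$?

# Without set -e: the non-zero arithmetic status is ignored and execution continues.
without_errexit=$(bash -c 'count=0; ((count)); echo "reached"')
without_errexit_status=$?

echo "set -e:    output='${with_errexit}' status=${with_errexit_status}"
echo "no set -e: output='${without_errexit}' status=${without_errexit_status}"
```

The same silent exit applies to patterns such as ((count++)) when count starts at 0, which is why scripts under set -e tend to collect || true guards.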

Removing set -e eliminates all of this while maintaining safety through:

  1. Source guards: The only place where "fail fast" truly matters — if a library can't be loaded, the script cannot function
  2. Existing error handling: exit_code capture, circuit breaker thresholds, and explicit conditionals already handle all runtime errors
  3. New explicit handling: analyze_response failure now skips signal updates and removes stale files; cleanup() distinguishes normal vs abnormal exits
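
A minimal sketch of the source-guard idea from item 1, assuming a guard of the form shown in the Changes section. The file names, the `greet` function, and the exact FATAL wording are hypothetical; the real script's library names and messages may differ.

```shell
#!/usr/bin/env bash
# Sketch of the source-guard pattern; file names and the FATAL wording are hypothetical.
tmpdir=$(mktemp -d)
printf 'greet() { echo "hello from lib"; }\n' > "$tmpdir/good_lib.sh"

cat > "$tmpdir/main.sh" <<'EOF'
#!/usr/bin/env bash
# No set -e here, so a failed source has to be checked explicitly.
lib_dir=$1
lib_name=$2
source "$lib_dir/$lib_name" || { echo "FATAL: failed to load $lib_name" >&2; exit 1; }
greet
EOF

# Library present: the guard does nothing and the script continues.
ok_output=$(bash "$tmpdir/main.sh" "$tmpdir" good_lib.sh 2>&1)
ok_status=$?

# Library missing: the guard reports the failure and stops with status 1.
missing_output=$(bash "$tmpdir/main.sh" "$tmpdir" missing_lib.sh 2>&1)
missing_status=$?

echo "present: status=$ok_status output=$ok_output"
echo "missing: status=$missing_status output=$missing_output"
rm -rf "$tmpdir"
```

Without the guard (and without set -e), the second run would carry on past the failed source and fail later at the first call into the missing library, which is harder to diagnose.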

Intentionally kept || true (3 instances)

These serve functional purposes unrelated to set -e:

  • kill "$_CLAUDE_PID" 2>/dev/null || true — process may already be terminated
  • wait "$_CLAUDE_PID" 2>/dev/null || true — same
  • read -t 30 -n 1 user_choice || true — timeout is expected behavior

Test plan


🤖 Generated with Claude Code

coderabbitai bot commented Feb 28, 2026

Walkthrough

This PR enhances error handling and logging robustness in ralph_loop.sh by adding explicit error guards for source statements, dedicated stderr capture for Claude CLI invocations, conditional exit signal updates after successful analysis, and fallback defaults for brittle operations. Documentation and test coverage are updated accordingly to reflect the new behavior.

Changes

Cohort / File(s): Summary

  • Documentation Updates (CLAUDE.md): Updates test count from 568 to 573 tests to reflect new test additions in test_cli_modern.bats.
  • Error Handling & Stderr Capture (ralph_loop.sh): Adds explicit error handling for sourcing library scripts, introduces dedicated stderr capture for Claude CLI runs, stores stderr separately to prevent pipeline contamination, and implements cleanup logic in trap flow with reentrancy guards.
  • Exit Signal & Analysis Handling (ralph_loop.sh): Conditionally updates exit signals only after successful response analysis; logs warnings and skips updates on analysis failure. Replaces brittle fallbacks with explicit zero defaults for fix_plan.md item counts, ensuring deterministic totals computation.
  • Test Enhancements for New Behaviors (tests/unit/test_cli_modern.bats): Updates live-mode tests to verify stderr redirection pattern (2>"$stderr_file"), removes errexit toggle assertions, and adds new tests for: explicit error guards in sourced libs, set -e removal semantics, cleanup behavior on exit, analysis failure handling, and separate stderr file logging without stdout merging.
  • Command Building & CLI Guards (tests/unit/test_cli_modern.bats): Adds comprehensive tests for modern CLI behavior including: stdin redirection from /dev/null, prompt-file to -p flag transition, multiline prompt handling, shell-injection safety, CLI command guards, and format override precedence. Validates build_claude_command uses -p flag and validate_claude_command is invoked before main loop.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 With whiskers twitching, I trace each care,
Error guards placed with rabbit's flair,
Stderr captured, cleanly fed,
Exit signals properly spread,
Robust loops, now standing fair! 🌟

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check: ✅ Passed. The title 'refactor(loop): remove set -e in favor of explicit error handling' directly and clearly summarizes the main objective of the PR: removing set -e and replacing it with explicit error handling.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 84.21%, which is sufficient. The required threshold is 80.00%.
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ralph_loop.sh`:
- Around line 1603-1613: The cleanup function and EXIT trap currently call update_status("interrupted","stopped") unconditionally, which overwrites legitimate completion statuses. To fix this:
  • Introduce a boolean flag (e.g., EXITED_BY_SIGNAL or interrupted_by_signal), initialized before the main loop next to the loop_count initialization.
  • Change the signal handlers so that SIGINT/SIGTERM set that flag and then call cleanup; keep the EXIT trap for normal exits.
  • Modify cleanup (the function that calls reset_session and update_status) to call update_status("interrupted","stopped") only when the flag is true, while still performing reset_session when appropriate.
  • Ensure normal loop breaks continue to set the completion status and exit without setting the flag.
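
One possible arrangement of that suggestion, as a sketch rather than the script's actual code: `update_status` is stubbed, the names `interrupted_by_signal` and `cleanup_done` are hypothetical, and the signal handler here sets the flag and exits so that the EXIT trap runs cleanup exactly once. The child script sends SIGTERM to itself to simulate an interruption.

```shell
#!/usr/bin/env bash
# Sketch only: update_status is a stub, and the flag/guard names are hypothetical.
tmpdir=$(mktemp -d)
cat > "$tmpdir/loop.sh" <<'EOF'
#!/usr/bin/env bash
interrupted_by_signal=false
cleanup_done=false

update_status() { echo "status: $1 $2"; }    # stand-in for the real function

cleanup() {
    local trap_exit_code=$?                   # capture before anything overwrites $?
    [[ "$cleanup_done" == true ]] && return   # reentrancy guard: only run once
    cleanup_done=true
    if [[ "$interrupted_by_signal" == true ]]; then
        update_status "interrupted" "stopped"
    fi
    echo "cleanup ran (exit code $trap_exit_code)"
}

# Signal path: mark the interruption, then exit so the EXIT trap runs cleanup.
on_signal() { interrupted_by_signal=true; exit 130; }

trap cleanup EXIT
trap on_signal INT TERM

if [[ "$1" == "signal" ]]; then
    kill -TERM $$                             # simulate an external SIGTERM
fi
update_status "completed" "success"
EOF

normal_run=$(bash "$tmpdir/loop.sh" normal)
signal_run=$(bash "$tmpdir/loop.sh" signal)
printf 'normal run:\n%s\n\nsignal run:\n%s\n' "$normal_run" "$signal_run"
rm -rf "$tmpdir"
```

In this arrangement a normal exit leaves the completion status untouched, and only the signal path writes the interrupted status.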

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ff27b4 and 1350058.

📒 Files selected for processing (15)
  • CLAUDE.md
  • lib/circuit_breaker.sh
  • lib/enable_core.sh
  • lib/response_analyzer.sh
  • ralph_loop.sh
  • setup.sh
  • templates/ralphrc.template
  • tests/unit/test_circuit_breaker_recovery.bats
  • tests/unit/test_cli_modern.bats
  • tests/unit/test_exit_detection.bats
  • tests/unit/test_file_protection.bats
  • tests/unit/test_integrity_check.bats
  • tests/unit/test_json_parsing.bats
  • tests/unit/test_rate_limiting.bats
  • tests/unit/test_session_continuity.bats
💤 Files with no reviewable changes (1)
  • tests/unit/test_integrity_check.bats

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (3)
ralph_loop.sh (2)

745: SC2155: Separate declaration from assignment.

Same pattern as lines 576-577.

♻️ Proposed fix for SC2155
-        local incomplete_tasks=$(grep -cE "^[[:space:]]*- \[ \]" "$RALPH_DIR/fix_plan.md" 2>/dev/null || echo "0")
+        local incomplete_tasks
+        incomplete_tasks=$(grep -cE "^[[:space:]]*- \[ \]" "$RALPH_DIR/fix_plan.md" 2>/dev/null || echo "0")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` at line 745, The SC2155 warning indicates you should not
combine declaration and command-substitution assignment; modify the block that
sets the local variable incomplete_tasks (which counts " - [ ]" in
"$RALPH_DIR/fix_plan.md") by declaring the variable with local incomplete_tasks
first, then on the next line assign it using the grep/... || echo "0" command
substitution; ensure you still redirect errors to /dev/null and preserve the
same pattern used earlier (see the similar handling around lines with the same
pattern).
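
One detail worth verifying against the real script before relying on the || echo "0" fallback: grep -c already prints 0 when nothing matches, and it also exits with status 1 in that case, so the fallback fires as well. The sketch below uses a throwaway fixture (not the project's fix_plan.md) to show the resulting value, plus one possible alternative that defaults only when grep produced no output. The alternative is an assumption, not code from the PR.

```shell
#!/usr/bin/env bash
# Throwaway fixture: a plan file that contains no unchecked boxes.
plan=$(mktemp)
printf -- '- [x] done item\n' > "$plan"

# grep -c prints "0" AND exits 1 on zero matches, so echo "0" runs as well.
with_echo=$(grep -cE '^[[:space:]]*- \[ \]' "$plan" 2>/dev/null || echo "0")

# Possible alternative (not the PR's code): ignore the status and default only
# when grep printed nothing, for example because the file is missing.
with_default=$(grep -cE '^[[:space:]]*- \[ \]' "$plan" 2>/dev/null)
with_default=${with_default:-0}

missing=$(grep -cE '^[[:space:]]*- \[ \]' "$plan.does-not-exist" 2>/dev/null)
missing=${missing:-0}

printf 'with || echo "0": [%s]\n' "$with_echo"
printf 'with :-0 default: [%s]\n' "$with_default"
printf 'missing file:     [%s]\n' "$missing"
rm -f "$plan"
```

If the first value is later used in arithmetic such as $((uncompleted_items + completed_items)), the embedded newline would make the expression invalid, so the zero-match case deserves a test of its own.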

576-577: SC2155: Separate declaration from assignment to avoid masking return values.

While the || echo "0" fallback handles grep failures gracefully, combining local declaration with command substitution masks the return value. This is flagged by shellcheck.

♻️ Proposed fix for SC2155
-        local uncompleted_items=$(grep -cE "^[[:space:]]*- \[ \]" "$RALPH_DIR/fix_plan.md" 2>/dev/null || echo "0")
-        local completed_items=$(grep -cE "^[[:space:]]*- \[[xX]\]" "$RALPH_DIR/fix_plan.md" 2>/dev/null || echo "0")
+        local uncompleted_items
+        uncompleted_items=$(grep -cE "^[[:space:]]*- \[ \]" "$RALPH_DIR/fix_plan.md" 2>/dev/null || echo "0")
+        local completed_items
+        completed_items=$(grep -cE "^[[:space:]]*- \[[xX]\]" "$RALPH_DIR/fix_plan.md" 2>/dev/null || echo "0")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 576 - 577, The two variables uncompleted_items
and completed_items are declared and assigned in one statement which masks the
command substitution return value (SC2155); fix by declaring each local first
(local uncompleted_items; local completed_items) and then assign them in
separate statements using the existing command substitutions with the || echo
"0" fallback (e.g., uncompleted_items=$(grep -cE "^[[:space:]]*- \[ \]"
"$RALPH_DIR/fix_plan.md" 2>/dev/null || echo "0") and similarly for
completed_items) so the return status of the commands is preserved and
shellcheck SC2155 is resolved.
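
What SC2155 guards against can be shown in a few lines. The function names below are hypothetical; `false` stands in for any failing command inside the substitution.

```shell
#!/usr/bin/env bash
# Illustrative functions (hypothetical names) showing what SC2155 warns about.

combined() {
    # 'local' itself succeeds, so the failure of 'false' is masked.
    local value=$(false)
    echo $?
}

separated() {
    local value
    # A plain assignment reports the status of the command substitution.
    value=$(false)
    echo $?
}

combined_status=$(combined)
separated_status=$(separated)
echo "combined:  $combined_status"
echo "separated: $separated_status"
```

With explicit error handling replacing set -e, the separated form matters more, because it is the only one of the two whose status can be checked with if or ||.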
tests/unit/test_cli_modern.bats (1)

111: Test grep pattern differs from implementation - missing indented checkbox support.

The test uses grep -c "^- \[ \]" but ralph_loop.sh at line 745 uses grep -cE "^[[:space:]]*- \[ \]" to support indented checkboxes. This won't cause test failures with current fixtures but represents drift between test and implementation.

♻️ Proposed fix to align with implementation
         if [[ -f "$RALPH_DIR/fix_plan.md" ]]; then
-            local incomplete_tasks=$(grep -c "^- \[ \]" "$RALPH_DIR/fix_plan.md" 2>/dev/null || echo "0")
+            local incomplete_tasks=$(grep -cE "^[[:space:]]*- \[ \]" "$RALPH_DIR/fix_plan.md" 2>/dev/null || echo "0")
             context+="Remaining tasks: ${incomplete_tasks}. "
         fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_cli_modern.bats` at line 111, Update the test's grep pattern
to match the implementation's indented-checkbox handling: replace the current
grep -c "^- \[ \]" in the local incomplete_tasks assignment with a grep -cE
using the same regex as ralph_loop.sh ("^[[:space:]]*- \[ \]") so the test
counts both top-level and indented unchecked boxes exactly like the code under
test.
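
The drift between the two patterns only shows up with indented checkboxes. A small sketch with a hypothetical fixture (not one of the project's test fixtures) makes the difference visible:

```shell
#!/usr/bin/env bash
# Hypothetical fixture with one top-level and one indented unchecked box.
plan=$(mktemp)
printf -- '- [ ] top-level task\n  - [ ] indented subtask\n- [x] finished task\n' > "$plan"

# Pattern from the test: the dash must sit at column 0.
test_count=$(grep -c '^- \[ \]' "$plan")

# Pattern from ralph_loop.sh: leading whitespace is allowed.
impl_count=$(grep -cE '^[[:space:]]*- \[ \]' "$plan")

echo "test pattern:           $test_count"
echo "implementation pattern: $impl_count"
rm -f "$plan"
```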

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 1350058 and 8b5a662.

📒 Files selected for processing (3)
  • CLAUDE.md
  • ralph_loop.sh
  • tests/unit/test_cli_modern.bats

Replace global set -e with targeted error handling patterns:
- Add explicit error guards on all library source statements
- Replace || true with || echo "0" for consistent grep fallbacks
- Remove set +e/set -e toggles in live mode pipeline (no longer needed)
- Separate stderr to dedicated file to prevent jq pipeline corruption (frankbria#190)
- Improve cleanup() with trap_exit_code capture and reentrancy guard
- Add conditional analysis failure handling with stale file cleanup
- Remove unnecessary || true from reset_session and integrity check

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@timothy-20 timothy-20 force-pushed the refactor/remove-set-e branch from 8b5a662 to b0fc108 on February 28, 2026 08:33

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
ralph_loop.sh (1)

1215-1237: LGTM! Stderr separation prevents JSON pipeline corruption.

Redirecting stderr to a separate file (2>"$stderr_file") is the correct solution for Issue #190. Node.js warnings would otherwise corrupt the jq JSON stream. The cleanup logic appropriately removes empty files and logs warnings for non-empty ones.

Optional: To address SC2155, you could separate declaration and assignment:

♻️ Optional refactor for strict shellcheck compliance
-        local stderr_file="${LOG_DIR}/claude_stderr_$(date '+%Y%m%d_%H%M%S').log"
+        local stderr_file
+        stderr_file="${LOG_DIR}/claude_stderr_$(date '+%Y%m%d_%H%M%S').log"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 1215 - 1237, The shellcheck SC2155 warning can be
resolved by separating variable declarations from command substitutions: instead
of using "local stderr_file=\"${LOG_DIR}/claude_stderr_$(date
'+%Y%m%d_%H%M%S').log\"" declare the variable first (e.g., local stderr_file)
and then assign stderr_file="$(...)" so the command substitution isn't done in
the same line as the local. Do the same for other local-with-assignment patterns
in this block (e.g., "local -a pipe_status" then pipe_status=(
"${PIPESTATUS[@]}" ), and declare exit_code with local before assigning) while
keeping the same variable names (stderr_file, portable_timeout invocation,
PIPESTATUS capture into pipe_status, exit_code, and
log_status/CLAUDE_TIMEOUT_MINUTES handling) and preserving the existing behavior
and cleanup logic.
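
The effect of the stderr separation can be sketched without the real CLI or jq. Here `fake_cli` is a hypothetical stand-in that writes one JSON line to stdout and one warning to stderr; the temp file plays the role of the dedicated stderr log.

```shell
#!/usr/bin/env bash
# fake_cli is a hypothetical stand-in for the Claude CLI:
# one JSON line on stdout, one warning on stderr.
fake_cli() {
    echo '(node) Warning: something deprecated' >&2
    echo '{"type":"result","ok":true}'
}

stderr_file=$(mktemp)

# Merged: the warning lands in the same stream a JSON parser would read.
merged=$(fake_cli 2>&1)

# Separated: stdout stays pure JSON, stderr goes to its own file.
separated=$(fake_cli 2>"$stderr_file")

echo "merged line count:    $(printf '%s\n' "$merged" | wc -l)"
echo "separated line count: $(printf '%s\n' "$separated" | wc -l)"

# Log non-empty stderr as a warning, then remove the temporary file.
captured_stderr=""
if [ -s "$stderr_file" ]; then
    captured_stderr=$(cat "$stderr_file")
    echo "WARN: stderr captured: $captured_stderr"
fi
rm -f "$stderr_file"
```

In the merged case a downstream JSON filter would receive a non-JSON first line, which is the kind of corruption Issue #190 describes.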
tests/unit/test_cli_modern.bats (1)

1180-1186: Consider strengthening the set -e detection pattern.

The pattern '^set -e' only matches set -e at line start. An indented set -e (e.g., inside a function) would not be caught.

♻️ Optional: more robust pattern
-    run bash -c "grep -n '^set -e' '$script'"
+    run bash -c "grep -nE '^[[:space:]]*set -e' '$script'"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_cli_modern.bats` around lines 1180 - 1186, The test
"ralph_loop.sh does not use set -e" needs a stronger detection pattern: change
the grep invocation that currently searches for '^set -e' to use an
extended-regex that allows leading whitespace and flexible spacing between 'set'
and '-e' (e.g. match lines with optional indentation then 'set' then spaces then
'-e'), and ensure commented lines are excluded (filter out lines that begin with
optional whitespace followed by '#'); update the test around the script variable
'script' and the run/grep pipeline to use the extended regex and the
comment-filter so indented or spaced occurrences are caught but comments remain
ignored.
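
A sketch of how the two detection patterns behave, using a hypothetical sample script in which set -e appears only indented and inside a comment. The looser regex is one way to read the suggestion above, not necessarily the exact pattern the test should adopt.

```shell
#!/usr/bin/env bash
# Hypothetical sample: set -e appears indented and in a comment, never at column 0.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/env bash
# set -e was removed from this script
run_step() {
    set -e
    echo "step"
}
EOF

# Original pattern: matches only at the start of a line.
anchored_hits=$(grep -c '^set -e' "$script")

# Looser pattern: optional indentation, then one or more spaces after 'set'.
# Comment lines cannot match because '#' is not whitespace.
loose_hits=$(grep -cE '^[[:space:]]*set[[:space:]]+-e' "$script")

echo "anchored: $anchored_hits"
echo "loose:    $loose_hits"
rm -f "$script"
```

Because the looser pattern already requires the line to begin with optional whitespace followed by set, a separate comment filter is only needed if the pattern is relaxed further.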

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 8b5a662 and b0fc108.

📒 Files selected for processing (3)
  • CLAUDE.md
  • ralph_loop.sh
  • tests/unit/test_cli_modern.bats

timothy-20 added a commit to timothy-20/ralph-claude-code that referenced this pull request Feb 28, 2026
…nkbria#208)

Bug 3 (stderr separation: 2>&1 → 2>"$stderr_file") is already
implemented in PR frankbria#208 (refactor/remove-set-e). Removing it from
this PR eliminates the merge conflict between frankbria#202 and frankbria#208 in
the ralph_loop.sh pipeline area and prevents duplicate changes.

Changes:
- ralph_loop.sh: revert pipeline to 2>&1, remove stderr_file var
  and stderr logging block
- test_cli_modern.bats: revert grep patterns to 2>&1, remove 3
  stderr separation tests
- CLAUDE.md: remove stderr separation paragraph and test description

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>