fix(loop): Foundry credential isolation, model override, and clean Ctrl+C shutdown by tudoanh · Pull Request #186 · frankbria/ralph-claude-code

tudoanh · 2026-02-18T10:51:22Z

Summary

Foundry credential isolation: Prevent Azure Foundry credentials (ANTHROPIC_FOUNDRY_*) from leaking across concurrent Ralph tmux sessions. When CLAUDE_CODE_USE_FOUNDRY != 1, actively unset Foundry vars at both the tmux session level
(forward_tmux_env) and the process environment level (personal account code path).
CLAUDE_MODEL support: Add CLAUDE_MODEL config in .ralphrc that gets passed as --model to the Claude Code CLI, allowing users to override the default model (e.g., CLAUDE_MODEL=opus).
Clean Ctrl+C shutdown: Fix orphaned Claude processes when pressing Ctrl+C in --live mode. Track child PIDs via CLAUDE_CHILD_PID and kill the entire process tree in cleanup().
Subshell pipeline (improves on set -e + set -o pipefail causes silent script death on Claude timeout #175): Wrap the live-mode streaming pipeline in a backgrounded subshell (( ... ) & + wait). This naturally isolates the pipeline's exit code from set -e — a cleaner approach than the set +e/set -e toggle,
which is brittle and the root cause of bugs like bug: set -e causes script termination on API limit prompt timeout (read -t 30) #185.

Context

When running two Ralph projects simultaneously in tmux (e.g., one company project with Azure Foundry, one personal with Claude Max), the Foundry credentials leak from the first session into the second because tmux captures the parent shell's full
environment as the global tmux environment.

The set -e approach for handling pipeline exit codes (Issue #175) is fragile — any new code path that returns non-zero (like read -t 30 in #185) silently kills the script. The subshell approach solves both the errexit problem and the Ctrl+C orphan
problem in one pattern.

Changes

ralph_loop.sh: Foundry env isolation, CLAUDE_MODEL passthrough, CLAUDE_CHILD_PID cleanup, subshell pipeline
tests/unit/test_cli_modern.bats: Updated 2 tests to match subshell approach (replaces set +e/set -e assertions)

Test plan

All 548 tests pass (547 pass, 1 pre-existing failure in test_integrity_check.bats unrelated to this PR — wait_for_reset's while true appears before the integrity check)
Start Ralph for Foundry project — verify it uses Azure Foundry credentials
Start Ralph for personal project concurrently — verify no Foundry credential leakage
Set CLAUDE_MODEL=opus in .ralphrc — verify --model opus appears in CLI args
Press Ctrl+C during --live mode — verify no orphaned Claude processes

Closes #185 (related — subshell approach eliminates the class of set -e bugs)

Summary by CodeRabbit

New Features
- Specify Claude model per session and enable Chrome integration toggle.
- Forward configurable tmux environment entries (including wildcards) into sessions.
- Foundry vs personal-account credential selection with detection and export/unset behavior.
- Session resume support and interactive handling for long (5‑hour) API limits.
Bug Fixes
- Improved Claude process tracking and subtree cleanup.
- Live-streaming runs in isolated background subshells for robust timeout and cleanup.
- Circuit-breaker now polls for recovery instead of immediate aborts; status/log improvements.
Tests
- Updated tests for subshell/background execution, wait-based exit handling, and timeout logging.

macroscopeapp · 2026-02-18T10:52:36Z

Add Foundry credential isolation, model override, Chrome flag, tmux env forwarding, circuit-breaker recovery wait, and reliable Ctrl+C shutdown to `ralph_loop.sh`

ralph_loop.sh adds Foundry/Azure Anthropic env handling with isolation, model and --chrome overrides in build_claude_command, tmux env forwarding via forward_tmux_env, loop auto-restart and flag propagation in setup_tmux_session, circuit-breaker recovery via wait_for_circuit_recovery, PID-tracked execution and cleanup in execute_claude_code with CLAUDE_CHILD_PID, and hard kill of child processes in cleanup. Tests update in tests/unit/test_cli_modern.bats.

📍Where to Start

Start with the CLI flow in main and execute_claude_code in ralph_loop.sh, then review forward_tmux_env, setup_tmux_session, and wait_for_circuit_recovery for control flow and env handling.

Macroscope summarized a0cb953.

macroscopeapp · 2026-02-18T10:55:31Z

ralph_loop.sh

+    if [[ -n "$CLAUDE_CHILD_PID" ]]; then
+        # Kill the entire process group rooted at CLAUDE_CHILD_PID
+        # This catches: timeout -> stdbuf -> claude (and pipeline peers in live mode)
+        kill -- -"$CLAUDE_CHILD_PID" 2>/dev/null || true


🟠 High

ralph_loop.sh:1579 kill -- -"$CLAUDE_CHILD_PID" assumes the PID equals a PGID, but backgrounded subshells in non-interactive scripts inherit the parent's process group. This silently fails (ESRCH) or kills unrelated processes. Consider using pkill -P to kill by parent PID, or setsid when spawning to create a new session/process group.

🚀 Want me to fix this? Reply ex: "fix it for me".

🤖 Prompt for AI

In file ralph_loop.sh around line 1579: `kill -- -"$CLAUDE_CHILD_PID"` assumes the PID equals a PGID, but backgrounded subshells in non-interactive scripts inherit the parent's process group. This silently fails (ESRCH) or kills unrelated processes. Consider using `pkill -P` to kill by parent PID, or `setsid` when spawning to create a new session/process group.

macroscopeapp · 2026-02-18T10:55:31Z

ralph_loop.sh

+
+    [[ -z "$patterns" ]] && return 0
+
+    for pattern in $patterns; do


🟡 Medium

ralph_loop.sh:203 Unquoted $patterns causes glob expansion—if files match ANTHROPIC_*, they replace the pattern. Consider wrapping the loop with set -f / set +f to disable globbing.

🚀 Want me to fix this? Reply ex: "fix it for me".

🤖 Prompt for AI

In file ralph_loop.sh around line 203: Unquoted `$patterns` causes glob expansion—if files match `ANTHROPIC_*`, they replace the pattern. Consider wrapping the loop with `set -f` / `set +f` to disable globbing.

macroscopeapp · 2026-02-18T10:55:31Z

ralph_loop.sh

    [[ -n "$_env_CB_COOLDOWN_MINUTES" ]] && CB_COOLDOWN_MINUTES="$_env_CB_COOLDOWN_MINUTES"
    [[ -n "$_env_CB_AUTO_RESET" ]] && CB_AUTO_RESET="$_env_CB_AUTO_RESET"
+    [[ -n "$_env_RALPH_TMUX_ENV" ]] && RALPH_TMUX_ENV="$_env_RALPH_TMUX_ENV"
+    [[ -n "$_env_CLAUDE_MODEL" ]] && CLAUDE_MODEL="$_env_CLAUDE_MODEL"


🟡 Medium

ralph_loop.sh:177 Missing restoration of _env_CLAUDE_CODE_USE_FOUNDRY and _env_ANTHROPIC_* variables. These are captured earlier but not restored here, so .ralphrc values can override user environment settings, violating the documented precedence. Consider adding restoration lines for these Foundry-related variables.

- [[ -n "$_env_CLAUDE_MODEL" ]] && CLAUDE_MODEL="$_env_CLAUDE_MODEL" + [[ -n "$_env_CLAUDE_MODEL" ]] && CLAUDE_MODEL="$_env_CLAUDE_MODEL" + [[ -n "$_env_CLAUDE_CODE_USE_FOUNDRY" ]] && CLAUDE_CODE_USE_FOUNDRY="$_env_CLAUDE_CODE_USE_FOUNDRY" + [[ -n "$_env_ANTHROPIC_FOUNDRY_RESOURCE" ]] && ANTHROPIC_FOUNDRY_RESOURCE="$_env_ANTHROPIC_FOUNDRY_RESOURCE" + [[ -n "$_env_ANTHROPIC_FOUNDRY_API_KEY" ]] && ANTHROPIC_FOUNDRY_API_KEY="$_env_ANTHROPIC_FOUNDRY_API_KEY"

🚀 Want me to fix this? Reply ex: "fix it for me".

🤖 Prompt for AI

In file ralph_loop.sh around line 177: Missing restoration of `_env_CLAUDE_CODE_USE_FOUNDRY` and `_env_ANTHROPIC_*` variables. These are captured earlier but not restored here, so `.ralphrc` values can override user environment settings, violating the documented precedence. Consider adding restoration lines for these Foundry-related variables.

coderabbitai · 2026-02-18T11:07:03Z

Walkthrough

Adds tmux environment forwarding, a circuit-breaker cooldown polling flow, Foundry/Anthropic model/auth wiring, subshell-based live Claude execution with CLAUDE_CHILD_PID tracking and improved cleanup, plus corresponding unit-test updates to assert background/subshell and wait-based exit handling.

Changes

Cohort / File(s)	Summary
Core CLI / Runtime `ralph_loop.sh`	Adds `forward_tmux_env()` and integrates into session setup; adds `wait_for_circuit_recovery()` and replaces immediate OPEN aborts with polling; wires `CLAUDE_MODEL`, `CLAUDE_CHROME`, `CLAUDE_CODE_USE_FOUNDRY`, and `ANTHROPIC_*` envs; appends `--model`/`--chrome` to Claude CLI when set; implements session resume by `session_id`. Tracks Claude process with `CLAUDE_CHILD_PID`, runs live streaming inside a backgrounded subshell `() &`, captures exit via `wait`, and expands cleanup to kill Claude process subtrees and unset Foundry creds when appropriate.
Unit Tests (modern CLI) `tests/unit/test_cli_modern.bats`	Updates live-mode tests to assert the live pipeline runs inside a backgrounded subshell and captures `CLAUDE_CHILD_PID`, verifies exit capture via `wait` and clearing of `CLAUDE_CHILD_PID`, and adds timeout/logging and stdin-redirection-related assertions replacing previous errexit/PIPESTATUS ordering checks.

Sequence Diagram(s)

sequenceDiagram
    participant Main as ralph_loop
    participant CB as CircuitBreaker
    participant Recovery as wait_for_circuit_recovery
    participant ClaudeCLI as Claude CLI

    Main->>CB: query circuit state
    CB-->>Main: OPEN
    Main->>Recovery: start polling loop
    Recovery->>CB: poll state (repeat)
    CB-->>Recovery: OPEN (repeat)
    CB-->>Recovery: HALF_OPEN/CLOSED
    Recovery-->>Main: recovered
    Main->>ClaudeCLI: resume/execute request (may include --model/--chrome)
    ClaudeCLI-->>Main: stream output / exit

sequenceDiagram
    participant Main as ralph_loop
    participant Builder as build_claude_command()
    participant Subshell as backgrounded subshell () &
    participant Claude as Claude CLI
    participant Reaper as cleanup / wait

    Main->>Builder: build command (includes CLAUDE_MODEL/CLAUDE_CHROME if set)
    Main->>Subshell: launch subshell with command
    Subshell->>Claude: exec Claude CLI (child processes)
    Subshell-->>Main: export CLAUDE_CHILD_PID (background)
    Main->>Reaper: wait $CLAUDE_CHILD_PID
    Reaper-->>Main: exit_code=$?
    Main->>Reaper: cleanup process tree using CLAUDE_CHILD_PID

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat(circuit-breaker): add auto-recovery from OPEN state #165 — Adds auto-recovery/open→half-open cooldown and opened_at handling; strongly related to the new wait_for_circuit_recovery behavior.
fix(live-mode): crash with text output format (#164) #166 — Modifies live-mode streaming and command construction; overlaps with the subshell/background execution and CLAUDE_CHILD_PID capture changes.
fix(loop): replace non-existent --prompt-file with -p flag #58 — Changes build_claude_command argument construction; directly related to the added --model/--chrome wiring and CLI path fallbacks.

Poem

🐰
I hop through envs and gently bind,
I wait when circuits lose their mind,
I launch in shells that hum and sigh,
I track the pids as time goes by,
A rabbit cheers: the loop runs fine!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title clearly and concisely summarizes the three main changes: Foundry credential isolation, model override support, and clean Ctrl+C shutdown, directly reflecting the core objectives.
Linked Issues check	✅ Passed	The PR fully addresses the linked issue `#185` by replacing brittle set +e/set -e toggles with a subshell pipeline approach that robustly handles read timeouts, and the updated tests validate this behavioral change.
Out of Scope Changes check	✅ Passed	All changes directly support the stated objectives: tmux environment forwarding isolates Foundry credentials, CLAUDE_MODEL enables model override, CLAUDE_CHILD_PID tracks processes for clean shutdown, and subshell pipeline fixes set -e fragility.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

ralph_loop.sh (1)
1800-1804: ⚠️ Potential issue | 🟠 Major

Issue #185 (read -t 30 + set -e) appears unfixed here.

Line 1801 read -t 30 -n 1 user_choice returns non-zero on timeout, which under set -e exits the script — exactly the bug described in Issue #185. This path is reached when the API 5-hour limit is hit (exec_result -eq 2). While this line isn't changed in this PR, the PR references Issue #185 as the motivating problem.
Suggested fix
-            read -t 30 -n 1 user_choice
+            read -t 30 -n 1 user_choice || true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 1800 - 1804, The read with timeout (`read -t 30
-n 1 user_choice`) can return non-zero on timeout and cause the script to exit
under set -e; change the call so its non-zero exit doesn't abort the script (for
example, wrap it to ignore failures or capture its status), e.g. perform the
timed read in a conditional and set user_choice="" on timeout or use `|| true`
after the read so subsequent logic that checks user_choice (the existing if [[
"$user_choice" == "2" ]] || [[ -z "$user_choice" ]]) continues to run; update
the read invocation near where user_choice is used to ensure timeouts don't
trigger script termination.

🧹 Nitpick comments (4)

ralph_loop.sh (3)
1404-1406: LGTM — CLAUDE_CHILD_PID tracking in background mode enables Ctrl+C cleanup.

Note: the wait $claude_pid on line 1451 has the same set -e exposure as the live-mode path flagged above. Consider applying the same wait $claude_pid || exit_code=$? fix here as well for consistency, though this pattern may have been pre-existing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 1404 - 1406, The background-mode path captures
the child PID into claude_pid and CLAUDE_CHILD_PID but still performs a bare
wait "wait $claude_pid" which will trigger set -e on failure; change the wait
logic used for the live-mode path to a safe form here as well (e.g., replace the
direct wait with "wait $claude_pid || exit_code=$?" style handling and
propagate/handle exit_code consistently), updating any downstream cleanup or
exit logic that inspects exit_code so background-mode exit handling mirrors
live-mode; reference variables/functions: claude_pid, CLAUDE_CHILD_PID, wait
$claude_pid, and exit_code.
1707-1709: Behavioral change: circuit breaker OPEN now blocks indefinitely instead of exiting.

Previously the loop would break on circuit breaker OPEN. Now it calls wait_for_circuit_recovery (unbounded) and continue. This is intentional per the PR but worth confirming this is the desired UX — operators may prefer a configurable max-wait or an exit after N failed recoveries.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 1707 - 1709, The change makes the circuit-breaker
OPEN path call wait_for_circuit_recovery("$loop_count") and continue, which can
block indefinitely; change this to honor a configurable max wait or max recovery
attempts by adding a check around the log_status/wait_for_circuit_recovery block
(referencing log_status, wait_for_circuit_recovery, and loop_count) so that if a
new config variable (e.g., MAX_RECOVERY_ATTEMPTS or MAX_RECOVERY_WAIT) is
exceeded you log a final error and break/exit instead of continuing; implement
incrementing/tracking of recovery attempts (or elapsed wait) and use that to
decide between calling wait_for_circuit_recovery and breaking out of the loop.
1465-1474: Non-portable \| alternation in grep — use grep -E for reliability.

Line 1468 uses \| for alternation in basic regex, which is a GNU extension not supported by BSD grep (macOS). Since this is informational logging, the impact is low, but it's trivially fixable.
Proposed fix
-            if grep -q '"usage"\|"billing"\|"credits_remaining"' "$output_file" 2>/dev/null; then
+            if grep -qE '"usage"|"billing"|"credits_remaining"' "$output_file" 2>/dev/null; then
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 1465 - 1474, The grep alternation using '\|' is
non-portable; update the check that reads "$output_file" (the grep command that
currently tests for '"usage"\|"billing"\|"credits_remaining"') to use extended
regex mode (e.g., grep -E or grep -P if available) with a single quoted
alternation like '("usage"|"billing"|"credits_remaining")', keep the -q and
2>/dev/null redirect, and leave the surrounding conditional logic unchanged (the
block referencing output_file, CLAUDE_CODE_USE_FOUNDRY, and log_status should
remain the same).
tests/unit/test_cli_modern.bats (1)
838-869: Add unit tests for forward_tmux_env() and wait_for_circuit_recovery() functions.

Both functions are in ralph_loop.sh but have no corresponding tests in this file. forward_tmux_env() handles glob pattern matching, prefix expansion, and conditional Foundry credential unsetting. wait_for_circuit_recovery() implements a polling loop with state checking and timestamp calculations. Both warrant unit test coverage per coding guidelines.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_cli_modern.bats` around lines 838 - 869, Add unit tests
covering forward_tmux_env() and wait_for_circuit_recovery(): for
forward_tmux_env() write cases that verify glob pattern matching and prefix
expansion produce expected environment variable exports and that when a variable
name matches the Foundry credential pattern it unsets
FOUNDRY_USERNAME/FOUNDRY_PASSWORD appropriately (including negative cases); for
wait_for_circuit_recovery() write tests that stub the dependencies/state checker
to simulate state transitions, verify the polling loop calls sleep as expected,
that timestamp/calculation logic yields correct backoff/timeout behavior, and
that success and timeout exit conditions set/return the correct codes—place
these tests alongside the existing live-mode tests and use the same sed/grep
pattern extraction approach to invoke and assert on the ralph_loop.sh functions.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ralph_loop.sh`:
- Around line 1207-1209: The current code logs the first 10 characters of
ANTHROPIC_FOUNDRY_API_KEY via the log_status call, exposing sensitive secret
material; change the behavior in the block that checks ANTHROPIC_FOUNDRY_API_KEY
so it does not emit partial keys to $LOG_DIR/ralph.log — instead log only
presence/absence (e.g., "ANTHROPIC_FOUNDRY_API_KEY: set" or "not set") or, if
you must show a hint, mask to a minimal safe slice such as a single character or
last 4 chars; update the log_status invocation that references
ANTHROPIC_FOUNDRY_API_KEY accordingly to avoid writing key material to logs.
- Around line 50-61: The comment for the Foundry flag is misleading: it says to
set CLAUDE_CODE_USE_FOUNDRY=true but the script defaults
CLAUDE_CODE_USE_FOUNDRY="${CLAUDE_CODE_USE_FOUNDRY:-0}" and later checks for ==
"1"; update the comment to instruct users to set CLAUDE_CODE_USE_FOUNDRY=1 (or
alternatively modify the runtime check to accept "true" as well) so the behavior
of CLAUDE_CODE_USE_FOUNDRY, its default, and the check (referenced by
CLAUDE_CODE_USE_FOUNDRY and the runtime comparison around the Foundry check) are
consistent.
- Around line 509-543: The wait_for_circuit_recovery function lacks an upper
bound and documentation; update it to accept or use a MAX_WAIT_MINUTES (or
MAX_ITERATIONS) constant and enforce a maximum loop duration/iteration count
inside wait_for_circuit_recovery so that if get_circuit_state remains
"$CB_STATE_OPEN" beyond the limit it logs an error, updates status (e.g.,
"circuit_breaker_failed" or "timeout"), and returns a non-zero value; also add a
bash function comment block above wait_for_circuit_recovery describing purpose,
parameters (loop_count, optional max wait), and return values (0 on recovered,
non-zero on timeout/error), and reference functions/vars like
init_circuit_breaker, get_circuit_state, CB_COOLDOWN_MINUTES, CB_STATE_FILE,
parse_iso_to_epoch, and CALL_COUNT_FILE when implementing the timeout handling
and status updates.
- Around line 200-242: The function forward_tmux_env disables globbing with set
-f but can exit early (e.g., if tmux set-environment fails) leaving globbing
disabled; ensure set +f always runs by scoping noglob in a subshell or
installing a short-lived trap: either wrap the pattern-processing loop in a
subshell so set -f is local to that subshell, or add a trap (e.g., trap 'set +f'
RETURN or EXIT) immediately before set -f and clear it after set +f; apply this
to the forward_tmux_env function and ensure every tmux set-environment
invocation (the tmux set-environment calls) either is protected with || true or
is covered by the trap/subshell so set +f is restored on all exit paths.
- Around line 1319-1332: The wait on CLAUDE_CHILD_PID can trigger set -e when
the child exits non‑zero (e.g., timeout 124), so change the wait invocation to
capture the child's exit status without causing an immediate script exit: run
wait $CLAUDE_CHILD_PID with a fallthrough that assigns exit_code (e.g., use
"wait $CLAUDE_CHILD_PID || exit_code=$?") so the script records the actual exit
code for downstream timeout/session/circuit-breaker handling (references:
CLAUDE_CHILD_PID, wait, exit_code, portable_timeout).

---

Outside diff comments:
In `@ralph_loop.sh`:
- Around line 1800-1804: The read with timeout (`read -t 30 -n 1 user_choice`)
can return non-zero on timeout and cause the script to exit under set -e; change
the call so its non-zero exit doesn't abort the script (for example, wrap it to
ignore failures or capture its status), e.g. perform the timed read in a
conditional and set user_choice="" on timeout or use `|| true` after the read so
subsequent logic that checks user_choice (the existing if [[ "$user_choice" ==
"2" ]] || [[ -z "$user_choice" ]]) continues to run; update the read invocation
near where user_choice is used to ensure timeouts don't trigger script
termination.

---

Nitpick comments:
In `@ralph_loop.sh`:
- Around line 1404-1406: The background-mode path captures the child PID into
claude_pid and CLAUDE_CHILD_PID but still performs a bare wait "wait
$claude_pid" which will trigger set -e on failure; change the wait logic used
for the live-mode path to a safe form here as well (e.g., replace the direct
wait with "wait $claude_pid || exit_code=$?" style handling and propagate/handle
exit_code consistently), updating any downstream cleanup or exit logic that
inspects exit_code so background-mode exit handling mirrors live-mode; reference
variables/functions: claude_pid, CLAUDE_CHILD_PID, wait $claude_pid, and
exit_code.
- Around line 1707-1709: The change makes the circuit-breaker OPEN path call
wait_for_circuit_recovery("$loop_count") and continue, which can block
indefinitely; change this to honor a configurable max wait or max recovery
attempts by adding a check around the log_status/wait_for_circuit_recovery block
(referencing log_status, wait_for_circuit_recovery, and loop_count) so that if a
new config variable (e.g., MAX_RECOVERY_ATTEMPTS or MAX_RECOVERY_WAIT) is
exceeded you log a final error and break/exit instead of continuing; implement
incrementing/tracking of recovery attempts (or elapsed wait) and use that to
decide between calling wait_for_circuit_recovery and breaking out of the loop.
- Around line 1465-1474: The grep alternation using '\|' is non-portable; update
the check that reads "$output_file" (the grep command that currently tests for
'"usage"\|"billing"\|"credits_remaining"') to use extended regex mode (e.g.,
grep -E or grep -P if available) with a single quoted alternation like
'("usage"|"billing"|"credits_remaining")', keep the -q and 2>/dev/null redirect,
and leave the surrounding conditional logic unchanged (the block referencing
output_file, CLAUDE_CODE_USE_FOUNDRY, and log_status should remain the same).

In `@tests/unit/test_cli_modern.bats`:
- Around line 838-869: Add unit tests covering forward_tmux_env() and
wait_for_circuit_recovery(): for forward_tmux_env() write cases that verify glob
pattern matching and prefix expansion produce expected environment variable
exports and that when a variable name matches the Foundry credential pattern it
unsets FOUNDRY_USERNAME/FOUNDRY_PASSWORD appropriately (including negative
cases); for wait_for_circuit_recovery() write tests that stub the
dependencies/state checker to simulate state transitions, verify the polling
loop calls sleep as expected, that timestamp/calculation logic yields correct
backoff/timeout behavior, and that success and timeout exit conditions
set/return the correct codes—place these tests alongside the existing live-mode
tests and use the same sed/grep pattern extraction approach to invoke and assert
on the ralph_loop.sh functions.

coderabbitai · 2026-02-18T11:13:45Z

ralph_loop.sh

+# Azure Foundry / Anthropic API configuration
+# Set CLAUDE_CODE_USE_FOUNDRY=true in .ralphrc to use Azure Foundry for this project
+_env_CLAUDE_CODE_USE_FOUNDRY="${CLAUDE_CODE_USE_FOUNDRY:-}"
+_env_ANTHROPIC_FOUNDRY_RESOURCE="${ANTHROPIC_FOUNDRY_RESOURCE:-}"
+_env_ANTHROPIC_FOUNDRY_API_KEY="${ANTHROPIC_FOUNDRY_API_KEY:-}"
+_env_ANTHROPIC_DEFAULT_SONNET_MODEL="${ANTHROPIC_DEFAULT_SONNET_MODEL:-}"
+_env_ANTHROPIC_DEFAULT_HAIKU_MODEL="${ANTHROPIC_DEFAULT_HAIKU_MODEL:-}"
+_env_ANTHROPIC_DEFAULT_OPUS_MODEL="${ANTHROPIC_DEFAULT_OPUS_MODEL:-}"
+_env_CLAUDE_MODEL="${CLAUDE_MODEL:-}"
+
+# Default: don't use Foundry (use personal account)
+CLAUDE_CODE_USE_FOUNDRY="${CLAUDE_CODE_USE_FOUNDRY:-0}"


⚠️ Potential issue | 🟡 Minor

Inconsistent documentation: comment says true but code expects 1.

Line 51 says "Set CLAUDE_CODE_USE_FOUNDRY=true in .ralphrc" but line 61 defaults to "0" and the runtime check on line 1201 compares == "1". A user following the comment would set true, which would not match "1".

Proposed fix

# Azure Foundry / Anthropic API configuration -# Set CLAUDE_CODE_USE_FOUNDRY=true in .ralphrc to use Azure Foundry for this project +# Set CLAUDE_CODE_USE_FOUNDRY=1 in .ralphrc to use Azure Foundry for this project

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@ralph_loop.sh` around lines 50 - 61, The comment for the Foundry flag is misleading: it says to set CLAUDE_CODE_USE_FOUNDRY=true but the script defaults CLAUDE_CODE_USE_FOUNDRY="${CLAUDE_CODE_USE_FOUNDRY:-0}" and later checks for == "1"; update the comment to instruct users to set CLAUDE_CODE_USE_FOUNDRY=1 (or alternatively modify the runtime check to accept "true" as well) so the behavior of CLAUDE_CODE_USE_FOUNDRY, its default, and the check (referenced by CLAUDE_CODE_USE_FOUNDRY and the runtime comparison around the Foundry check) are consistent.

coderabbitai · 2026-02-18T11:13:46Z

ralph_loop.sh

+# Forward environment variables into a tmux session
+# Uses RALPH_TMUX_ENV to determine which vars to forward
+# Supports exact names (e.g., "FOO") and prefix patterns (e.g., "ANTHROPIC_*")
+forward_tmux_env() {
+    local session_name="$1"
+    local patterns="$RALPH_TMUX_ENV"
+
+    [[ -z "$patterns" ]] && return 0
+
+    # Disable globbing so patterns like "ANTHROPIC_*" are not expanded
+    # against filenames in the current directory
+    set -f
+    for pattern in $patterns; do
+        if [[ "$pattern" == *'*' ]]; then
+            # Prefix pattern: match all env vars starting with the prefix
+            local prefix="${pattern%\*}"
+            while IFS= read -r line; do
+                local name="${line%%=*}"
+                local value="${line#*=}"
+                if [[ -n "$name" && -n "$value" ]]; then
+                    tmux set-environment -t "$session_name" "$name" "$value"
+                fi
+            done < <(env | grep "^${prefix}" || true)
+        else
+            # Exact match: forward if set in current environment
+            local value="${!pattern:-}"
+            if [[ -n "$value" ]]; then
+                tmux set-environment -t "$session_name" "$pattern" "$value"
+            fi
+        fi
+    done
+    set +f
+
+    # When Foundry is disabled, actively unset Foundry vars at session level
+    # to prevent leakage from tmux global environment (concurrent session isolation)
+    if [[ "${CLAUDE_CODE_USE_FOUNDRY:-0}" != "1" ]]; then
+        tmux set-environment -t "$session_name" -u ANTHROPIC_FOUNDRY_RESOURCE 2>/dev/null || true
+        tmux set-environment -t "$session_name" -u ANTHROPIC_FOUNDRY_API_KEY 2>/dev/null || true
+        tmux set-environment -t "$session_name" -u ANTHROPIC_DEFAULT_SONNET_MODEL 2>/dev/null || true
+        tmux set-environment -t "$session_name" -u ANTHROPIC_DEFAULT_HAIKU_MODEL 2>/dev/null || true
+        tmux set-environment -t "$session_name" -u ANTHROPIC_DEFAULT_OPUS_MODEL 2>/dev/null || true
+    fi
+}


⚠️ Potential issue | 🟡 Minor

set -f is not restored if an early exit occurs between Lines 211 and 231.

Under set -e, if tmux set-environment (lines 220, 227) fails, the script exits with globbing still disabled. Consider using a trap or restructuring to ensure set +f runs on all exit paths.

Proposed fix — use a subshell to scope noglob

- set -f - for pattern in $patterns; do + # Run in subshell so set -f (noglob) is automatically scoped + ( + set -f + for pattern in $patterns; do if [[ "$pattern" == *'*' ]]; then ... fi done - set +f + )

Alternatively, add || true to the tmux set-environment calls on lines 220 and 227 to match the defensive style already used on lines 236–240.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Forward environment variables into a tmux session

# Uses RALPH_TMUX_ENV to determine which vars to forward

# Supports exact names (e.g., "FOO") and prefix patterns (e.g., "ANTHROPIC_*")

forward_tmux_env() {

local session_name="$1"

local patterns="$RALPH_TMUX_ENV"

[[ -z "$patterns" ]] && return 0

# Disable globbing so patterns like "ANTHROPIC_*" are not expanded

# against filenames in the current directory

set -f

for pattern in $patterns; do

if [[ "$pattern" == *'*' ]]; then

# Prefix pattern: match all env vars starting with the prefix

local prefix="${pattern%\*}"

while IFS= read -r line; do

local name="${line%%=*}"

local value="${line#*=}"

if [[ -n "$name" && -n "$value" ]]; then

tmux set-environment -t "$session_name" "$name" "$value"

fi

done < <(env | grep "^${prefix}" || true)

else

# Exact match: forward if set in current environment

local value="${!pattern:-}"

if [[ -n "$value" ]]; then

tmux set-environment -t "$session_name" "$pattern" "$value"

fi

fi

done

set +f

# When Foundry is disabled, actively unset Foundry vars at session level

# to prevent leakage from tmux global environment (concurrent session isolation)

if [[ "${CLAUDE_CODE_USE_FOUNDRY:-0}" != "1" ]]; then

tmux set-environment -t "$session_name" -u ANTHROPIC_FOUNDRY_RESOURCE 2>/dev/null || true

tmux set-environment -t "$session_name" -u ANTHROPIC_FOUNDRY_API_KEY 2>/dev/null || true

tmux set-environment -t "$session_name" -u ANTHROPIC_DEFAULT_SONNET_MODEL 2>/dev/null || true

tmux set-environment -t "$session_name" -u ANTHROPIC_DEFAULT_HAIKU_MODEL 2>/dev/null || true

tmux set-environment -t "$session_name" -u ANTHROPIC_DEFAULT_OPUS_MODEL 2>/dev/null || true

fi

}

# Forward environment variables into a tmux session

# Uses RALPH_TMUX_ENV to determine which vars to forward

# Supports exact names (e.g., "FOO") and prefix patterns (e.g., "ANTHROPIC_*")

forward_tmux_env() {

local session_name="$1"

local patterns="$RALPH_TMUX_ENV"

[[ -z "$patterns" ]] && return 0

# Run in subshell so set -f (noglob) is automatically scoped

(

set -f

for pattern in $patterns; do

if [[ "$pattern" == *'*' ]]; then

# Prefix pattern: match all env vars starting with the prefix

local prefix="${pattern%\*}"

while IFS= read -r line; do

local name="${line%%=*}"

local value="${line#*=}"

if [[ -n "$name" && -n "$value" ]]; then

tmux set-environment -t "$session_name" "$name" "$value"

fi

done < <(env | grep "^${prefix}" || true)

else

# Exact match: forward if set in current environment

local value="${!pattern:-}"

if [[ -n "$value" ]]; then

tmux set-environment -t "$session_name" "$pattern" "$value"

fi

fi

done

)

# When Foundry is disabled, actively unset Foundry vars at session level

# to prevent leakage from tmux global environment (concurrent session isolation)

if [[ "${CLAUDE_CODE_USE_FOUNDRY:-0}" != "1" ]]; then

tmux set-environment -t "$session_name" -u ANTHROPIC_FOUNDRY_RESOURCE 2>/dev/null || true

tmux set-environment -t "$session_name" -u ANTHROPIC_FOUNDRY_API_KEY 2>/dev/null || true

tmux set-environment -t "$session_name" -u ANTHROPIC_DEFAULT_SONNET_MODEL 2>/dev/null || true

tmux set-environment -t "$session_name" -u ANTHROPIC_DEFAULT_HAIKU_MODEL 2>/dev/null || true

tmux set-environment -t "$session_name" -u ANTHROPIC_DEFAULT_OPUS_MODEL 2>/dev/null || true

fi

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@ralph_loop.sh` around lines 200 - 242, The function forward_tmux_env disables globbing with set -f but can exit early (e.g., if tmux set-environment fails) leaving globbing disabled; ensure set +f always runs by scoping noglob in a subshell or installing a short-lived trap: either wrap the pattern-processing loop in a subshell so set -f is local to that subshell, or add a trap (e.g., trap 'set +f' RETURN or EXIT) immediately before set -f and clear it after set +f; apply this to the forward_tmux_env function and ensure every tmux set-environment invocation (the tmux set-environment calls) either is protected with || true or is covered by the trap/subshell so set +f is restored on all exit paths.

coderabbitai · 2026-02-18T11:13:46Z

ralph_loop.sh

+# Wait for circuit breaker cooldown recovery instead of exiting
+wait_for_circuit_recovery() {
+    local loop_count=$1
+    local check_interval=60  # Check every 60 seconds
+
+    log_status "WARN" "Circuit breaker OPEN - waiting for cooldown recovery (${CB_COOLDOWN_MINUTES}m)..."
+    update_status "$loop_count" "$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")" "waiting_for_recovery" "paused" "circuit_breaker_cooldown"
+
+    while true; do
+        sleep "$check_interval"
+
+        # Re-init circuit breaker (handles OPEN→HALF_OPEN/CLOSED transitions)
+        init_circuit_breaker
+
+        local state
+        state=$(get_circuit_state)
+        if [[ "$state" != "$CB_STATE_OPEN" ]]; then
+            log_status "SUCCESS" "Circuit breaker recovered to $state - resuming loop"
+            return 0
+        fi
+
+        # Show remaining cooldown time
+        local opened_at
+        opened_at=$(jq -r '.opened_at // .last_change // ""' "$CB_STATE_FILE" 2>/dev/null)
+        if [[ -n "$opened_at" && "$opened_at" != "null" ]]; then
+            local opened_epoch elapsed_minutes remaining
+            opened_epoch=$(parse_iso_to_epoch "$opened_at")
+            elapsed_minutes=$(( ($(date +%s) - opened_epoch) / 60 ))
+            remaining=$((CB_COOLDOWN_MINUTES - elapsed_minutes))
+            if [[ $remaining -gt 0 ]]; then
+                log_status "INFO" "Cooldown: ${elapsed_minutes}m elapsed, ~${remaining}m remaining"
+            fi
+        fi
+    done
+}


⚠️ Potential issue | 🟠 Major

wait_for_circuit_recovery has no upper bound — a corrupted state file will loop forever.

If get_circuit_state always returns OPEN (e.g., corrupted state file, clock skew preventing cooldown expiry), this function never returns. Consider adding a maximum wait time or iteration count as a safety net.

Proposed safeguard

wait_for_circuit_recovery() { local loop_count=$1 local check_interval=60 # Check every 60 seconds + local max_wait_seconds=$(( (CB_COOLDOWN_MINUTES + 10) * 60 )) # cooldown + 10min buffer + local waited=0 log_status "WARN" "Circuit breaker OPEN - waiting for cooldown recovery (${CB_COOLDOWN_MINUTES}m)..." ... while true; do sleep "$check_interval" + waited=$((waited + check_interval)) + + if [[ $waited -ge $max_wait_seconds ]]; then + log_status "ERROR" "Recovery wait exceeded max time (${max_wait_seconds}s) - forcing circuit reset" + reset_circuit_breaker "recovery_timeout" + return 0 + fi # Re-init circuit breaker ...

As per coding guidelines, **/*.sh: "Use bash function comments documenting purpose, parameters, and return values for all functions" — this function is also missing parameter/return documentation.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@ralph_loop.sh` around lines 509 - 543, The wait_for_circuit_recovery function lacks an upper bound and documentation; update it to accept or use a MAX_WAIT_MINUTES (or MAX_ITERATIONS) constant and enforce a maximum loop duration/iteration count inside wait_for_circuit_recovery so that if get_circuit_state remains "$CB_STATE_OPEN" beyond the limit it logs an error, updates status (e.g., "circuit_breaker_failed" or "timeout"), and returns a non-zero value; also add a bash function comment block above wait_for_circuit_recovery describing purpose, parameters (loop_count, optional max wait), and return values (0 on recovered, non-zero on timeout/error), and reference functions/vars like init_circuit_breaker, get_circuit_state, CB_COOLDOWN_MINUTES, CB_STATE_FILE, parse_iso_to_epoch, and CALL_COUNT_FILE when implementing the timeout handling and status updates.

coderabbitai · 2026-02-18T11:13:46Z

ralph_loop.sh

+        if [[ -n "${ANTHROPIC_FOUNDRY_API_KEY:-}" ]]; then
+            export ANTHROPIC_FOUNDRY_API_KEY
+            log_status "INFO" "   API Key: ${ANTHROPIC_FOUNDRY_API_KEY:0:10}..."


⚠️ Potential issue | 🟠 Major

Logging partial API key is a security risk.

Line 1209 logs the first 10 characters of ANTHROPIC_FOUNDRY_API_KEY. Even a partial key in log files (which persist on disk at $LOG_DIR/ralph.log) increases exposure surface. Consider masking to just a few characters or logging only presence/absence.

Proposed fix

if [[ -n "${ANTHROPIC_FOUNDRY_API_KEY:-}" ]]; then export ANTHROPIC_FOUNDRY_API_KEY - log_status "INFO" " API Key: ${ANTHROPIC_FOUNDRY_API_KEY:0:10}..." + log_status "INFO" " API Key: ****${ANTHROPIC_FOUNDRY_API_KEY: -4}..." fi

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if [[ -n "${ANTHROPIC_FOUNDRY_API_KEY:-}" ]]; then

export ANTHROPIC_FOUNDRY_API_KEY

log_status "INFO" " API Key: ${ANTHROPIC_FOUNDRY_API_KEY:0:10}..."

if [[ -n "${ANTHROPIC_FOUNDRY_API_KEY:-}" ]]; then

export ANTHROPIC_FOUNDRY_API_KEY

log_status "INFO" " API Key: ****${ANTHROPIC_FOUNDRY_API_KEY: -4}..."

fi

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@ralph_loop.sh` around lines 1207 - 1209, The current code logs the first 10 characters of ANTHROPIC_FOUNDRY_API_KEY via the log_status call, exposing sensitive secret material; change the behavior in the block that checks ANTHROPIC_FOUNDRY_API_KEY so it does not emit partial keys to $LOG_DIR/ralph.log — instead log only presence/absence (e.g., "ANTHROPIC_FOUNDRY_API_KEY: set" or "not set") or, if you must show a hint, mask to a minimal safe slice such as a single character or last 4 chars; update the log_status invocation that references ANTHROPIC_FOUNDRY_API_KEY accordingly to avoid writing key material to logs.

coderabbitai · 2026-02-18T11:13:46Z

ralph_loop.sh

+        #
+        # Run in a subshell so cleanup() can kill the entire pipeline
+        # (timeout + stdbuf + claude + tee + jq) with a single process group
+        # kill on Ctrl+C. The subshell also avoids the errexit issue (Issue #175)
+        # where timeout's exit code 124 would silently kill the script under set -e.
+        (
+            set -o pipefail
+            portable_timeout ${timeout_seconds}s stdbuf -oL "${LIVE_CMD_ARGS[@]}" \
+                < /dev/null 2>&1 | stdbuf -oL tee "$output_file" | stdbuf -oL jq --unbuffered -j "$jq_filter" 2>/dev/null | tee "$LIVE_LOG_FILE"
+        ) &
+        CLAUDE_CHILD_PID=$!
+        wait $CLAUDE_CHILD_PID
+        exit_code=$?
+        CLAUDE_CHILD_PID=""


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Test 1: Verify bash set -e behavior with wait returning non-zero echo "=== Test 1: set -e + wait returning 124 ===" bash -c ' set -e ( exit 124 ) & pid=$! echo "STARTING" wait $pid echo "REACHED exit_code=$?" ' 2>&1 || echo "(script exited with code $?)" echo "" echo "=== Test 2: set -e + wait || exit_code=$? (proposed fix) ===" bash -c ' set -e exit_code=0 ( exit 124 ) & pid=$! echo "STARTING" wait $pid || exit_code=$? echo "REACHED exit_code=$exit_code" ' 2>&1 echo "" echo "=== Test 3: Testing inside a function with set -e ===" bash -c ' set -e test_func() { ( exit 124 ) & pid=$! echo "IN FUNC STARTING" wait $pid echo "IN FUNC REACHED" return 0 } test_func echo "CALLER REACHED" ' 2>&1 || echo "(function/script exited with code $?)"

Repository: frankbria/ralph-claude-code

Length of output: 356

🏁 Script executed:

# Now read the actual code context head -c 50000 ralph_loop.sh | tail -c 5000 | cat -n

Repository: frankbria/ralph-claude-code

Length of output: 5859

🏁 Script executed:

# Get broader context around lines 1319-1332 sed -n '1300,1360p' ralph_loop.sh | cat -n

Repository: frankbria/ralph-claude-code

Length of output: 3826

🏁 Script executed:

# Check how execute_claude_code is called and what the function returns sed -n '1770,1800p' ralph_loop.sh | cat -n

Repository: frankbria/ralph-claude-code

Length of output: 1834

🏁 Script executed:

# Check the start of execute_claude_code to see set -e status sed -n '1200,1250p' ralph_loop.sh | cat -n

Repository: frankbria/ralph-claude-code

Length of output: 2713

🏁 Script executed:

# Check line 6 to confirm set -e is enabled sed -n '1,20p' ralph_loop.sh | cat -n

Repository: frankbria/ralph-claude-code

Length of output: 984

wait $CLAUDE_CHILD_PID is subject to set -e — timeout (exit 124) terminates the script before timeout handling.

Under set -e (line 6), when wait $CLAUDE_CHILD_PID returns non-zero (timeout exit code 124), the script exits immediately at line 1331, never reaching exit_code=$? on line 1331. This bypasses all downstream timeout handling (line 1336), session extraction (line 1337+), and circuit breaker logic, defeating the fix for Issue #175.

Proposed fix — capture non-zero wait status without triggering errexit

) & CLAUDE_CHILD_PID=$! - wait $CLAUDE_CHILD_PID - exit_code=$? + wait $CLAUDE_CHILD_PID || exit_code=$? CLAUDE_CHILD_PID=""

With || exit_code=$?, the || operator ensures the compound command succeeds (preventing set -e exit), while capturing the actual exit code for proper handling downstream.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@ralph_loop.sh` around lines 1319 - 1332, The wait on CLAUDE_CHILD_PID can trigger set -e when the child exits non‑zero (e.g., timeout 124), so change the wait invocation to capture the child's exit status without causing an immediate script exit: run wait $CLAUDE_CHILD_PID with a fallthrough that assigns exit_code (e.g., use "wait $CLAUDE_CHILD_PID || exit_code=$?") so the script records the actual exit code for downstream timeout/session/circuit-breaker handling (references: CLAUDE_CHILD_PID, wait, exit_code, portable_timeout).

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (2)

ralph_loop.sh (1)

200-203: forward_tmux_env is missing a bash function documentation block.

Per coding guidelines, all *.sh functions must have a comment block documenting purpose, parameters, and return values. As per coding guidelines, "Use bash function comments documenting purpose, parameters, and return values for all functions".

📄 Suggested doc comment

+# forward_tmux_env - Forward configured environment variables into a tmux session.
+#
+# Reads RALPH_TMUX_ENV for a space-separated list of env var names or prefix
+# patterns (trailing '*'). Forwards matching variables via `tmux set-environment`.
+# When Foundry is disabled, actively unsets Foundry vars to prevent leakage
+# across concurrent sessions.
+#
+# Parameters:
+#   $1 - session_name: tmux session target name
+#
+# Returns: 0 always (individual tmux failures are suppressed with || true)
 forward_tmux_env() {

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 200 - 203, Add a bash function documentation
block above the forward_tmux_env function that states the function's purpose
(forward specified environment variables into a tmux session), parameters (reads
RALPH_TMUX_ENV containing exact names and prefix patterns like "ANTHROPIC_*"),
and return behavior/exit codes (e.g., returns 0 on success, non‑zero on error).
Ensure the comment follows the project's bash doc style (purpose, parameters,
returns) and briefly notes behavior for exact names vs. wildcard prefixes and
any side effects (modifies tmux-environment using tmux set-environment).

tests/unit/test_cli_modern.bats (1)

838-869: Missing unit tests for forward_tmux_env, wait_for_circuit_recovery, and CLAUDE_MODEL passthrough.

This PR introduces two new public functions (forward_tmux_env, wait_for_circuit_recovery) and adds --model injection to build_claude_command. None of these have corresponding unit tests. Additionally, the inline build_claude_command copy in the setup() block (lines ~451–499) omits the new --model branch, so the existing build_claude_command tests do not exercise the model-override path at all.

As per coding guidelines, "Add unit tests for all new bash script functions using bats test framework."

For forward_tmux_env, tmux can be mocked; for wait_for_circuit_recovery, sleep and get_circuit_state can be stubbed. At minimum, add a test such as:

`@test` "build_claude_command includes --model when CLAUDE_MODEL is set" {
    export CLAUDE_CODE_CMD="claude"
    export CLAUDE_OUTPUT_FORMAT="json"
    export CLAUDE_ALLOWED_TOOLS=""
    export CLAUDE_USE_CONTINUE="false"
    export CLAUDE_MODEL="claude-opus-4-6"

    echo "Test prompt" > "$PROMPT_FILE"
    # Call the real implementation (not the stale inline copy)
    source "${BATS_TEST_DIRNAME}/../../ralph_loop.sh" --help 2>/dev/null || true
    build_claude_command "$PROMPT_FILE" "" ""

    local cmd_string="${CLAUDE_CMD_ARGS[*]}"
    [[ "$cmd_string" == *"--model"* ]]
    [[ "$cmd_string" == *"claude-opus-4-6"* ]]
}

`@test` "build_claude_command omits --model when CLAUDE_MODEL is empty" {
    export CLAUDE_MODEL=""
    echo "Test prompt" > "$PROMPT_FILE"
    build_claude_command "$PROMPT_FILE" "" ""

    local cmd_string="${CLAUDE_CMD_ARGS[*]}"
    [[ "$cmd_string" != *"--model"* ]]
}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_cli_modern.bats` around lines 838 - 869, Add bats unit tests
covering the new public functions and the CLAUDE_MODEL passthrough: write tests
for forward_tmux_env (mock tmux environment variables and assert they are
forwarded), for wait_for_circuit_recovery (stub sleep and get_circuit_state to
simulate transient states and assert correct retry/exit behavior), and for
build_claude_command to assert it includes "--model <value>" when CLAUDE_MODEL
is set and omits it when empty; also update the stale inline copy of
build_claude_command in the setup() block so it either sources/uses the real
build_claude_command implementation (rather than the outdated copy) or add the
missing --model branch so tests exercise the model-override path.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ralph_loop.sh`:
- Around line 1449-1451: In the background execution path where we call wait on
the Claude PID (wait $claude_pid), prevent set -e from aborting before we record
the child's exit status: temporarily disable error-exit or absorb wait's failure
so we can assign exit_code=$? after wait (e.g., run wait with an || true or wrap
in set +e / set -e), then use that exit_code to detect portable_timeout (124)
and run the same timeout handling, circuit breaker and session extraction logic;
update the block that currently contains wait $claude_pid and
CLAUDE_CHILD_PID="" to ensure wait's exit code is captured into exit_code
regardless of set -e.
- Around line 1463-1472: The grep pattern uses GNU-only BRE alternation (\|) and
fails on BSD/macOS; update the authentication detection in the block that checks
"$output_file" by replacing the grep call in that if-statement (the line that
currently uses grep -q '"usage"\|"billing"\|"credits_remaining"') with a
portable ERE invocation (e.g. grep -E or egrep) or equivalent multiple -e
patterns, preserve the quiet flag and redirection (2>/dev/null), so the
condition correctly matches "usage", "billing" or "credits_remaining" across
platforms; keep the surrounding logic referencing output_file,
CLAUDE_CODE_USE_FOUNDRY and log_status unchanged.
- Line 1459: The echo that writes to PROGRESS_FILE uses an unquoted command
substitution which can be split on the space in the timestamp; update the write
so the timestamp substitution is quoted (e.g. use "$(date '+%Y-%m-%d
%H:%M:%S')") or switch to printf to build the JSON safely before redirecting to
"$PROGRESS_FILE" — target the echo line that writes to PROGRESS_FILE in
ralph_loop.sh and ensure the substitution is quoted to prevent word splitting.

In `@tests/unit/test_cli_modern.bats`:
- Around line 838-854: In the test function test "live mode pipeline runs in
subshell to isolate errexit (Issue `#175`)" replace GNU-only '\s' usage in the
grep that checks for an indented opening parenthesis so it becomes
POSIX-portable: update the grep against the live_block (the line echo
"$live_block" | grep -q '^\s*(') to use [[:space:]] (e.g. '^[[:space:]]*(') so
macOS/BSD grep behaves correctly; leave the other checks (') &' and
'CLAUDE_CHILD_PID=\$!') unchanged.
- Around line 856-869: The test "live mode pipeline captures exit code via wait"
currently checks for separate lines matching wait \$CLAUDE_CHILD_PID and
exit_code=\$? which allows the buggy two-line form; change the assertions to
require the compound form using the OR operator so the wait and exit capture
appear together (e.g. assert that the live_block contains a pattern where wait
\$CLAUDE_CHILD_PID and \|\| and exit_code=\$? occur in the same command); keep
the existing check that CLAUDE_CHILD_PID is cleared (CLAUDE_CHILD_PID="") and
update the grep/regex used to match the live_block accordingly so only the safe
"wait ... || exit_code=$?" form passes.

---

Duplicate comments:
In `@ralph_loop.sh`:
- Around line 209-231: The loop that sets tmux environment vars disables
globbing with set -f but can exit early if tmux set-environment fails (under set
-e), leaving globbing disabled; update the block that calls tmux set-environment
(both the prefix branch and the exact-match branch referencing variables
patterns and session_name and using tmux set-environment) to either append ||
true to the tmux commands or ensure set +f always runs (e.g., via a
trap/finally), so failures in tmux set-environment do not prevent restoring
globbing.
- Around line 1205-1209: The script is leaking a partial
ANTHROPIC_FOUNDRY_API_KEY value to logs via the log_status call; update the
block that checks ANTHROPIC_FOUNDRY_API_KEY so it does not print the key
contents (in ralph_loop.sh where ANTHROPIC_FOUNDRY_API_KEY and log_status are
used). Replace the current log_status that prints
${ANTHROPIC_FOUNDRY_API_KEY:0:10} with a non-secret message (e.g., "API key set"
or a fully masked representation like "*******" or only show a fixed-length
masked suffix/prefix) so no actual key material is emitted to logs.
- Around line 507-541: Add a timeout guard and docstring to
wait_for_circuit_recovery: document the function and its parameters, then add an
optional max_wait_seconds parameter (or default like CB_COOLDOWN_MINUTES*60*2)
and track elapsed time inside the while loop; after each sleep/update/re-init
(calls to init_circuit_breaker and get_circuit_state) check elapsed against
max_wait_seconds and if exceeded log_status an ERROR, update_status to a timeout
state, and return non-zero; keep existing cooldown logging (opened_at parsing
via parse_iso_to_epoch and CB_STATE_FILE) but ensure the loop breaks on timeout
to avoid infinite hang.
- Around line 50-51: The comment and runtime check for the environment flag
CLAUDE_CODE_USE_FOUNDRY are inconsistent (comment says set to "true" but the
script checks for "1"); make them consistent by either updating the script's
runtime conditional to accept "true" (e.g., case-insensitive check for
"true"/"1"/"yes") or changing the comment to instruct setting
CLAUDE_CODE_USE_FOUNDRY="1"; update the CLAUDE_CODE_USE_FOUNDRY check and the
comment in ralph_loop.sh (and any related README/.ralphrc note) so both use the
same canonical value and document accepted truthy forms.
- Around line 1322-1330: The wait invocation is still vulnerable to set -e:
change the line using wait on CLAUDE_CHILD_PID so the script captures the
child's exit status even if wait returns non-zero (e.g., timeout). Replace the
bare "wait $CLAUDE_CHILD_PID" with a guarded form such as "wait
$CLAUDE_CHILD_PID || exit_code=$?" (or an equivalent if-block) to ensure
exit_code is assigned when wait fails, keeping the existing CLAUDE_CHILD_PID and
exit_code variables intact.

---

Nitpick comments:
In `@ralph_loop.sh`:
- Around line 200-203: Add a bash function documentation block above the
forward_tmux_env function that states the function's purpose (forward specified
environment variables into a tmux session), parameters (reads RALPH_TMUX_ENV
containing exact names and prefix patterns like "ANTHROPIC_*"), and return
behavior/exit codes (e.g., returns 0 on success, non‑zero on error). Ensure the
comment follows the project's bash doc style (purpose, parameters, returns) and
briefly notes behavior for exact names vs. wildcard prefixes and any side
effects (modifies tmux-environment using tmux set-environment).

In `@tests/unit/test_cli_modern.bats`:
- Around line 838-869: Add bats unit tests covering the new public functions and
the CLAUDE_MODEL passthrough: write tests for forward_tmux_env (mock tmux
environment variables and assert they are forwarded), for
wait_for_circuit_recovery (stub sleep and get_circuit_state to simulate
transient states and assert correct retry/exit behavior), and for
build_claude_command to assert it includes "--model <value>" when CLAUDE_MODEL
is set and omits it when empty; also update the stale inline copy of
build_claude_command in the setup() block so it either sources/uses the real
build_claude_command implementation (rather than the outdated copy) or add the
missing --model branch so tests exercise the model-override path.

ralph_loop.sh

coderabbitai · 2026-02-18T12:42:03Z

ralph_loop.sh

        echo "$calls_made" > "$CALL_COUNT_FILE"

        # Clear progress file
        echo '{"status": "completed", "timestamp": "'$(date '+%Y-%m-%d %H:%M:%S')'"}' > "$PROGRESS_FILE"


⚠️ Potential issue | 🟡 Minor

SC2046: unquoted $(date) in echo — progress file write.

Shellcheck warns that the command substitution $(date '+%Y-%m-%d %H:%M:%S') is unquoted; word splitting on the space in the timestamp can produce two echo arguments. While echo concatenates them with a space and the JSON output is still valid today, it is brittle and flagged by the static analysis tool.

🔧 Proposed fix

- echo '{"status": "completed", "timestamp": "'$(date '+%Y-%m-%d %H:%M:%S')'"}' > "$PROGRESS_FILE" + echo '{"status": "completed", "timestamp": "'"$(date '+%Y-%m-%d %H:%M:%S')"'"}' > "$PROGRESS_FILE"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

echo '{"status": "completed", "timestamp": "'$(date '+%Y-%m-%d %H:%M:%S')'"}' > "$PROGRESS_FILE"

echo '{"status": "completed", "timestamp": "'"$(date '+%Y-%m-%d %H:%M:%S')"'"}' > "$PROGRESS_FILE"

🧰 Tools

🪛 Shellcheck (0.11.0)

[warning] 1459-1459: Quote this to prevent word splitting.

(SC2046)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@ralph_loop.sh` at line 1459, The echo that writes to PROGRESS_FILE uses an unquoted command substitution which can be split on the space in the timestamp; update the write so the timestamp substitution is quoted (e.g. use "$(date '+%Y-%m-%d %H:%M:%S')") or switch to printf to build the JSON safely before redirecting to "$PROGRESS_FILE" — target the echo line that writes to PROGRESS_FILE in ralph_loop.sh and ensure the substitution is quoted to prevent word splitting.

coderabbitai · 2026-02-18T12:42:04Z

ralph_loop.sh

+        # Verify actual authentication method used by checking response metadata
+        if [[ -f "$output_file" ]]; then
+            # Check for usage/billing fields (indicates personal account)
+            if grep -q '"usage"\|"billing"\|"credits_remaining"' "$output_file" 2>/dev/null; then
+                log_status "INFO" "🔍 Detected: Personal Account (usage/billing info in response)"
+            elif [[ "${CLAUDE_CODE_USE_FOUNDRY:-0}" == "1" ]]; then
+                # Foundry responses typically don't have usage fields
+                log_status "INFO" "🔍 Detected: Azure Foundry (no usage info in response)"
+            fi
+        fi


⚠️ Potential issue | 🟡 Minor

grep -q '"usage"\|...' uses GNU BRE alternation — not portable to macOS/BSD grep.

\| as alternation is a GNU grep extension to BRE. BSD grep (macOS default) treats \| as a literal two-character sequence, so the pattern never matches on macOS. The authentication-detection logging would silently be skipped on non-Linux hosts.

🔧 Portable fix using ERE

- if grep -q '"usage"\|"billing"\|"credits_remaining"' "$output_file" 2>/dev/null; then + if grep -qE '"usage"|"billing"|"credits_remaining"' "$output_file" 2>/dev/null; then

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Verify actual authentication method used by checking response metadata

if [[ -f "$output_file" ]]; then

# Check for usage/billing fields (indicates personal account)

if grep -q '"usage"\|"billing"\|"credits_remaining"' "$output_file" 2>/dev/null; then

log_status "INFO" "🔍 Detected: Personal Account (usage/billing info in response)"

elif [[ "${CLAUDE_CODE_USE_FOUNDRY:-0}" == "1" ]]; then

# Foundry responses typically don't have usage fields

log_status "INFO" "🔍 Detected: Azure Foundry (no usage info in response)"

fi

fi

# Verify actual authentication method used by checking response metadata

if [[ -f "$output_file" ]]; then

# Check for usage/billing fields (indicates personal account)

if grep -qE '"usage"|"billing"|"credits_remaining"' "$output_file" 2>/dev/null; then

log_status "INFO" "🔍 Detected: Personal Account (usage/billing info in response)"

elif [[ "${CLAUDE_CODE_USE_FOUNDRY:-0}" == "1" ]]; then

# Foundry responses typically don't have usage fields

log_status "INFO" "🔍 Detected: Azure Foundry (no usage info in response)"

fi

fi

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@ralph_loop.sh` around lines 1463 - 1472, The grep pattern uses GNU-only BRE alternation (\|) and fails on BSD/macOS; update the authentication detection in the block that checks "$output_file" by replacing the grep call in that if-statement (the line that currently uses grep -q '"usage"\|"billing"\|"credits_remaining"') with a portable ERE invocation (e.g. grep -E or egrep) or equivalent multiple -e patterns, preserve the quiet flag and redirection (2>/dev/null), so the condition correctly matches "usage", "billing" or "credits_remaining" across platforms; keep the surrounding logic referencing output_file, CLAUDE_CODE_USE_FOUNDRY and log_status unchanged.

coderabbitai · 2026-02-18T12:42:04Z

tests/unit/test_cli_modern.bats

+@test "live mode pipeline runs in subshell to isolate errexit (Issue #175)" {
+    # The live-mode pipeline runs in a backgrounded subshell so that
+    # timeout's exit code 124 never triggers set -e in the parent.
+    # This also enables Ctrl+C cleanup via CLAUDE_CHILD_PID.
    local script="${BATS_TEST_DIRNAME}/../../ralph_loop.sh"

    # Extract only the live-mode section (from "Live output mode enabled" to "End of Output")
    local live_block
    live_block=$(sed -n '/Live output mode enabled/,/End of Output/p' "$script")

-    # set +e must appear before set -o pipefail within the live-mode block
-    echo "$live_block" | grep -q 'set +e'
-    echo "$live_block" | grep -q 'set -o pipefail'
+    # Pipeline must be inside a subshell (parentheses) that is backgrounded
+    echo "$live_block" | grep -q '^\s*('
+    echo "$live_block" | grep -q ') &'

-    # Verify ordering: set +e comes first
-    local plus_e_line=$(echo "$live_block" | grep -n 'set +e' | head -1 | cut -d: -f1)
-    local pipefail_line=$(echo "$live_block" | grep -n 'set -o pipefail' | head -1 | cut -d: -f1)
-
-    [[ -n "$plus_e_line" ]]
-    [[ -n "$pipefail_line" ]]
-    [[ $plus_e_line -lt $pipefail_line ]]
+    # CLAUDE_CHILD_PID must be captured for cleanup
+    echo "$live_block" | grep -q 'CLAUDE_CHILD_PID=\$!'
 }


⚠️ Potential issue | 🟡 Minor

grep -q '^\s*(' uses \s — not portable to BSD grep (macOS).

\s in BRE is a GNU grep extension; BSD grep treats it as a literal two-character sequence. The test could silently pass for the wrong reason on macOS CI runners. Use POSIX [[:space:]] instead.

🔧 Portable fix

- echo "$live_block" | grep -q '^\s*(' + echo "$live_block" | grep -q '^[[:space:]]*('

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@test "live mode pipeline runs in subshell to isolate errexit (Issue #175)" {

# The live-mode pipeline runs in a backgrounded subshell so that

# timeout's exit code 124 never triggers set -e in the parent.

# This also enables Ctrl+C cleanup via CLAUDE_CHILD_PID.

local script="${BATS_TEST_DIRNAME}/../../ralph_loop.sh"

# Extract only the live-mode section (from "Live output mode enabled" to "End of Output")

local live_block

live_block=$(sed -n '/Live output mode enabled/,/End of Output/p' "$script")

# set +e must appear before set -o pipefail within the live-mode block

echo "$live_block" | grep -q 'set +e'

echo "$live_block" | grep -q 'set -o pipefail'

# Pipeline must be inside a subshell (parentheses) that is backgrounded

echo "$live_block" | grep -q '^\s*('

echo "$live_block" | grep -q ') &'

# Verify ordering: set +e comes first

local plus_e_line=$(echo "$live_block" | grep -n 'set +e' | head -1 | cut -d: -f1)

local pipefail_line=$(echo "$live_block" | grep -n 'set -o pipefail' | head -1 | cut -d: -f1)

[[ -n "$plus_e_line" ]]

[[ -n "$pipefail_line" ]]

[[ $plus_e_line -lt $pipefail_line ]]

# CLAUDE_CHILD_PID must be captured for cleanup

echo "$live_block" | grep -q 'CLAUDE_CHILD_PID=\$!'

}

`@test` "live mode pipeline runs in subshell to isolate errexit (Issue `#175`)" {

# The live-mode pipeline runs in a backgrounded subshell so that

# timeout's exit code 124 never triggers set -e in the parent.

# This also enables Ctrl+C cleanup via CLAUDE_CHILD_PID.

local script="${BATS_TEST_DIRNAME}/../../ralph_loop.sh"

# Extract only the live-mode section (from "Live output mode enabled" to "End of Output")

local live_block

live_block=$(sed -n '/Live output mode enabled/,/End of Output/p' "$script")

# Pipeline must be inside a subshell (parentheses) that is backgrounded

echo "$live_block" | grep -q '^[[:space:]]*('

echo "$live_block" | grep -q ') &'

# CLAUDE_CHILD_PID must be captured for cleanup

echo "$live_block" | grep -q 'CLAUDE_CHILD_PID=\$!'

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/test_cli_modern.bats` around lines 838 - 854, In the test function test "live mode pipeline runs in subshell to isolate errexit (Issue `#175`)" replace GNU-only '\s' usage in the grep that checks for an indented opening parenthesis so it becomes POSIX-portable: update the grep against the live_block (the line echo "$live_block" | grep -q '^\s*(') to use [[:space:]] (e.g. '^[[:space:]]*(') so macOS/BSD grep behaves correctly; leave the other checks (') &' and 'CLAUDE_CHILD_PID=\$!') unchanged.

coderabbitai · 2026-02-18T12:42:04Z

tests/unit/test_cli_modern.bats

+@test "live mode pipeline captures exit code via wait" {
+    # After the backgrounded subshell, exit code is captured via wait
    local script="${BATS_TEST_DIRNAME}/../../ralph_loop.sh"

-    # Extract only the live-mode section
    local live_block
    live_block=$(sed -n '/Live output mode enabled/,/End of Output/p' "$script")

-    # set +o pipefail and set -e must both exist in the live block
-    echo "$live_block" | grep -q 'set +o pipefail'
-    echo "$live_block" | grep -q 'set -e'
-
-    # Verify ordering: set -e comes after set +o pipefail
-    local pipefail_off_line=$(echo "$live_block" | grep -n 'set +o pipefail' | head -1 | cut -d: -f1)
-    local re_enable_line=$(echo "$live_block" | grep -n '^\s*set -e' | awk -F: -v threshold="$pipefail_off_line" '$1 > threshold {print $1; exit}')
+    # wait and exit_code capture must exist
+    echo "$live_block" | grep -q 'wait \$CLAUDE_CHILD_PID'
+    echo "$live_block" | grep -q 'exit_code=\$?'

-    [[ -n "$pipefail_off_line" ]]
-    [[ -n "$re_enable_line" ]]
-    [[ $re_enable_line -gt $pipefail_off_line ]]
+    # CLAUDE_CHILD_PID must be cleared after wait
+    echo "$live_block" | grep -q 'CLAUDE_CHILD_PID=""'
 }


⚠️ Potential issue | 🟠 Major

Test validates text presence but not the || operator — would pass with the unfixed, buggy form.

The test greps for wait \$CLAUDE_CHILD_PID and exit_code=\$? as two independent patterns. Both patterns match the current (unfixed) code that has:

wait $CLAUDE_CHILD_PID exit_code=$?

on separate lines — the very form that triggers set -e on timeout (exit 124). The test should instead assert that wait and the exit-code capture appear on the same compound command using ||, which is the only form that is immune to set -e. As written, the test provides a false sense of safety and won't catch a regression that reintroduces the bug.

🔧 Tighten the assertion to require the `||` form

- # wait and exit_code capture must exist - echo "$live_block" | grep -q 'wait \$CLAUDE_CHILD_PID' - echo "$live_block" | grep -q 'exit_code=\$?' + # wait and exit_code capture must be a single compound command (|| prevents set -e + # from killing the script when Claude times out with exit code 124 — Issue `#175`) + echo "$live_block" | grep -q 'wait \$CLAUDE_CHILD_PID || exit_code=\$?'

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@test "live mode pipeline captures exit code via wait" {

# After the backgrounded subshell, exit code is captured via wait

local script="${BATS_TEST_DIRNAME}/../../ralph_loop.sh"

# Extract only the live-mode section

local live_block

live_block=$(sed -n '/Live output mode enabled/,/End of Output/p' "$script")

# set +o pipefail and set -e must both exist in the live block

echo "$live_block" | grep -q 'set +o pipefail'

echo "$live_block" | grep -q 'set -e'

# Verify ordering: set -e comes after set +o pipefail

local pipefail_off_line=$(echo "$live_block" | grep -n 'set +o pipefail' | head -1 | cut -d: -f1)

local re_enable_line=$(echo "$live_block" | grep -n '^\s*set -e' | awk -F: -v threshold="$pipefail_off_line" '$1 > threshold {print $1; exit}')

# wait and exit_code capture must exist

echo "$live_block" | grep -q 'wait \$CLAUDE_CHILD_PID'

echo "$live_block" | grep -q 'exit_code=\$?'

[[ -n "$pipefail_off_line" ]]

[[ -n "$re_enable_line" ]]

[[ $re_enable_line -gt $pipefail_off_line ]]

# CLAUDE_CHILD_PID must be cleared after wait

echo "$live_block" | grep -q 'CLAUDE_CHILD_PID=""'

}

`@test` "live mode pipeline captures exit code via wait" {

# After the backgrounded subshell, exit code is captured via wait

local script="${BATS_TEST_DIRNAME}/../../ralph_loop.sh"

local live_block

live_block=$(sed -n '/Live output mode enabled/,/End of Output/p' "$script")

# wait and exit_code capture must be a single compound command (|| prevents set -e

# from killing the script when Claude times out with exit code 124 — Issue `#175`)

echo "$live_block" | grep -q 'wait \$CLAUDE_CHILD_PID || exit_code=\$?'

# CLAUDE_CHILD_PID must be cleared after wait

echo "$live_block" | grep -q 'CLAUDE_CHILD_PID=""'

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/test_cli_modern.bats` around lines 856 - 869, The test "live mode pipeline captures exit code via wait" currently checks for separate lines matching wait \$CLAUDE_CHILD_PID and exit_code=\$? which allows the buggy two-line form; change the assertions to require the compound form using the OR operator so the wait and exit capture appear together (e.g. assert that the live_block contains a pattern where wait \$CLAUDE_CHILD_PID and \|\| and exit_code=\$? occur in the same command); keep the existing check that CLAUDE_CHILD_PID is cleared (CLAUDE_CHILD_PID="") and update the grep/regex used to match the live_block accordingly so only the safe "wait ... || exit_code=$?" form passes.

…rl+C shutdown - Prevent Azure Foundry credentials from leaking across concurrent tmux sessions by unsetting ANTHROPIC_FOUNDRY_* vars at session level when CLAUDE_CODE_USE_FOUNDRY != 1, and unsetting them from process env in the personal account code path - Add CLAUDE_MODEL support in .ralphrc, passed as --model to Claude CLI - Fix Ctrl+C in --live mode: track child PIDs and kill the entire process tree in cleanup() to prevent orphaned Claude processes - Add forward_tmux_env() for configurable env var forwarding into tmux - Add wait_for_circuit_recovery() so OPEN circuit waits for cooldown instead of halting the loop permanently - Write call count at execution start for accurate monitor display Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/unit/test_cli_modern.bats (1)
893-902: ⚠️ Potential issue | 🟠 Major

Add unit tests for new functions forward_tmux_env and wait_for_circuit_recovery.

These functions introduced in ralph_loop.sh lack unit tests across the entire test suite. forward_tmux_env (line 206) handles prefix-pattern matching, exact environment variable matching, and conditional Foundry variable unsetting. wait_for_circuit_recovery (line 515) implements cooldown monitoring with state-transition logic. Both are testable with bats. Per coding guidelines: "Add unit tests for all new bash script functions using bats test framework."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_cli_modern.bats` around lines 893 - 902, Add bats unit tests
for the two new bash functions: for forward_tmux_env, write tests that cover
prefix-pattern matching (variables like FOO_BAR matching FOO_), exact name
matching, and the conditional unsetting behavior for Foundry-related vars
(ensure variables are forwarded or removed as per the function logic in
forward_tmux_env); for wait_for_circuit_recovery, add tests that simulate and
assert cooldown monitoring and state-transition logic (e.g., transitions from
open to closed with cooldowns, proper sleep/backoff behavior, and timeout
handling) ensuring both normal and edge cases are covered; implement these as
bats test cases in the existing test suite following its conventions so they run
with the test runner.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@tests/unit/test_cli_modern.bats`:
- Around line 893-902: Add bats unit tests for the two new bash functions: for
forward_tmux_env, write tests that cover prefix-pattern matching (variables like
FOO_BAR matching FOO_), exact name matching, and the conditional unsetting
behavior for Foundry-related vars (ensure variables are forwarded or removed as
per the function logic in forward_tmux_env); for wait_for_circuit_recovery, add
tests that simulate and assert cooldown monitoring and state-transition logic
(e.g., transitions from open to closed with cooldowns, proper sleep/backoff
behavior, and timeout handling) ensuring both normal and edge cases are covered;
implement these as bats test cases in the existing test suite following its
conventions so they run with the test runner.

---

Duplicate comments:
In `@ralph_loop.sh`:
- Around line 514-548: The wait_for_circuit_recovery function can loop forever
if get_circuit_state keeps returning OPEN; add a safety upper bound by computing
a max_wait_seconds (e.g., (CB_COOLDOWN_MINUTES + 10) * 60) and track
total_sleeped or start_time to break the loop and return non-zero after
exceeding the max; keep calling init_circuit_breaker and get_circuit_state as
now, but when timing out log an ERROR and update_status accordingly before
returning 1; also add a full shell-style comment block above
wait_for_circuit_recovery documenting purpose, parameters (loop_count), and
return values (0 on recovered, non-zero on timeout/error); use existing symbols
like get_circuit_state, init_circuit_breaker, CB_COOLDOWN_MINUTES,
CB_STATE_FILE, parse_iso_to_epoch to locate and implement the changes.
- Around line 203-245: In forward_tmux_env, ensure shell globbing (set -f) is
always restored even if a command exits early: either wrap the for/while loop
that runs under set -f in a subshell (so set -f is local to that subshell) or
make the tmux set-environment invocations resilient by appending || true to each
tmux set-environment call (both in the prefix loop and the exact-match branch)
so an error won’t cause premature exit before the set +f; reference set -f, set
+f, tmux set-environment, and the forward_tmux_env function when making the
change.
- Around line 1464-1466: When waiting for the background Claude process, avoid
allowing set -e to abort the script on nonzero wait status: replace the
unguarded "wait $claude_pid" with a guarded form like "wait $claude_pid || true"
so the script can then reliably capture the exit code into "exit_code=$?" and
still clear "CLAUDE_CHILD_PID=\"\""; update the block that references claude_pid
and CLAUDE_CHILD_PID accordingly.
- Around line 1212-1249: The log currently prints the first 10 characters of
ANTHROPIC_FOUNDRY_API_KEY (via the log_status call), which risks leaking
secrets; change the log in the Foundry branch to avoid revealing key material —
instead log presence only (e.g., "API Key: set") or a masked form showing only
the last 4 characters (e.g., "API Key: ****1234"); update the log_status
invocation that references ANTHROPIC_FOUNDRY_API_KEY so it does not include the
first-10-chars expansion and ensure any masking logic uses only read-only
reporting (no key reconstruction) and is applied wherever
ANTHROPIC_FOUNDRY_API_KEY is logged.
- Around line 1337-1345: The wait on CLAUDE_CHILD_PID must be protected from set
-e so a non‑zero child exit (e.g. timeout 124) doesn't abort before capturing
exit_code; modify the block around wait/exit_code to temporarily disable errexit
(e.g. set +e; wait "$CLAUDE_CHILD_PID"; exit_code=$?; set -e) or use wait
"$CLAUDE_CHILD_PID" || true then capture exit_code=$? immediately after, and
ensure CLAUDE_CHILD_PID is referenced/quoted consistently; keep the subshell
pipeline unchanged so PIPESTATUS inside it still reports per-command failures.

In `@tests/unit/test_cli_modern.bats`:
- Around line 838-854: The test uses '\s' in the grep regex which is
non‑portable; update the grep that checks for the opening subshell to use a
POSIX portable character class, e.g. change the pattern "^\s*(" to
"^[[:space:]]*(" (keep the other greps unchanged), referencing the live_block
variable and the grep calls that currently assert the subshell start and
background pattern.
- Around line 856-869: The test currently checks for 'wait $CLAUDE_CHILD_PID'
and 'exit_code=$?' as separate patterns which lets the unsafe two-line form
pass; update the assertions to require the safe combined form by matching the
literal sequence "wait $CLAUDE_CHILD_PID || exit_code=$?" (or otherwise verify
those tokens appear on the same line) and keep the existing check for clearing
CLAUDE_CHILD_PID (CLAUDE_CHILD_PID="") so the test ensures the wait is safe
under set -e and that the pid is cleared afterward.

macroscopeapp bot reviewed Feb 18, 2026

View reviewed changes

tudoanh force-pushed the fix/foundry-isolation-model-override-ctrlc branch from e85290f to 3a20578 Compare February 18, 2026 11:06

coderabbitai bot reviewed Feb 18, 2026

View reviewed changes

tudoanh force-pushed the fix/foundry-isolation-model-override-ctrlc branch from 3a20578 to c60b5a9 Compare February 18, 2026 12:31

coderabbitai bot reviewed Feb 18, 2026

View reviewed changes

tudoanh force-pushed the fix/foundry-isolation-model-override-ctrlc branch from c60b5a9 to a0cb953 Compare February 18, 2026 16:45

coderabbitai bot reviewed Feb 18, 2026

View reviewed changes

tudoanh closed this Feb 18, 2026


		[[ -z "$patterns" ]] && return 0

		for pattern in $patterns; do

	echo '{"status": "completed", "timestamp": "'$(date '+%Y-%m-%d %H:%M:%S')'"}' > "$PROGRESS_FILE"
	echo '{"status": "completed", "timestamp": "'"$(date '+%Y-%m-%d %H:%M:%S')"'"}' > "$PROGRESS_FILE"

Conversation

tudoanh commented Feb 18, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Context

Changes

Test plan

Summary by CodeRabbit

Uh oh!

macroscopeapp bot commented Feb 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Add Foundry credential isolation, model override, Chrome flag, tmux env forwarding, circuit-breaker recovery wait, and reliable Ctrl+C shutdown to ralph_loop.sh

📍Where to Start

Uh oh!

macroscopeapp bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

macroscopeapp bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

macroscopeapp bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot commented Feb 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tudoanh commented Feb 18, 2026 •

edited by coderabbitai bot

Loading

macroscopeapp bot commented Feb 18, 2026 •

edited

Loading

Add Foundry credential isolation, model override, Chrome flag, tmux env forwarding, circuit-breaker recovery wait, and reliable Ctrl+C shutdown to `ralph_loop.sh`

coderabbitai bot commented Feb 18, 2026 •

edited

Loading