fix(#190): SESSION_CONTINUITY=false bugs + startup version check by timothy-20 · Pull Request #202 · frankbria/ralph-claude-code

timothy-20 · 2026-02-27T10:33:42Z

Summary

Fixes two bugs that occur when SESSION_CONTINUITY=false in .ralphrc, plus adds startup version check/auto-update.

Bug 1: Loop context not built in non-continue mode

build_loop_context() was gated inside CLAUDE_USE_CONTINUE=true block
Fix: Move context building outside the conditional so loop number, remaining tasks, and circuit breaker state are always injected
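
A minimal sketch of the shape of this fix, assuming a simplified argument builder. `build_claude_args` and `init_session_id` are hypothetical names used only for illustration, and the `build_loop_context()` body is a stub rather than the implementation in ralph_loop.sh. Only `CLAUDE_USE_CONTINUE` and `--append-system-prompt` come from this PR.

```bash
#!/usr/bin/env bash
# Stub: the real build_loop_context() also reports remaining tasks,
# circuit breaker state, and the previous summary.
build_loop_context() {
    echo "Loop #$1."
}

# Hypothetical placeholder for whatever session setup the loop performs.
init_session_id() {
    echo "session-initialized"
}

# Hypothetical helper: print one CLI argument per line for a loop iteration.
build_claude_args() {
    local loop_count=$1
    local -a args=()

    # Session handling stays gated behind CLAUDE_USE_CONTINUE.
    if [[ "${CLAUDE_USE_CONTINUE:-true}" == "true" ]]; then
        init_session_id >/dev/null
    fi

    # Context is built on every iteration, outside the conditional.
    local context
    context=$(build_loop_context "$loop_count")
    if [[ -n "$context" ]]; then
        args+=("--append-system-prompt" "$context")
    fi

    printf '%s\n' "${args[@]}"
}

CLAUDE_USE_CONTINUE=false build_claude_args 3
```

With `CLAUDE_USE_CONTINUE=false` the session step is skipped, yet the context argument is still emitted, which is the behavior Bug 1 was missing.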

Bug 2: Question detection missing in headless mode

Claude asking questions (instead of acting) caused has_progress=false → circuit breaker OPEN after 3 loops
Fix: Add QUESTION_PATTERNS array + detect_questions() function, suppress no-progress counter on question loops, inject corrective guidance in next iteration
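
For illustration, question detection along these lines could look like the sketch below. The pattern list is hypothetical (the PR's actual `QUESTION_PATTERNS` entries are not shown in this thread), and the "print a count, return 1 when nothing matches" contract is inferred from the later review comments.

```bash
#!/usr/bin/env bash
# Hypothetical patterns; the real QUESTION_PATTERNS array may differ.
QUESTION_PATTERNS=("should I" "would you like" "please confirm" "which option")

# Print the sum of per-pattern matching-line counts.
# Returns 1 when no pattern matches at all.
detect_questions() {
    local content=$1
    local count=0 pattern matches
    for pattern in "${QUESTION_PATTERNS[@]}"; do
        # grep -c prints 0 and exits 1 on no match, so ignore its status.
        matches=$(printf '%s\n' "$content" | grep -ciF -- "$pattern" || true)
        count=$((count + ${matches:-0}))
    done
    echo "$count"
    [[ $count -gt 0 ]]
}

detect_questions "Should I use Postgres or SQLite?" || true
detect_questions "Implemented the parser and added tests." || true
```

The printed count would feed `question_count`, and a non-zero count would set `asking_questions=true` in the analysis output.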

Startup version check (related)

Add check_claude_version() and check_claude_updates() at startup
Proper semver comparison with compare_semver() (major→minor→patch)
Configurable CLAUDE_AUTO_UPDATE (default true, set false for Docker/air-gapped)
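
A sketch of a sequential major→minor→patch comparison, assuming plain numeric `X.Y.Z` inputs. The `-1` / `0` / `1` output convention is an assumption made for this example; the PR's `compare_semver()` may report its result differently.

```bash
#!/usr/bin/env bash
# Compare two X.Y.Z versions field by field.
# Prints -1 if $1 < $2, 0 if equal, 1 if $1 > $2 (assumed convention).
compare_semver() {
    local -a a b
    IFS=. read -r -a a <<< "$1"
    IFS=. read -r -a b <<< "$2"
    local i x y
    for i in 0 1 2; do
        x=${a[i]:-0}
        y=${b[i]:-0}
        # 10# forces base 10 so a field like "08" is not parsed as octal.
        if (( 10#$x < 10#$y )); then echo -1; return 0; fi
        if (( 10#$x > 10#$y )); then echo 1; return 0; fi
    done
    echo 0
}

compare_semver 1.0.150 1.1.0   # prints -1
compare_semver 2.1.0 2.1.0     # prints 0
```

Pre-release suffixes such as `1.2.3-beta` are not handled here and would need to be stripped first.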

Test cleanup

Remove 413 lines of structural/duplicate tests, consolidate coverage

Note on Bug 3 (stderr separation)

Bug 3 (stderr corrupting live output JSON stream) was originally part of this PR but has been moved to PR #208 (refactor/remove-set-e) where it fits naturally with the pipeline refactoring. This avoids merge conflicts between the two PRs.

Test plan

All tests pass on this branch
SESSION_CONTINUITY=false: verify loop context is injected
Question detection: verify asking_questions=true in response analysis
Version check: verify startup log shows version info

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Automatic Claude CLI version checks and optional auto-update (CLAUDE_AUTO_UPDATE, default true) with semantic-version validation at startup.
- Detection and counting of clarifying questions (asking_questions, question_count) and injection of corrective guidance into the next loop.
- Separate stderr capture and stricter generated config file permissions.
Bug Fixes
- Improved loop/progress logic to handle question-only scenarios and avoid miscounting non-progress.
Tests
- Expanded coverage for question detection and version/update flows; several session-related tests consolidated/removed.
Documentation
- CLI docs updated to document auto-update and post-validation behavior.

…rankbria#190) build_loop_context() was only called inside the CLAUDE_USE_CONTINUE=true conditional, so SESSION_CONTINUITY=false sessions received no loop context (loop number, remaining tasks, circuit breaker state, previous summary). This is especially problematic for fresh sessions that lack prior context. Move the call outside the conditional so context is always built and passed via --append-system-prompt. Session ID initialization remains gated behind CLAUDE_USE_CONTINUE as intended. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…nkbria#190) Add QUESTION_PATTERNS array and detect_questions() function to identify when Claude is asking questions instead of acting autonomously. Integrate detection into both JSON and text analysis paths, emitting asking_questions (bool) and question_count (int) fields in .response_analysis output. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…rankbria#190) When Claude asks questions instead of acting, hold consecutive_no_progress steady rather than incrementing. This prevents the circuit breaker from opening prematurely during question loops in headless mode. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
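
The counter rule described in this commit can be sketched as a small pure function. `next_no_progress_count` is a hypothetical helper, and the assumption that real progress resets the counter to zero is not stated in this thread.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compute the next consecutive_no_progress value.
# Args: current counter, has_progress (true/false), asking_questions (true/false)
next_no_progress_count() {
    local current=$1 has_progress=$2 asking_questions=$3
    if [[ "$has_progress" == "true" ]]; then
        echo 0                    # assumed: progress resets the counter
    elif [[ "$asking_questions" == "true" ]]; then
        echo "$current"           # question loop: hold the counter steady
    else
        echo $(( current + 1 ))   # genuine no-progress loop
    fi
}

next_no_progress_count 2 false true    # prints 2
next_no_progress_count 2 false false   # prints 3
```

Holding the value at 2 on a question loop means a threshold of 3 is not reached just because Claude asked for clarification.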

…ons (frankbria#190) When .response_analysis shows asking_questions=true, build_loop_context() now appends a headless-mode corrective message telling Claude to choose safe defaults and proceed autonomously instead of asking questions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… detect_questions
- test_json_parsing: 5 tests for detect_questions() (pattern+?, no pattern, no ?, empty input, multiple questions)
- test_exit_detection: 2 tests for analyze_response() asking_questions field
- test_cli_modern: 2 tests for build_loop_context() corrective message + sync inline function copy with ralph_loop.sh
- Fix: sanitize grep -c output with tr -d before arithmetic evaluation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
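
Some background on the `grep -c` fix: when nothing matches, `grep -c` prints `0` and also exits with status 1, so a fallback like `|| echo "0"` produces two lines (`0` twice) and breaks `$(( ))`. The sketch below shows one way to keep the value arithmetic-safe. The exact `tr -d` arguments used in the PR are not shown in this thread, so `'[:space:]'` and the helper name `count_lines_matching` are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count lines of $2 matching pattern $1.
count_lines_matching() {
    local pattern=$1 text=$2 count
    # "|| true" swallows grep's exit status 1 without printing a second "0".
    count=$(printf '%s\n' "$text" | grep -c -- "$pattern" || true)
    # Strip stray whitespace so the value is safe inside $(( )).
    count=$(printf '%s' "$count" | tr -d '[:space:]')
    echo $(( ${count:-0} ))
}

count_lines_matching "foo" $'foo\nbar\nfoo'   # prints 2
count_lines_matching "zzz" "abc"              # prints 0
```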

- Add asking_questions and question_count to structured fields list
- Document detect_questions() and QUESTION_PATTERNS in response_analyzer
- Add question loop suppression to circuit breaker thresholds section
- Add corrective guidance to loop context documentation
- Update test counts: 568 → 578 (json_parsing 57, exit_detection 55, cli_modern 71)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…estion detection (frankbria#190) work_summary is already populated from .json_parse_result at L372. The intermediate json_text_content variable and file re-read were unnecessary defensive code that duplicated the same jq extraction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ng (frankbria#190) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

check_claude_version() was defined but never called in the startup sequence. Add both check_claude_version() and new check_claude_updates() calls after validate_claude_command(), ensuring CLI compatibility and freshness before entering the loop. Auto-update queries npm registry and attempts upgrade; failures are non-blocking (WARN only). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…rankbria#190)
- Add compare_semver() helper with proper major→minor→patch sequential comparison, replacing integer arithmetic (major*10000+minor*100+patch) that produced incorrect results for patch numbers >= 100
- Refactor check_claude_version() and check_claude_updates() to use compare_semver(), eliminating duplicated version comparison logic
- Add CLAUDE_AUTO_UPDATE config flag (default: true) to .ralphrc, templates, enable_core, and setup; allows disabling auto-update checks for air-gapped environments
- Add 7 new tests: semver edge cases + CLAUDE_AUTO_UPDATE behavior
- Update CLAUDE.md: test counts 588→595, document new config variable
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
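
To make the arithmetic problem concrete, here is the weighted-integer encoding quoted in the commit message, reproduced as a small demo. `weighted_score` is a hypothetical name; only the formula comes from the commit.

```bash
#!/usr/bin/env bash
# Encode X.Y.Z as major*10000 + minor*100 + patch (the replaced approach).
weighted_score() {
    local major minor patch
    IFS=. read -r major minor patch <<< "$1"
    echo $(( major * 10000 + minor * 100 + patch ))
}

weighted_score 1.0.100   # prints 10100
weighted_score 1.1.0     # prints 10100 (collides with 1.0.100)
weighted_score 1.0.150   # prints 10150 (ranks above 1.1.0, which is wrong)
```

Once the patch field reaches 100 it spills into the weight reserved for the minor field, which is why a field-by-field comparison is the safer choice.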

…nkbria#190) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…verage (frankbria#190) Remove 49 redundant tests: cross-file duplicates (6), structural grep-only tests covered by behavioral tests (26), and threshold/parameter duplicates consolidated into single tests (17). 595 -> 546 tests, all passing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai · 2026-02-27T10:34:03Z

Walkthrough

Adds question-detection to response analysis (asking_questions, question_count), feeds those signals into circuit-breaker/loop logic, introduces semantic-version checks and optional CLAUDE_AUTO_UPDATE with npm-based auto-update at startup, separates live-mode stderr to a timestamped log, and updates tests, templates, and generated .ralphrc.

Changes

Cohort / File(s)	Summary
Response analysis & question detection `lib/response_analyzer.sh`, `lib/circuit_breaker.sh`	Add `QUESTION_PATTERNS` and `detect_questions()`; emit `asking_questions` and `question_count` in analysis outputs; propagate metrics into exit-signal logic and suppress incrementing consecutive no-progress when Claude is asking questions.
Startup/versioning & auto-update `ralph_loop.sh`, `lib/enable_core.sh`, `setup.sh`, `templates/ralphrc.template`	Introduce `CLAUDE_AUTO_UPDATE` config (default true); add `compare_semver()` and `check_claude_updates()`; run `claude --version` semver checks and optionally auto-update via npm at startup; surface flag in templates and generated `.ralphrc`.
Execution flow, stderr & session plumbing `ralph_loop.sh`	Capture Claude CLI stderr to a timestamped log, adjust execute flow for live/background modes, always build/log loop context, and preserve/propagate session metadata and stderr storage.
Loop context & corrective guidance `ralph_loop.sh`, `lib/response_analyzer.sh`	Enhance `build_loop_context()` to retain previous loop context and inject corrective guidance when prior loop asked questions; include question metrics in loop payload.
Permissions & generated config `setup.sh`, `lib/enable_core.sh`, `templates/ralphrc.template`	Set generated `.ralphrc` file permissions to 600 and add `CLAUDE_AUTO_UPDATE` entry/documentation to templates and generated config.
Tests `tests/unit/test_cli_modern.bats`, `tests/unit/test_exit_detection.bats`, `tests/unit/test_json_parsing.bats`, `tests/unit/test_file_protection.bats`, `tests/unit/test_integrity_check.bats`, `tests/unit/test_session_continuity.bats`, `tests/unit/test_rate_limiting.bats`	Add `compare_semver()` helper and many tests for question detection, semver/version checks, and auto-update; consolidate/remove broad session-continuity and some integrity/rate-limiting tests; update JSON parsing and file-protection tests.
Docs `CLAUDE.md`	Document new question-detection fields (`asking_questions`, `question_count`), CLAUDE_AUTO_UPDATE behavior, stderr handling, and corrective loop-context messaging; update test notes and counts.

Sequence Diagram(s)

sequenceDiagram
  participant Startup as Startup Script
  participant NPM as npm Registry
  participant ClaudeCLI as Claude CLI
  participant Analyzer as Response Analyzer
  participant Circuit as Circuit Breaker / Loop Controller
  rect rgba(200,200,255,0.5)
  Startup->>ClaudeCLI: run `claude --version`
  Startup->>NPM: query latest `@anthropic-ai/claude-code` version (if CLAUDE_AUTO_UPDATE)
  NPM-->>Startup: latest version
  Startup->>ClaudeCLI: optionally `npm update -g @anthropic-ai/claude-code` (update)
  end
  rect rgba(200,255,200,0.5)
  Circuit->>ClaudeCLI: execute Claude run (live or bg)
  ClaudeCLI-->>Circuit: stdout (JSON/text) + stderr (captured to file)
  Circuit->>Analyzer: pass stdout for analysis
  Analyzer-->>Circuit: analysis JSON with `asking_questions` & `question_count`
  Circuit->>Circuit: update loop context, inject corrective guidance if questions
  Circuit-->>Startup: decide continue/exit based on analysis & circuit rules
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

docs: update CLAUDE.md and README.md with recent improvements #8 — Directly related changes to lib/response_analyzer.sh and lib/circuit_breaker.sh (question detection and circuit integration).
fix: Recognize STATUS: COMPLETE as progress in circuit breaker #140 — Overlaps on circuit-breaker progress-detection using .response_analysis fields and related no-progress logic.
fix(test): robust variable init in session continuity test 199 #133 — Related to session-continuity tests and RESPONSE_ANALYSIS_FILE usage; touches similar test scaffolding.


🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and specifically describes the main changes: fixing SESSION_CONTINUITY=false bugs and adding a startup version check, which are the core objectives of this PR.
Docstring Coverage	✅ Passed	Docstring coverage is 93.75% which is sufficient. The required threshold is 80.00%.


coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

tests/unit/test_file_protection.bats (1)

187-194: Consider adding failure context for easier debugging.

The consolidated test correctly validates all critical paths and the exclusion of optional files. However, if an assertion fails, the test output won't indicate which specific path was missing. This can make debugging harder.

💡 Optional improvement for better failure diagnostics

 @test "RALPH_REQUIRED_PATHS contains all critical paths and excludes optional files" {
     local expected=(".ralph" ".ralph/PROMPT.md" ".ralph/fix_plan.md" ".ralph/AGENT.md" ".ralphrc")
     for path in "${expected[@]}"; do
-        [[ " ${RALPH_REQUIRED_PATHS[*]} " =~ " $path " ]]
+        [[ " ${RALPH_REQUIRED_PATHS[*]} " =~ " $path " ]] || fail "Missing required path: $path"
     done
     # Optional paths should NOT be required
-    [[ ! " ${RALPH_REQUIRED_PATHS[*]} " =~ "status.json" ]]
+    [[ ! " ${RALPH_REQUIRED_PATHS[*]} " =~ "status.json" ]] || fail "status.json should not be required"
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_file_protection.bats` around lines 187 - 194, In the test
"RALPH_REQUIRED_PATHS contains all critical paths and excludes optional files",
replace the silent boolean checks in the for-loop with explicit assertions that
include the missing path on failure (e.g. use [[ " ${RALPH_REQUIRED_PATHS[*]} "
=~ " $path " ]] || fail "Missing required path: $path") so that when a required
entry in the expected array (defined as local expected=(...)) is absent the
failure message names the missing path; keep the final exclusion check similarly
explicit (e.g. [[ ! " ${RALPH_REQUIRED_PATHS[*]} " =~ "status.json" ]] || fail
"Optional path incorrectly required: status.json").

ralph_loop.sh (1)

639-682: check_claude_updates() has good error handling but npm command may fail silently in some environments.

The function correctly:

Short-circuits when CLAUDE_AUTO_UPDATE != "true"
Gracefully handles npm registry unreachability
Logs warnings on update failure with environment-specific guidance

However, npm update -g requires write permissions to the global npm prefix. On systems where npm was installed via a package manager or as root, the update can fail, and because stderr is redirected with 2>/dev/null the cause is never shown.

Consider capturing and logging the npm update stderr for better debugging:

♻️ Optional: Capture npm update output for debugging

     # Auto-update attempt
     log_status "INFO" "Claude CLI update available: $installed_version → $latest_version. Attempting auto-update..."
-    if npm update -g @anthropic-ai/claude-code 2>/dev/null; then
+    local npm_output
+    if npm_output=$(npm update -g @anthropic-ai/claude-code 2>&1); then
         local new_version
         new_version=$($CLAUDE_CODE_CMD --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -1)
         log_status "SUCCESS" "Claude CLI updated: $installed_version → ${new_version:-$latest_version}"
         return 0
     fi
 
     # Auto-update failed — warn with environment-specific guidance
     log_status "WARN" "Claude CLI auto-update failed ($installed_version → $latest_version)"
+    [[ "${VERBOSE_PROGRESS:-false}" == "true" && -n "$npm_output" ]] && log_status "DEBUG" "npm output: $npm_output"
     log_status "WARN" "Update manually: npm update -g @anthropic-ai/claude-code"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 639 - 682, check_claude_updates currently
silences npm update output so failures can be invisible; modify the auto-update
block in check_claude_updates to capture npm update's stdout/stderr (e.g.,
capture to a variable), check its exit code, and log the captured output with
log_status (ERROR/WARN) when the update fails so you surface permission or other
errors; keep the existing successful-path behavior (re-read new_version from
$CLAUDE_CODE_CMD and log SUCCESS) but on failure include the captured npm output
and exit code in the WARN/ERROR messages to aid debugging.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/response_analyzer.sh`:
- Around line 395-400: The assignment from detect_questions can return exit code
1 (no questions) which will abort scripts running under set -e; in the
analyze_response code paths where detect_questions is used to set question_count
and asking_questions, guard the command substitution so failures don't propagate
— e.g., capture output in a conditional or append "|| true" to the substitution
(or explicitly check with if question_count=$(detect_questions); then :; else
question_count=0; fi) and then set asking_questions based on the resulting
numeric value; update both places where detect_questions is invoked (the
question_count/asking_questions assignment sites) to use this guarded pattern.

In `@setup.sh`:
- Around line 111-115: The generated .ralphrc file is created by the heredoc
ending with RALPHRCEOF in setup.sh but never permission-hardened; after the
block that writes .ralphrc (the heredoc that uses RALPHRCEOF) add a step to set
restrictive permissions (chmod 600) on $HOME/.ralphrc (and mirror the same chmod
600 change in ralph_enable.sh wherever .ralphrc is created) so the file is only
readable/writable by the owner.
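
A sketch of the permission hardening requested above, assuming the file is written via a heredoc as in setup.sh. `write_ralphrc` and the file contents are hypothetical; only the `RALPHRCEOF` delimiter, `CLAUDE_AUTO_UPDATE`, and `chmod 600` come from this thread.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write a config file, then restrict it to the owner.
write_ralphrc() {
    local target=$1
    cat > "$target" << 'RALPHRCEOF'
# Illustrative contents only
CLAUDE_AUTO_UPDATE=true
RALPHRCEOF
    # Owner read/write only; group and others get no access.
    chmod 600 "$target"
}

demo_file=$(mktemp)
write_ralphrc "$demo_file"
ls -l "$demo_file"
rm -f "$demo_file"
```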


ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ff27b4 and 0492ea4.

📒 Files selected for processing (14)

CLAUDE.md
lib/circuit_breaker.sh
lib/enable_core.sh
lib/response_analyzer.sh
ralph_loop.sh
setup.sh
templates/ralphrc.template
tests/unit/test_cli_modern.bats
tests/unit/test_exit_detection.bats
tests/unit/test_file_protection.bats
tests/unit/test_integrity_check.bats
tests/unit/test_json_parsing.bats
tests/unit/test_rate_limiting.bats
tests/unit/test_session_continuity.bats

💤 Files with no reviewable changes (3)

tests/unit/test_rate_limiting.bats
tests/unit/test_session_continuity.bats
tests/unit/test_integrity_check.bats


…frankbria#190) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

🧹 Nitpick comments (1)

ralph_loop.sh (1)

639-682: Consider preserving npm update error output for debugging.

The function correctly implements non-blocking auto-updates with graceful fallback. However, when npm update fails, discarding stderr (Line 670) makes it harder to diagnose the root cause.

💡 Optional: Capture npm update errors for debugging

     # Auto-update attempt
     log_status "INFO" "Claude CLI update available: $installed_version → $latest_version. Attempting auto-update..."
-    if npm update -g @anthropic-ai/claude-code 2>/dev/null; then
+    local update_err
+    if update_err=$(npm update -g @anthropic-ai/claude-code 2>&1); then
         local new_version
         new_version=$($CLAUDE_CODE_CMD --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -1)
         log_status "SUCCESS" "Claude CLI updated: $installed_version → ${new_version:-$latest_version}"
         return 0
     fi

     # Auto-update failed — warn with environment-specific guidance
     log_status "WARN" "Claude CLI auto-update failed ($installed_version → $latest_version)"
+    [[ -n "$update_err" && "$VERBOSE_PROGRESS" == "true" ]] && log_status "DEBUG" "npm error: $update_err"
     log_status "WARN" "Update manually: npm update -g @anthropic-ai/claude-code"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 639 - 682, In check_claude_updates(), preserve
and log npm update error output instead of discarding stderr; capture the stderr
from the failing npm update -g `@anthropic-ai/claude-code` command into a variable
(e.g., npm_err) when the update fails, and include that output in the subsequent
log_status "WARN" messages (or a dedicated log_status "DEBUG"/"ERROR") alongside
the existing "$installed_version → $latest_version" message and the manual
update hints; keep the current success path using new_version unchanged.


ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 0492ea4 and eb98ace.

📒 Files selected for processing (1)

ralph_loop.sh

….ralphrc (frankbria#190)
- Wrap detect_questions() calls in if-guards to prevent set -e from aborting on exit code 1 (no questions found)
- Add chmod 600 to .ralphrc in setup.sh for both lib and fallback paths
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
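
The if-guard pattern can be sketched as follows. `detect_questions` here is a stub that counts lines containing `?` (not the PR's implementation), and `analyze_with_guard` is a hypothetical wrapper that runs under `set -e` in a subshell to mimic the script's environment.

```bash
#!/usr/bin/env bash
# Stub with the contract described in the review: print a count and
# return 1 when no questions are found.
detect_questions() {
    local n
    n=$(printf '%s\n' "$1" | grep -c '?' || true)
    echo "$n"
    [[ $n -gt 0 ]]
}

# Runs in a subshell with errexit on, mimicking a script under set -e.
analyze_with_guard() (
    set -e
    question_count=0
    asking_questions=false
    # Used as an if-condition, a non-zero status selects the else branch
    # instead of aborting the script.
    if question_count=$(detect_questions "$1"); then
        asking_questions=true
    else
        question_count=0
    fi
    echo "asking_questions=$asking_questions question_count=$question_count"
)

analyze_with_guard "All tasks implemented."
analyze_with_guard "Which database should I use?"
```

A bare `question_count=$(detect_questions "$content")` in the same position would terminate the script under `set -e` whenever no question is found.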

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/response_analyzer.sh`:
- Around line 40-53: The current logic in response_analyzer.sh prematurely bails
out and later requires a '?' for each pattern match, which misses unpunctuated
asks; remove or relax the quick bail-out that checks for '?' and change the loop
that counts matches so it counts occurrences of QUESTION_PATTERNS in $content
even if they lack a question mark (use case-insensitive matching of each pattern
and default to 0 on no matches), while still also counting explicit '?'
occurrences separately if desired; update references to QUESTION_PATTERNS, the
variable content, and question_count so pattern matches increment question_count
without depending on '?' being present.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between eb98ace and a20e7b9.

📒 Files selected for processing (2)

lib/response_analyzer.sh
setup.sh

🚧 Files skipped from review as they are similar to previous changes (1)

setup.sh

…e patterns (frankbria#190) Remove the ? bail-out and ? filter in detect_questions() so that declarative wait patterns (e.g. "Please confirm", "Awaiting input") are detected without requiring a question mark. Use grep -ciw for word-boundary matching to prevent false positives from substring overlap (e.g. "should implement" no longer matches "should I"). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
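
The effect of the `-w` flag can be seen in isolation with the commit's own example. `matches_plain` and `matches_word` are hypothetical helper names; the `grep -ci` and `grep -ciw` flags are standard.

```bash
#!/usr/bin/env bash
# Count matching lines case-insensitively, without word-boundary checks.
matches_plain() { printf '%s\n' "$1" | grep -ci -- "$2" || true; }

# Same, but the match must start and end at a word boundary (-w).
matches_word() { printf '%s\n' "$1" | grep -ciw -- "$2" || true; }

matches_plain "We should implement caching." "should I"   # prints 1 (false positive)
matches_word  "We should implement caching." "should I"   # prints 0
matches_word  "Should I add caching"         "should I"   # prints 1
```

Without `-w`, the pattern matches the start of "should implement" because matching is case-insensitive; with `-w`, the trailing "m" prevents the match.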

timothy-20 · 2026-02-28T09:39:20Z

Note on merge order

This PR has overlapping changes with #208 (refactor/remove-set-e) in the live mode pipeline area of ralph_loop.sh. Specifically, this PR adds set +e / set -o pipefail as a workaround for set -e behavior, while #208 removes set -e entirely — making that workaround unnecessary.

Suggested merge order to minimize conflicts:

fix(tests): use bare mktemp -d instead of hardcoded /tmp paths #203, fix(tests): use UTC in get_past_timestamp to fix cooldown tests on non-UTC systems #204 — independent, any order
refactor(loop): remove set -e in favor of explicit error handling #208 — set -e removal
This PR (fix(#190): SESSION_CONTINUITY=false bugs + startup version check #202) last — I'll rebase and remove the set +e workaround after refactor(loop): remove set -e in favor of explicit error handling #208 is merged

Happy to rebase at any point once you've merged #208.

…nkbria#208) Bug 3 (stderr separation: 2>&1 → 2>"$stderr_file") is already implemented in PR frankbria#208 (refactor/remove-set-e). Removing it from this PR eliminates the merge conflict between frankbria#202 and frankbria#208 in the ralph_loop.sh pipeline area and prevents duplicate changes.
Changes:
- ralph_loop.sh: revert pipeline to 2>&1, remove stderr_file var and stderr logging block
- test_cli_modern.bats: revert grep patterns to 2>&1, remove 3 stderr separation tests
- CLAUDE.md: remove stderr separation paragraph and test description
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

CLAUDE.md (1)
535-556: ⚠️ Potential issue | 🟡 Minor

Update test file counts in CLAUDE.md to match actual test suite.

The documentation has three incorrect test counts:

test_cli_modern.bats: Change from 82 to 79

test_json_parsing.bats: Change from 50 to 52

Total test count: Change from 546 to 545

Current vs Actual counts

File	Documented	Actual	Status
test_cli_modern.bats	82	79	✗ Update needed
test_json_parsing.bats	50	52	✗ Update needed
All others	16 files	16 files	✓ Match
Total	546	545	✗ Update needed
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CLAUDE.md` around lines 535 - 556, Update the test counts in the CLAUDE.md
tests table: change the entry for `test_cli_modern.bats` from 82 to 79, change
the entry for `test_json_parsing.bats` from 50 to 52, and update the total tests
count at the top of the table from 546 to 545 so the documented totals match the
actual suite.

🧹 Nitpick comments (2)

tests/unit/test_cli_modern.bats (1)

110-112: Test function differs from actual implementation for checkbox matching.

The test's inline build_loop_context() uses grep -c "^- \[ \]" but the actual implementation at ralph_loop.sh:744 uses grep -cE "^[[:space:]]*- \[ \]" to support indented checkboxes (Bug #3 Fix comment). The test fixtures use non-indented checkboxes so tests pass, but this doesn't verify the indentation support.
🔧 Optional: Align test function with implementation
         if [[ -f "$RALPH_DIR/fix_plan.md" ]]; then
-            local incomplete_tasks=$(grep -c "^- \[ \]" "$RALPH_DIR/fix_plan.md" 2>/dev/null || echo "0")
+            local incomplete_tasks=$(grep -cE "^[[:space:]]*- \[ \]" "$RALPH_DIR/fix_plan.md" 2>/dev/null || echo "0")
             context+="Remaining tasks: ${incomplete_tasks}. "
         fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_cli_modern.bats` around lines 110 - 112, The test's
build_loop_context() uses grep -c "^- \[ \]" which only matches non-indented
checkboxes; update the test to mirror the implementation by using grep -cE
"^[[:space:]]*- \[ \]" so indented checkboxes are counted (match the behavior of
the ralph_loop.sh build_loop_context at the implementation around line 744);
replace the current grep invocation in the test with this extended-regex form
and keep the existing fallback (|| echo "0").

ralph_loop.sh (1)

651-657: Consider adding a timeout to the npm registry query.

The npm view command can hang for extended periods when the npm registry is unreachable or slow (common in flaky network conditions). This could delay Ralph startup significantly even though the auto-update is meant to be non-blocking.

🔧 Optional: Add timeout protection

     # Query latest version from npm registry
     local latest_version
-    latest_version=$(npm view @anthropic-ai/claude-code version 2>/dev/null)
+    latest_version=$(timeout 5 npm view @anthropic-ai/claude-code version 2>/dev/null)
     if [[ -z "$latest_version" ]]; then
         log_status "INFO" "Could not check for Claude CLI updates (npm registry unreachable)"
         return 0
     fi

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 651 - 657, The npm registry query using "npm view
`@anthropic-ai/claude-code` version" can hang; wrap that call with a timeout
(e.g., the GNU timeout utility) so latest_version is obtained with a bounded
wait, check the command exit status for timeout/failure, and if it timed out or
failed keep the existing behavior of calling log_status "INFO" "Could not check
for Claude CLI updates (npm registry unreachable)" and returning 0; update the
invocation that sets latest_version and the subsequent conditional to handle the
timeout/failure path.
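
Since `timeout` is a GNU coreutils tool and is not present on every system, a bounded call usually needs a small wrapper. The sketch below is one possible shape. `run_with_timeout` is a hypothetical name and is not the project's own `portable_timeout` helper, whose implementation is not shown in this thread.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: run a command with a time limit when a timeout
# utility is available, otherwise run it unbounded.
run_with_timeout() {
    local seconds=$1
    shift
    if command -v timeout >/dev/null 2>&1; then
        timeout "$seconds" "$@"
    elif command -v gtimeout >/dev/null 2>&1; then
        # GNU coreutils installed with a "g" prefix (common on macOS).
        gtimeout "$seconds" "$@"
    else
        "$@"
    fi
}

# Intended use, mirroring the suggestion above (not executed here):
#   latest_version=$(run_with_timeout 5 npm view @anthropic-ai/claude-code version 2>/dev/null || true)
run_with_timeout 5 echo "ok"
```

With `|| true`, a timeout or registry failure leaves `latest_version` empty, which the existing empty-string check already treats as "could not check for updates".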


ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 69f4792 and daed59c.

📒 Files selected for processing (3)

CLAUDE.md
ralph_loop.sh
tests/unit/test_cli_modern.bats

timothy-20 · 2026-02-28T13:22:45Z

@coderabbitai review

coderabbitai · 2026-02-28T13:22:54Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

- CLAUDE.md: correct test counts (cli_modern 82→79, json_parsing 50→52, total 546→545)
- ralph_loop.sh: add portable_timeout to npm view query to prevent hanging
- ralph_loop.sh: capture npm update stderr for debugging on failure
- test_cli_modern.bats: align grep pattern with implementation (support indented checkboxes)
- test_file_protection.bats: add explicit fail messages for better diagnostics

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

timothy-20 · 2026-02-28T13:45:15Z

@coderabbitai resume

coderabbitai · 2026-02-28T13:45:24Z

✅ Actions performed

Reviews resumed.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CLAUDE.md`:
- Line 180: Header "Run all tests (546 tests)" is inconsistent with the test
summary table showing "545 tests total"; update the literal count in the header
(the string "Run all tests (546 tests)" in CLAUDE.md) to "Run all tests (545
tests)" so the top-of-file summary matches the detailed table and commit
message.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between daed59c and eb4e7e7.

📒 Files selected for processing (4)

CLAUDE.md
ralph_loop.sh
tests/unit/test_cli_modern.bats
tests/unit/test_file_protection.bats

🚧 Files skipped from review as they are similar to previous changes (3)

tests/unit/test_file_protection.bats
ralph_loop.sh
tests/unit/test_cli_modern.bats


timothy-20 · 2026-02-28T14:22:10Z

All CodeRabbit review feedback has been addressed:

Actionable (2): detect_questions set-e guard + .ralphrc chmod 600 → fixed in a20e7b9
Actionable (1): ? requirement relaxed in detect_questions → fixed in 69f4792
Actionable (1): Test count typo in CLAUDE.md (546→545) → fixed in 477e824
Nitpick: npm view timeout → already applied (portable_timeout 5s) in eb4e7e7
Nitpick: npm update stderr capture → already applied (update_output variable) in eb4e7e7
Nitpick: Test grep pattern for indented checkboxes → already applied (grep -cE) in eb4e7e7
Nitpick: || fail messages in test assertions → already applied in eb4e7e7
Outside diff: CLAUDE.md test table counts (79, 52, 545) → already correct in eb4e7e7

@coderabbitai resolve

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

CLAUDE.md (1)

229-233: Scope the auto-update command to the Claude package explicitly.

Line 229 says startup attempts `npm update -g`, which is ambiguous and can imply updating all global packages. Please ensure docs (and implementation) use a package-scoped command (for example, `npm update -g @anthropic-ai/claude-code`) to avoid unintended global changes.

#!/bin/bash
# Verify whether implementation uses unscoped global npm update.
# Expected: update/install commands are scoped to the Claude Code package.
rg -n -C2 'npm (update|install)\s+-g' 
rg -n -C2 '@anthropic(-ai)?/claude-code|claude-code'

📌 Suggested doc wording update

-- `CLAUDE_AUTO_UPDATE` controls whether Ralph checks npm registry and attempts `npm update -g` at startup
+- `CLAUDE_AUTO_UPDATE` controls whether Ralph checks npm registry and attempts a package-scoped update (e.g., `npm update -g @anthropic-ai/claude-code`) at startup

Based on learnings: Documentation MUST remain synchronized with codebase.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@CLAUDE.md` around lines 229 - 233, The docs and startup auto-update
invocation currently reference a generic npm global update (`npm update -g`);
change both the CLAUDE_AUTO_UPDATE documentation text and the code that runs the
update command so they explicitly use the Claude package scope (e.g. `npm update
-g @anthropic-ai/claude-code` or the appropriate scoped package name) instead of
an unscoped global update; locate references to the CLAUDE_AUTO_UPDATE config
and any code that executes shell commands containing `npm update -g` or `npm
install -g` and replace the command string with the package-scoped form, and
update the CLAUDE.md wording to show the scoped command so docs and
implementation remain synchronized.
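
A package-scoped update gated on `CLAUDE_AUTO_UPDATE` might look something like the sketch below. It is an assumption-laden illustration rather than Ralph's actual startup code: `update_claude_cli` is a hypothetical function name, and `NPM_BIN` is an invented override (not a Ralph setting) that lets the command be previewed with `echo` instead of being executed.

```shell
#!/usr/bin/env bash
# Package name as given in the review comment.
CLAUDE_PACKAGE="@anthropic-ai/claude-code"

# Hypothetical helper: update only the Claude Code package, and only when
# CLAUDE_AUTO_UPDATE is not "false". NPM_BIN is an illustrative override
# so the command can be previewed without touching global packages.
update_claude_cli() {
    if [[ "${CLAUDE_AUTO_UPDATE:-true}" == "false" ]]; then
        echo "auto-update disabled"
        return 0
    fi
    # Scoped: names one package instead of updating every global package.
    "${NPM_BIN:-npm}" update -g "$CLAUDE_PACKAGE"
}

# Dry run: print the arguments that would be passed to npm.
NPM_BIN=echo update_claude_cli
# prints: update -g @anthropic-ai/claude-code
```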

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CLAUDE.md`:
- Line 230: Replace the substring "200-500ms" in the sentence that reads
"**Local workstation / home server**: Keep `true` (default) — CLI updates
include bug fixes and new features that improve Ralph's effectiveness. The
200-500ms startup overhead is negligible for loops that run hours" with "200–500
ms" (use an en dash between numbers and a space before the unit) so the phrase
becomes "The 200–500 ms startup overhead...".


ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between eb4e7e7 and 477e824.

📒 Files selected for processing (1)

CLAUDE.md


coderabbitai · 2026-02-28T14:25:21Z

@timothy-20: Thank you for the comprehensive update! I can see you've thoroughly addressed all the review feedback:

✅ Actionable items fixed:

detect_questions set-e guard + .ralphrc chmod 600
? requirement relaxed in detect_questions
Test count corrections in CLAUDE.md

✅ Nitpicks applied:

npm view timeout handling
npm update stderr capture
Test grep patterns and fail messages
Test table count updates

All review comments will be resolved.

🧠 Learnings used

Learnt from: CR
Repo: frankbria/ralph-claude-code PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-25T22:58:06.960Z
Learning: Applies to ralph_loop.sh : Use `set +e` around live mode pipeline and re-enable `set -e` after `PIPESTATUS` capture in ralph_loop.sh to avoid silent exit on timeout

Learnt from: CR
Repo: frankbria/ralph-claude-code PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-25T22:58:06.960Z
Learning: Applies to ralph_loop.sh : Implement three-layer API 5-hour limit detection in ralph_loop.sh: Layer 1 timeout guard (code 124 → code 1), Layer 2 structural JSON detection (primary), Layer 3 filtered text fallback (tail -30 with field filtering)

Learnt from: CR
Repo: frankbria/ralph-claude-code PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-25T22:58:06.960Z
Learning: Applies to ralph_loop.sh : Use granular Bash tool patterns in default CLAUDE_ALLOWED_TOOLS like `Bash(git add *)`, `Bash(git commit *)` instead of `Bash(git *)` to prevent destructive commands in ralph_loop.sh

Learnt from: CR
Repo: frankbria/ralph-claude-code PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-25T22:58:06.960Z
Learning: Applies to ralph_loop.sh : Preserve Claude Code sessions in .ralph/.claude_session_id for context continuity across ralph_loop.sh iterations

Learnt from: CR
Repo: frankbria/ralph-claude-code PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-25T22:58:06.960Z
Learning: Applies to ralph_loop.sh : Use `--continue` flag to maintain Claude Code session context across ralph_loop.sh iterations

Learnt from: CR
Repo: frankbria/ralph-claude-code PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-25T22:58:06.960Z
Learning: Update implementation sections in CLAUDE.md file when new Ralph loop behaviors or patterns are introduced
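
The first learning in the list above (disable `set -e` around the live-mode pipeline, capture `PIPESTATUS`, then re-enable it) can be illustrated with a small stand-in pipeline. This is a sketch under assumptions: `run_pipeline` is a hypothetical name, the subshell merely simulates a stage that exits with 124 (the status GNU timeout uses when its limit is hit), and `pipefail` is enabled here to show why the guard matters.

```shell
#!/usr/bin/env bash
set -eo pipefail

# Hypothetical stand-in for the live-mode pipeline: the first stage emits
# one JSON line and exits 124, the second stage succeeds.
run_pipeline() {
    local -a statuses
    set +e                          # a failing stage must not abort the script
    ( printf '{"type":"result"}\n'; exit 124 ) | cat >/dev/null
    statuses=("${PIPESTATUS[@]}")   # capture at once; the next command overwrites it
    set -e                          # restore fail-fast behaviour
    echo "first=${statuses[0]} last=${statuses[1]}"
}

run_pipeline
# prints: first=124 last=0
```

Without the `set +e` guard, the non-zero pipeline status would end the script before the per-stage codes could be inspected.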

✅ Actions performed

Comments resolved. Auto-approval is disabled; enable reviews.request_changes_workflow to approve automatically.

timothy-20 and others added 12 commits February 27, 2026 19:30

fix(loop): separate stderr from live mode JSON stream to fix jq parsing (frankbria#190)

13b2413

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs(loop): add environment-specific CLAUDE_AUTO_UPDATE guidance (frankbria#190)

ade55f5

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot reviewed Feb 27, 2026

lib/response_analyzer.sh (resolved)

setup.sh (resolved)

fix(loop): clean up empty stderr files after each live mode iteration (frankbria#190)

eb98ace

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot reviewed Feb 27, 2026

lib/response_analyzer.sh (outdated, resolved)

coderabbitai bot reviewed Feb 28, 2026

CLAUDE.md (outdated, resolved)

docs: fix test count typo in CLAUDE.md (546→545) (frankbria#190)

477e824

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot reviewed Feb 28, 2026

CLAUDE.md (resolved)

| File | Documented | Actual | Status |
| --- | --- | --- | --- |
| `test_cli_modern.bats` | 82 | 79 | ✗ Update needed |
| `test_json_parsing.bats` | 50 | 52 | ✗ Update needed |
| All others | 16 files | 16 files | ✓ Match |
| Total | 546 | 545 | ✗ Update needed |
