fix(loop): persist API call counter immediately, not only on success by frankbria · Pull Request #197 · frankbria/ralph-claude-code

frankbria · 2026-02-25T22:40:49Z

Summary

Fixes #196: Monitor dashboard shows stale API call counter (always 0/100)

Replace manual read+increment in execute_claude_code() with the existing increment_call_counter() function which writes to disk immediately
Remove the conditional write that only persisted on success (exit_code == 0)
Add 2 regression tests verifying the fix

Root Cause

execute_claude_code() read .call_count into a local variable and only wrote it back on success. Failed API calls (timeout, error, API limit) never persisted the increment, so the monitor always showed the stale pre-execution value.
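The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the code from ralph_loop.sh: `run_api_call`, `execute_buggy`, and the temporary counter file are hypothetical stand-ins, and only the shape of the logic (increment locally, write back on exit code 0) is taken from the description.

```shell
#!/usr/bin/env bash
# Sketch of the pre-fix pattern: the count is incremented in a local
# variable and written back only when the call succeeds.
# run_api_call, execute_buggy and the temp file are hypothetical stand-ins.

CALL_COUNT_FILE=$(mktemp)
echo 0 > "$CALL_COUNT_FILE"

run_api_call() {
    return 1   # simulate a timeout, error, or API limit
}

execute_buggy() {
    local calls_made
    calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")
    calls_made=$((calls_made + 1))

    local exit_code=0
    run_api_call || exit_code=$?

    if [ "$exit_code" -eq 0 ]; then
        echo "$calls_made" > "$CALL_COUNT_FILE"   # only reached on success
    fi
    return "$exit_code"
}

execute_buggy || true
stale_count=$(cat "$CALL_COUNT_FILE")
echo "count on disk after a failed call: $stale_count"
rm -f "$CALL_COUNT_FILE"
```

With a call that always fails, the file still holds 0 afterwards, which is the value a monitor reading the file would display.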

Acceptance Criteria

Call counter persisted to disk immediately when incremented (before execution)
API calls counted regardless of success/failure
Monitor reflects the actual number of API calls made

Test Plan

2 new regression tests in test_cli_modern.bats (568 total, all passing)
Full test suite passes with zero failures
Structural tests verify the old buggy pattern is removed and the fix is in place

Closes #196

Summary by CodeRabbit

Tests
- Added a rate-limiting test suite (15 new tests) and expanded modernization tests (2 additional tests).
- Increased total test coverage to 568 tests (from 566).
Bug Fixes
- Fixed call-counter behavior related to stale increments (issue #196, "Monitor dashboard shows stale API call counter (always 0/100)"), with tests added to prevent regressions.

…196) Replace manual read+increment in execute_claude_code() with the existing increment_call_counter() function which writes to disk before execution. Remove the conditional write that only persisted on exit code 0. This fixes the monitor showing stale 0/100 when API calls fail (timeout, error, API limit) since failed calls still consume quota.

coderabbitai · 2026-02-25T22:41:06Z

Walkthrough

Centralized the API call counter increment by replacing inline read+increment in ralph_loop.sh with increment_call_counter(). Updated tests and manifests: added rate-limiting tests and increased total test counts to reflect new tests validating the counter-change.

Changes

Cohort / File(s)	Summary
Test Manifest & Counts `CLAUDE.md`	Updated test counts: total increased from 566 to 568; added `test_rate_limiting.bats` entry.
Loop Counter Refactor `ralph_loop.sh`	Replaced manual `cat`/increment and local persistence with a call to `increment_call_counter()`; removed the conditional success-path file write.
Tests for Counter Behavior `tests/unit/test_cli_modern.bats`	Added tests asserting `execute_claude_code` calls `increment_call_counter`, that manual increment pattern is removed, and that conditional write-on-success is absent.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

fix: progress detection improvements (#141, #144) #158 — Also modifies execute_claude_code; may overlap with counter increment or loop-start changes.
[P1] feat(cli): add modern CLI commands with JSON output support (Phase 1.1) #47 — Previous updates to execute_claude_code and related tests; likely touches similar logic and tests.

Poem

🐰 I hopped through loops and files with glee,

I nudged the counter to write promptly, see —
No more stale zeroes, the monitor sings,
Tests added, tiny carrot-praise it brings,
A rabbit's nod to cleaner counting things. 🥕

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'fix(loop): persist API call counter immediately, not only on success' clearly and specifically describes the main change: replacing conditional persistence with immediate counter persistence.
Linked Issues check	✅ Passed	The pull request fully addresses all coding requirements from issue `#196`: replaces manual counter increment with increment_call_counter() for immediate persistence, removes conditional success-path write, and adds regression tests to verify the fix.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to issue `#196`: ralph_loop.sh refactors counter persistence, test_cli_modern.bats adds regression tests, and CLAUDE.md updates test counts reflecting the new tests added.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

github-actions · 2026-02-25T22:42:14Z

Code Review

✅ Overall Assessment: APPROVED

This is a clean, well-tested fix for issue #196. The change is minimal, targeted, and correctly addresses the root cause.

Code Quality & Best Practices

Excellent:

✅ Reuses existing function instead of adding new code
✅ Follows DRY principle - the increment logic is already tested and validated
✅ Minimal change surface - only removes buggy code, doesn't add complexity
✅ Clear commit title follows conventional commit format

Potential Bugs or Issues

None identified. The fix is straightforward:

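Before (buggy): calls_made was read from CALL_COUNT_FILE with cat, incremented in a local variable, and written back only when exit_code was 0.

After (fixed): calls_made is assigned from increment_call_counter, which persists the new count before execution starts.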

The increment_call_counter() function (ralph_loop.sh:452-461):

Reads count from disk
Increments it
Writes back to disk immediately
Returns the new count

This ensures the counter is persisted regardless of Claude Code execution success/failure.
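The four steps listed above can be sketched as follows. This is an illustration under assumptions, not the function at ralph_loop.sh:452-461: the body is reconstructed from the step list only, and `run_api_call` and `execute_fixed` are hypothetical names used to exercise it.

```shell
#!/usr/bin/env bash
# Sketch of the read / increment / write / return steps.
# The function body is an assumption based on the step list, not a copy
# of ralph_loop.sh; run_api_call and execute_fixed are hypothetical.

CALL_COUNT_FILE=$(mktemp)
echo 0 > "$CALL_COUNT_FILE"

increment_call_counter() {
    local count
    count=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")  # 1. read from disk
    count=$((count + 1))                                      # 2. increment
    echo "$count" > "$CALL_COUNT_FILE"                        # 3. persist immediately
    echo "$count"                                             # 4. hand back the new count
}

run_api_call() {
    return 1   # always fails, to show the count no longer depends on success
}

execute_fixed() {
    local calls_made
    calls_made=$(increment_call_counter)   # on disk before the call starts
    run_api_call
}

execute_fixed || true
execute_fixed || true
persisted_count=$(cat "$CALL_COUNT_FILE")
echo "count on disk after two failed calls: $persisted_count"
rm -f "$CALL_COUNT_FILE"
```

Because the write happens inside the function, the caller has no code path that can skip it.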

Performance Considerations

⚠️ Minor trade-off: Disk I/O on every API call

The fix writes to disk on every loop iteration instead of only on success. However:

The file write is a simple integer (tiny)
Rate limiting means max 100 writes/hour by default
The benefit (accurate monitoring) outweighs this trivial cost

Recommendation: Keep current implementation - the reliability gain is worth the minimal I/O cost.

Security Concerns

None. No security implications.

Test Coverage

Excellent structural tests:

✅ First regression test
- Verifies the new pattern is present (a call to increment_call_counter)
- Verifies the old buggy pattern is removed (the manual read and increment)
✅ Second regression test
- Verifies the conditional write comment is gone
- Confirms counter is now persisted before execution

Both tests validate the fix at the structural level, which is appropriate for this bug since:

The behavior is simple (disk write vs. no disk write)
The function (increment_call_counter) is already tested elsewhere
Regression detection is the primary goal

Documentation

✅ Updated CLAUDE.md:

Test count: 566 → 568 (correct)
Added a reference to issue #196 (Monitor dashboard shows stale API call counter (always 0/100)) to the test_cli_modern.bats description

Summary

Strengths:

Minimal, targeted fix
Reuses well-tested existing code
Comprehensive test coverage for regression prevention
Clear documentation updates
Addresses the exact issue reported

No blockers. This PR is ready to merge.

github-actions · 2026-02-25T22:42:25Z

Code Review

Clean, targeted fix for Issue 196. The root cause analysis in the PR description is accurate — the previous code's conditional write inside if [ $exit_code -eq 0 ] was the direct cause of stale monitor readings.

What's Good

Core fix is correct. Delegating to increment_call_counter() (lines 452-461) is the right approach: that function already does the read, increment, and write in one place and echoes the new value, so the calls_made local variable remains available for the log message at line 1099. The removal of the conditional write is the key correctness improvement.

Minimal diff. The change is tightly scoped: 3 lines removed from execute_claude_code(), no changes elsewhere in the main logic path. This is exactly the right amount of change for a bug fix.

CLAUDE.md updated. Test count updated from 566 → 568 and the test_cli_modern.bats description correctly references the stale call counter fix.

Minor Observations

Test style is structural, not behavioral. Both new tests use grep to verify code structure (old pattern absent, new pattern present). This is consistent with how other regression tests in test_cli_modern.bats are written (e.g., the ALLOWED_TOOLS and errexit pipeline guard tests), so this is fine for consistency. A future improvement might be a behavioral test that simulates a failed execution and verifies the counter file is updated, but that's not required here.

Counter increments before execution begins. This is a minor semantic shift: the counter now increments before the Claude subprocess is launched, meaning a setup failure (e.g., invalid output_file path) would still count against the rate limit. In practice this edge case is unlikely, and counting attempted calls is arguably more correct than only counting successful ones.

No file locking in increment_call_counter(). The read-modify-write isn't atomic at the OS level, so two simultaneous Ralph instances could read the same value and lose an increment. This was equally true before this PR, and Ralph is single-instance by design, so it is not a regression.
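If multi-instance safety ever became a requirement, one possible approach is to serialize the read-modify-write with an advisory lock. The sketch below assumes the `flock` utility from util-linux is available (it is not installed by default on macOS); `increment_call_counter_locked` and `LOCK_FILE` are hypothetical names, not part of Ralph.

```shell
#!/usr/bin/env bash
# Sketch: serialize the counter update with flock(1).
# Assumes util-linux flock; function and lock-file names are hypothetical.

CALL_COUNT_FILE=$(mktemp)
LOCK_FILE="${CALL_COUNT_FILE}.lock"
echo 0 > "$CALL_COUNT_FILE"

increment_call_counter_locked() {
    (
        flock -x 9   # wait for an exclusive lock on file descriptor 9
        count=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")
        count=$((count + 1))
        echo "$count" > "$CALL_COUNT_FILE"
        echo "$count"
    ) 9>"$LOCK_FILE"   # lock is released when the subshell closes fd 9
}

# Two concurrent writers, 20 increments each.
for writer in 1 2; do
    (
        i=0
        while [ "$i" -lt 20 ]; do
            increment_call_counter_locked > /dev/null
            i=$((i + 1))
        done
    ) &
done
wait

locked_total=$(cat "$CALL_COUNT_FILE")
echo "total after concurrent increments: $locked_total"
rm -f "$CALL_COUNT_FILE" "$LOCK_FILE"
```

With the lock held around the whole update, every increment from both writers is reflected in the final total. Since Ralph is single-instance by design, this is a design option rather than a needed change.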

Verdict

Approved. The fix is correct, the diff is minimal, and the tests follow established patterns in the codebase. No blockers.

github-actions · 2026-02-25T22:42:28Z

Review posted successfully to PR #197.

Summary:

✅ APPROVED - Clean, well-tested fix
Excellent code quality (reuses existing function, minimal changes)
No bugs or security issues
Minor performance trade-off (extra disk I/O) is acceptable for the reliability gain
Strong test coverage with regression prevention
Documentation properly updated

The PR is ready to merge.


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ralph_loop.sh`:
- Line 1089: Split the declaration and assignment for calls_made to avoid
masking increment_call_counter's exit status: replace "local
calls_made=$(increment_call_counter)" with "local calls_made" followed by
"calls_made=$(increment_call_counter)" and immediately check the command's exit
code ($?): if increment_call_counter fails, log an error via
log_status/update_status (or exit) and set a safe default for calls_made (e.g.,
0) so downstream calls to log_status and update_status receive a valid numeric
value; reference the increment_call_counter function, the calls_made variable,
and the log_status/update_status calls when making the change.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 622acb5 and 2f0f69d.

📒 Files selected for processing (3)

CLAUDE.md
ralph_loop.sh
tests/unit/test_cli_modern.bats

coderabbitai · 2026-02-25T22:45:05Z

ralph_loop.sh

    local loop_count=$1
-    local calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")
-    calls_made=$((calls_made + 1))
+    local calls_made=$(increment_call_counter)


⚠️ Potential issue | 🟡 Minor

SC2155: Declare and assign calls_made separately to avoid masking the return value of increment_call_counter.

local calls_made=$(increment_call_counter) causes local to always return 0, silently masking any non-zero exit code from increment_call_counter (e.g., disk write failure on CALL_COUNT_FILE). If the subshell fails, calls_made becomes empty and the downstream log_status call at line 1099 prints "Call /100", and update_status writes an invalid JSON value for calls_made_this_hour.

🛠️ Proposed fix (SC2155)

-    local calls_made=$(increment_call_counter)
+    local calls_made
+    calls_made=$(increment_call_counter)
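The masking behaviour behind SC2155 is easy to observe directly. In this sketch `failing_counter`, `combined`, and `split` are hypothetical helpers; the failing function stands in for an increment_call_counter whose write to disk failed.

```shell
#!/usr/bin/env bash
# Sketch: `local var=$(cmd)` reports the status of `local`, not of cmd.
# failing_counter, combined and split are hypothetical helpers.

failing_counter() {
    return 3   # prints nothing and fails, e.g. a failed disk write
}

combined() {
    local status=0
    local value=$(failing_counter) || status=$?   # local succeeds, so status stays 0
    echo "$status"
}

split() {
    local value status=0
    value=$(failing_counter) || status=$?         # plain assignment exposes the failure
    echo "$status"
}

combined_status=$(combined)
split_status=$(split)
echo "combined: $combined_status, split: $split_status"
```

The combined form reports 0 and the split form reports 3, so only the split form gives the caller a chance to react to a failed increment.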

🧰 Tools

🪛 Shellcheck (0.11.0)

[warning] 1089-1089: Declare and assign separately to avoid masking return values.

(SC2155)

Separate `local calls_made` from the assignment to avoid masking the return value of increment_call_counter().

github-actions · 2026-02-25T22:50:38Z

Code Review

APPROVED ✅

Clean, targeted fix for Issue #196. The SC2155 warning has already been addressed in a follow-up commit (1d8fbc3).

What's Working Well

Root cause correctly identified: The conditional write inside if [ $exit_code -eq 0 ] prevented counter persistence on failed API calls
Minimal, scoped change: Only removes buggy code, reuses existing increment_call_counter() function
Good test coverage: Structural tests verify the fix (new pattern present, old patterns absent)
Documentation updated: CLAUDE.md test counts and issue references are accurate
SC2155 addressed: Declaration split from assignment in follow-up commit

Minor Observations

Test methodology is structural: Tests use grep to verify code patterns rather than behavioral tests. This is consistent with existing patterns in test_cli_modern.bats (e.g., ALLOWED_TOOLS, errexit tests) and provides clear regression markers.
No explicit error handling: If increment_call_counter() fails, calls_made would be empty. This is outside the scope of this issue and the function is simple enough that failures are unlikely.
Performance trade-off: Extra disk I/O on every iteration vs. only on success. This is trivial (100 writes/hour max) and justified by correctness.

Recommendation

This PR is ready to merge. The fix correctly addresses the reported issue, has appropriate regression tests, and follows repository conventions.

github-actions · 2026-02-25T22:50:44Z

Review posted to PR #197.

APPROVED - Clean fix for Issue #196. The SC2155 warning was correctly addressed in a follow-up commit, and the change properly fixes the stale counter bug with appropriate test coverage.


coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

ralph_loop.sh (1)
452-461: ⚠️ Potential issue | 🔴 Critical

((calls_made++)) exits non-zero when calls_made=0, crashing the script on the first call each hour.

((expr)) returns exit code 1 when the expression evaluates to 0. ((calls_made++)) is a post-increment: when calls_made=0 the expression value is 0, so bash sets $?=1. With set -e inherited by the subshell, the function exits before writing the counter or echoing the result.

Before this PR, local calls_made=$(increment_call_counter) masked the exit code (the SC2155 bug). Now that the SC2155 fix is applied and the assignment is a plain calls_made=$(...), the non-zero subshell exit propagates and set -e in the outer script terminates the loop on the very first API call after every hourly reset (when init_call_tracking writes 0 to CALL_COUNT_FILE).
🐛 Proposed fix
-    ((calls_made++))
+    calls_made=$((calls_made + 1))
Assigning from $((...)) arithmetic expansion returns 0 whatever value is computed, so it is safe under set -e.
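The exit-status difference between the two increment forms can be checked in a few lines. This sketch only demonstrates the arithmetic behaviour; the status variable names are illustrative. Whether a non-zero status inside `$(increment_call_counter)` actually terminates the script depends on errexit being active inside the command substitution, which bash clears by default unless POSIX mode or `shopt -s inherit_errexit` is in effect, so that part is worth verifying against the real script.

```shell
#!/usr/bin/env bash
# Sketch: exit status of ((x++)) versus x=$((x + 1)).
# The `|| var=$?` form captures the status without tripping set -e.

calls_made=0
post_increment_status=0
((calls_made++)) || post_increment_status=$?    # expression value is 0 -> status 1
value_after_post_increment=$calls_made           # the increment itself still happened

calls_made=0
expansion_status=0
calls_made=$((calls_made + 1)) || expansion_status=$?   # plain assignment -> status 0

calls_made=1
nonzero_status=0
((calls_made++)) || nonzero_status=$?           # expression value is 1 -> status 0

echo "((x++)) from 0: status $post_increment_status, x=$value_after_post_increment"
echo "x=\$((x + 1)) from 0: status $expansion_status"
echo "((x++)) from 1: status $nonzero_status"
```

Only the post-increment from 0 yields a non-zero status, which matches the "first call after a reset" condition described in the comment.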
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 452 - 461, The post-increment using the
arithmetic command ((calls_made++)) can return exit code 1 when the expression
value is 0 and breaks scripts running under set -e; update the increment in the
increment_call_counter function to use arithmetic expansion (e.g., set
calls_made to $((calls_made+1))) so the operation never produces a non-zero exit
status, then continue writing the new value to CALL_COUNT_FILE and echoing it as
before.

♻️ Duplicate comments (1)

ralph_loop.sh (1)
1089-1090: SC2155 fix correctly applied — matches the proposed fix from the previous review.

The split declaration/assignment resolves the shellcheck warning and now correctly propagates increment_call_counter's exit code to the caller.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 1089 - 1090, The SC2155 warning was addressed by
splitting declaration and assignment so the exit status from
increment_call_counter is preserved; keep the pattern used in ralph_loop.sh by
declaring the variable first (local calls_made) and then assigning via
calls_made=$(increment_call_counter) so that increment_call_counter's exit code
is propagated back to the caller and the shellcheck warning is avoided.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f0f69d and 1d8fbc3.

📒 Files selected for processing (1)

ralph_loop.sh

github-actions · 2026-02-25T22:52:27Z

Review Update — SC2155 Follow-up Commit

The prior reviews covered the core fix thoroughly and correctly approved it. This comment addresses the latest commit (1d8fbc3), which resolves CodeRabbit's actionable SC2155 finding.

SC2155 Fix — Correct and Complete

The latest commit splits the combined local-declaration-with-assignment pattern into two separate lines, which is the correct SC2155 fix. The local builtin always returns 0, so combining declaration with a command substitution assignment would silently swallow a non-zero exit status from increment_call_counter. Splitting them allows error propagation correctly.

Test Validity After the Split

The regression test greps for calls_made='$(increment_call_counter)' (without the local prefix). After the SC2155 split, the second line still matches that pattern, so both tests remain valid with no changes needed.
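The claim that the pattern survives the split can be checked with a fixed-string grep. This is a sketch of the idea only: the actual bats test is not shown in this thread, so the pattern string and the two sample snippets below are assumptions modelled on the diff lines quoted earlier.

```shell
#!/usr/bin/env bash
# Sketch: a fixed-string pattern without the `local` prefix matches both
# the combined and the split form. Pattern and samples are assumptions.

pattern='calls_made=$(increment_call_counter)'

before_split='    local calls_made=$(increment_call_counter)'
after_split=$'    local calls_made\n    calls_made=$(increment_call_counter)'

# -F treats the pattern literally, -c prints the number of matching lines.
before_matches=$(printf '%s\n' "$before_split" | grep -cF -- "$pattern")
after_matches=$(printf '%s\n' "$after_split" | grep -cF -- "$pattern")

echo "matches before split: $before_matches"
echo "matches after split: $after_matches"
```

Each form contains exactly one matching line, so a structural test keyed on this substring passes both before and after the SC2155 commit.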

Minor Observation (Non-blocking)

The same function has other SC2155 candidates that weren't touched in this PR — for example the local timestamp=... line using a command substitution. These are out of scope for this fix and worth tracking separately. The date command is highly unlikely to fail, so the practical risk is low.

Verdict

No issues with the latest commit. The SC2155 concern raised by CodeRabbit is resolved, the tests remain valid, and the core fix from the initial PR is unchanged. Ready to merge.

coderabbitai bot reviewed Feb 25, 2026


fix(loop): split local declaration from assignment per SC2155

1d8fbc3

Separate `local calls_made` from the assignment to avoid masking the return value of increment_call_counter().

coderabbitai bot reviewed Feb 25, 2026


frankbria merged commit 6ff27b4 into main Feb 25, 2026
7 checks passed

frankbria deleted the fix/issue-196-stale-call-counter branch February 25, 2026 23:01

This was referenced Feb 25, 2026

Monitor dashboard shows stale API call counter (always 0/100) #196

Closed

Bug: set -e kills script on non-zero return from execute_claude_code — loop never retries #200

Open

frankbria merged 2 commits into main from fix/issue-196-stale-call-counter
