
fix(loop): persist API call counter immediately, not only on success#197

Merged
frankbria merged 2 commits into main from fix/issue-196-stale-call-counter
Feb 25, 2026
Conversation


@frankbria frankbria commented Feb 25, 2026

Summary

Fixes #196: Monitor dashboard shows stale API call counter (always 0/100)

  • Replace manual read+increment in execute_claude_code() with the existing increment_call_counter() function which writes to disk immediately
  • Remove the conditional write that only persisted on success (exit_code == 0)
  • Add 2 regression tests verifying the fix

Root Cause

execute_claude_code() read .call_count into a local variable and only wrote it back on success. Failed API calls (timeout, error, API limit) never persisted the increment, so the monitor always showed the stale pre-execution value.
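
The failure mode described above can be reproduced in a few lines. This is a minimal sketch, not the code from ralph_loop.sh: fake_api_call, run_buggy, and run_fixed are hypothetical names, and a temporary file stands in for .call_count.

```shell
#!/usr/bin/env bash
# Hypothetical reproduction of the stale-counter bug; not the real ralph_loop.sh code.

CALL_COUNT_FILE=$(mktemp)          # stand-in for the .call_count file
echo 0 > "$CALL_COUNT_FILE"

# Stand-in for a Claude Code invocation that always fails (e.g. a timeout).
fake_api_call() { return 1; }

# Buggy pattern: the increment lives in a local variable and is written only on success.
run_buggy() {
    local calls_made
    calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")
    calls_made=$((calls_made + 1))
    if fake_api_call; then
        echo "$calls_made" > "$CALL_COUNT_FILE"
    fi
}

# Fixed pattern: persist the increment before the call, whatever its outcome.
run_fixed() {
    local calls_made
    calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")
    calls_made=$((calls_made + 1))
    echo "$calls_made" > "$CALL_COUNT_FILE"
    fake_api_call || true
}

run_buggy
after_buggy=$(cat "$CALL_COUNT_FILE")   # the failed call was never counted

run_fixed
after_fixed=$(cat "$CALL_COUNT_FILE")   # the attempt is counted despite the failure

echo "buggy: $after_buggy, fixed: $after_fixed"
```

With the buggy pattern the file keeps its pre-execution value after a failed call, which is what the monitor then displays.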

Acceptance Criteria

  • Call counter persisted to disk immediately when incremented (before execution)
  • API calls counted regardless of success/failure
  • Monitor reflects the actual number of API calls made

Test Plan

  • 2 new regression tests in test_cli_modern.bats (568 total, all passing)
  • Full test suite passes with zero failures
  • Structural tests verify the old buggy pattern is removed and the fix is in place

Closes #196

Summary by CodeRabbit

  • Tests

    • Added a rate-limiting test suite (15 new tests) and expanded modernization tests (2 additional tests).
    • Increased total test coverage to 568 tests (from 566).
  • Bug Fixes

…196)

Replace manual read+increment in execute_claude_code() with the existing
increment_call_counter() function which writes to disk before execution.
Remove the conditional write that only persisted on exit code 0.

This fixes the monitor showing stale 0/100 when API calls fail (timeout,
error, API limit) since failed calls still consume quota.

coderabbitai bot commented Feb 25, 2026

Walkthrough

Centralized the API call counter increment by replacing inline read+increment in ralph_loop.sh with increment_call_counter(). Updated tests and manifests: added rate-limiting tests and increased total test counts to reflect new tests validating the counter-change.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Test Manifest & Counts: CLAUDE.md | Updated test counts: total increased from 566 to 568; added test_rate_limiting.bats entry. |
| Loop Counter Refactor: ralph_loop.sh | Replaced manual cat/increment and local persistence with a call to increment_call_counter(); removed the conditional success-path file write. |
| Tests for Counter Behavior: tests/unit/test_cli_modern.bats | Added tests asserting execute_claude_code calls increment_call_counter, that manual increment pattern is removed, and that conditional write-on-success is absent. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Poem

🐰 I hopped through loops and files with glee,

I nudged the counter to write promptly, see —
No more stale zeroes, the monitor sings,
Tests added, tiny carrot-praise it brings,
A rabbit's nod to cleaner counting things. 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(loop): persist API call counter immediately, not only on success' clearly and specifically describes the main change: replacing conditional persistence with immediate counter persistence. |
| Linked Issues check | ✅ Passed | The pull request fully addresses all coding requirements from issue #196: replaces manual counter increment with increment_call_counter() for immediate persistence, removes conditional success-path write, and adds regression tests to verify the fix. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to issue #196: ralph_loop.sh refactors counter persistence, test_cli_modern.bats adds regression tests, and CLAUDE.md updates test counts reflecting the new tests added. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@github-actions

Code Review

✅ Overall Assessment: APPROVED

This is a clean, well-tested fix for issue #196. The change is minimal, targeted, and correctly addresses the root cause.


Code Quality & Best Practices

Excellent:

  • ✅ Reuses existing function instead of adding new code
  • ✅ Follows DRY principle - the increment logic is already tested and validated
  • ✅ Minimal change surface - only removes buggy code, doesn't add complexity
  • ✅ Clear commit title follows conventional commit format

Potential Bugs or Issues

None identified. The fix is straightforward:

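Before (buggy): execute_claude_code() read the counter with local calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0"), incremented it with calls_made=$((calls_made + 1)), and wrote it back only when exit_code was 0.

After (fixed): local calls_made=$(increment_call_counter), with the conditional write removed.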

The increment_call_counter() function (ralph_loop.sh:452-461):

  • Reads count from disk
  • Increments it
  • Writes back to disk immediately (to CALL_COUNT_FILE)
  • Returns the new count

This ensures the counter is persisted regardless of Claude Code execution success/failure.
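
The four steps listed above can be sketched as follows. This is an illustration under assumptions, not the function body at ralph_loop.sh:452-461: only the name increment_call_counter and the CALL_COUNT_FILE variable come from this PR, and a temporary file stands in for the real counter file.

```shell
#!/usr/bin/env bash
# Sketch of a read-increment-write counter; the real function body may differ.

CALL_COUNT_FILE=$(mktemp)   # stand-in for the .call_count file

increment_call_counter() {
    local calls_made=0
    # Read the current count from disk, defaulting to 0 if the file is missing or empty.
    if [ -s "$CALL_COUNT_FILE" ]; then
        calls_made=$(cat "$CALL_COUNT_FILE")
    fi
    calls_made=$((calls_made + 1))
    # Persist immediately so a monitor reading the file sees the new value.
    echo "$calls_made" > "$CALL_COUNT_FILE"
    # Hand the new count back on stdout for the caller to capture.
    echo "$calls_made"
}

first=$(increment_call_counter)
second=$(increment_call_counter)
on_disk=$(cat "$CALL_COUNT_FILE")
echo "first=$first second=$second on_disk=$on_disk"
```

Because the write happens inside the function, the value on disk is already current by the time the caller launches anything else.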


Performance Considerations

⚠️ Minor trade-off: Disk I/O on every API call

The fix writes to disk on every loop iteration instead of only on success. However:

  • The file write is a simple integer (tiny)
  • Rate limiting means max 100 writes/hour by default
  • The benefit (accurate monitoring) outweighs this trivial cost

Recommendation: Keep current implementation - the reliability gain is worth the minimal I/O cost.


Security Concerns

None. No security implications.


Test Coverage

Excellent structural tests:

  1. ✅ First regression test

    • Verifies the new pattern is present (a call to increment_call_counter)
    • Verifies the old buggy pattern is removed (the manual read+increment)
  2. ✅ Second regression test

    • Verifies the conditional write comment is gone
    • Confirms counter is now persisted before execution

Both tests validate the fix at the structural level, which is appropriate for this bug since:

  • The behavior is simple (disk write vs. no disk write)
  • The function (increment_call_counter()) is already tested elsewhere
  • Regression detection is the primary goal
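
A structural check of this kind might look like the sketch below. It is plain bash rather than the Bats tests added in this PR, and the temporary file is a hypothetical stand-in for ralph_loop.sh.

```shell
#!/usr/bin/env bash
# Hypothetical structural check in plain bash; the real tests are Bats tests.

script=$(mktemp)   # stand-in for ralph_loop.sh
cat > "$script" <<'EOF'
execute_claude_code() {
    local loop_count=$1
    local calls_made
    calls_made=$(increment_call_counter)
}
EOF

# New pattern must be present (-F treats "$(" as literal text, not regex syntax).
if grep -qF 'calls_made=$(increment_call_counter)' "$script"; then
    new_pattern=present
else
    new_pattern=missing
fi

# Old manual increment must be gone.
if grep -qF 'calls_made=$((calls_made + 1))' "$script"; then
    old_pattern=present
else
    old_pattern=absent
fi

echo "new pattern: $new_pattern, old pattern: $old_pattern"
```

Checks like this catch a reintroduced pattern cheaply, but they say nothing about runtime behavior, which is the trade-off noted above.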

Documentation

✅ Updated CLAUDE.md: test count raised from 566 to 568.


Summary

Strengths:

  • Minimal, targeted fix
  • Reuses well-tested existing code
  • Comprehensive test coverage for regression prevention
  • Clear documentation updates
  • Addresses the exact issue reported

No blockers. This PR is ready to merge.

@github-actions

Code Review

Clean, targeted fix for Issue 196. The root cause analysis in the PR description is accurate — the previous code's conditional write inside if [ $exit_code -eq 0 ] was the direct cause of stale monitor readings.

What's Good

Core fix is correct. Delegating to increment_call_counter() (lines 452-461) is the right approach: that function already does the read-increment-write in one place and echoes the new value, so the calls_made local variable remains available for the log message at line 1099. The removal of the conditional write is the key correctness improvement.

Minimal diff. The change is tightly scoped: 3 lines removed from execute_claude_code(), no changes elsewhere in the main logic path. This is exactly the right amount of change for a bug fix.

CLAUDE.md updated. Test count updated from 566 → 568 and the test_cli_modern.bats description correctly references the stale call counter fix.

Minor Observations

Test style is structural, not behavioral. Both new tests use grep to verify code structure (old pattern absent, new pattern present). This is consistent with how other regression tests in test_cli_modern.bats are written (e.g., the ALLOWED_TOOLS and errexit pipeline guard tests), so this is fine for consistency. A future improvement might be a behavioral test that simulates a failed execution and verifies the counter file is updated, but that's not required here.

Counter increments before execution begins. This is a minor semantic shift: the counter now increments before the Claude subprocess is launched, meaning a setup failure (e.g., invalid output_file path) would still count against the rate limit. In practice this edge case is unlikely, and counting attempted calls is arguably more correct than only counting successful ones.

No file locking in increment_call_counter(). The read-modify-write isn't atomic at the OS level, so two simultaneous Ralph instances could read the same value and lose an increment. This was equally true before this PR, and Ralph is single-instance by design, so it's not a regression.
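
If multi-instance use ever became a goal, the read-modify-write could be serialised with a lock. The sketch below assumes the flock(1) utility from util-linux is installed; increment_call_counter_locked and the .lock file name are hypothetical and not part of Ralph.

```shell
#!/usr/bin/env bash
# Sketch only: serialises the read-modify-write with flock(1) from util-linux.
# Ralph's increment_call_counter() does not do this; the lock file name is hypothetical.

CALL_COUNT_FILE=$(mktemp)
LOCK_FILE="${CALL_COUNT_FILE}.lock"
echo 0 > "$CALL_COUNT_FILE"

increment_call_counter_locked() {
    (
        # Block until this subshell holds an exclusive lock on file descriptor 9.
        flock -x 9
        count=$(cat "$CALL_COUNT_FILE")
        count=$((count + 1))
        echo "$count" > "$CALL_COUNT_FILE"
        echo "$count"
    ) 9>"$LOCK_FILE"   # the lock is released when the subshell exits and fd 9 closes
}

# Launch 20 concurrent increments, then wait for all of them.
for i in $(seq 1 20); do
    increment_call_counter_locked >/dev/null &
done
wait

final_count=$(cat "$CALL_COUNT_FILE")
echo "final count: $final_count"
```

Without the lock, concurrent writers can read the same starting value and the final count may come out lower than the number of calls.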

Verdict

Approved. The fix is correct, the diff is minimal, and the tests follow established patterns in the codebase. No blockers.

@github-actions

Review posted successfully to PR #197.

Summary:

  • APPROVED - Clean, well-tested fix
  • Excellent code quality (reuses existing function, minimal changes)
  • No bugs or security issues
  • Minor performance trade-off (extra disk I/O) is acceptable for the reliability gain
  • Strong test coverage with regression prevention
  • Documentation properly updated

The PR is ready to merge.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ralph_loop.sh`:
- Line 1089: Split the declaration and assignment for calls_made to avoid
masking increment_call_counter's exit status: replace "local
calls_made=$(increment_call_counter)" with "local calls_made" followed by
"calls_made=$(increment_call_counter)" and immediately check the command's exit
code ($?): if increment_call_counter fails, log an error via
log_status/update_status (or exit) and set a safe default for calls_made (e.g.,
0) so downstream calls to log_status and update_status receive a valid numeric
value; reference the increment_call_counter function, the calls_made variable,
and the log_status/update_status calls when making the change.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 622acb5 and 2f0f69d.

📒 Files selected for processing (3)
  • CLAUDE.md
  • ralph_loop.sh
  • tests/unit/test_cli_modern.bats

ralph_loop.sh Outdated
     local loop_count=$1
-    local calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")
-    calls_made=$((calls_made + 1))
+    local calls_made=$(increment_call_counter)

⚠️ Potential issue | 🟡 Minor

SC2155: Declare and assign calls_made separately to avoid masking the return value of increment_call_counter.

local calls_made=$(increment_call_counter) causes local to always return 0, silently masking any non-zero exit code from increment_call_counter (e.g., disk write failure on CALL_COUNT_FILE). If the subshell fails, calls_made becomes empty and the downstream log_status call at line 1099 prints "Call /100", and update_status writes an invalid JSON value for calls_made_this_hour.

🛠️ Proposed fix (SC2155)
-    local calls_made=$(increment_call_counter)
+    local calls_made
+    calls_made=$(increment_call_counter)
🧰 Tools
🪛 Shellcheck (0.11.0)

[warning] 1089-1089: Declare and assign separately to avoid masking return values.

(SC2155)
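
The masking behaviour behind SC2155 is easy to observe in isolation. In this sketch failing_counter, combined, and separate are hypothetical names; failing_counter stands in for an increment_call_counter call that fails.

```shell
#!/usr/bin/env bash
# Demonstrates why ShellCheck SC2155 asks for separate declaration and assignment.

failing_counter() { return 3; }   # hypothetical stand-in for a failed increment

combined() {
    local value=$(failing_counter)
    echo $?        # status of the `local` builtin, which succeeded
}

separate() {
    local value
    value=$(failing_counter)
    echo $?        # status of the command substitution itself
}

combined_status=$(combined)
separate_status=$(separate)
echo "combined: $combined_status, separate: $separate_status"
```

Only the split form lets the caller see the failure and react to it, for example by logging an error or substituting a safe default.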


Separate `local calls_made` from the assignment to avoid masking
the return value of increment_call_counter().
@github-actions

Code Review

APPROVED

Clean, targeted fix for Issue #196. The SC2155 warning has already been addressed in a follow-up commit (1d8fbc3).

What's Working Well

  • Root cause correctly identified: The conditional write inside if [ $exit_code -eq 0 ] prevented counter persistence on failed API calls
  • Minimal, scoped change: Only removes buggy code, reuses existing increment_call_counter() function
  • Good test coverage: Structural tests verify the fix (new pattern present, old patterns absent)
  • Documentation updated: CLAUDE.md test counts and issue references are accurate
  • SC2155 addressed: Declaration split from assignment in follow-up commit

Minor Observations

  • Test methodology is structural: Tests use grep to verify code patterns rather than behavioral tests. This is consistent with existing patterns in test_cli_modern.bats (e.g., ALLOWED_TOOLS, errexit tests) and provides clear regression markers.
  • No explicit error handling: If increment_call_counter() fails, calls_made would be empty. This is outside the scope of this issue and the function is simple enough that failures are unlikely.
  • Performance trade-off: Extra disk I/O on every iteration vs. only on success. This is trivial (100 writes/hour max) and justified by correctness.

Recommendation

This PR is ready to merge. The fix correctly addresses the reported issue, has appropriate regression tests, and follows repository conventions.

@github-actions

Review posted to PR #197.

APPROVED - Clean fix for Issue #196. The SC2155 warning was correctly addressed in a follow-up commit, and the change properly fixes the stale counter bug with appropriate test coverage.


@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ralph_loop.sh (1)

452-461: ⚠️ Potential issue | 🔴 Critical

((calls_made++)) exits non-zero when calls_made=0, crashing the script on the first call each hour.

((expr)) returns exit code 1 when the expression evaluates to 0. ((calls_made++)) is a post-increment: when calls_made=0 the expression value is 0, so bash sets $?=1. With set -e inherited by the subshell, the function exits before writing the counter or echoing the result.

Before the SC2155 fix, local calls_made=$(increment_call_counter) masked the exit code (the SC2155 bug). Now that the SC2155 fix is applied and the assignment is a plain calls_made=$(...), the non-zero subshell exit propagates and set -e in the outer script terminates the loop on the very first API call after every hourly reset (when init_call_tracking writes 0 to CALL_COUNT_FILE).

🐛 Proposed fix
-    ((calls_made++))
+    calls_made=$((calls_made + 1))

$((...)) arithmetic expansion never exits non-zero, so it is safe under set -e.
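
The exit-status difference can be checked directly. The sketch below only captures the statuses. Whether a non-zero status actually ends the loop depends on how errexit is configured in the real script: by default Bash does not carry set -e into command substitutions unless inherit_errexit or POSIX mode is enabled, so the finding is worth verifying against ralph_loop.sh itself.

```shell
#!/usr/bin/env bash
# Exit status of the arithmetic command versus arithmetic expansion when the counter is 0.

calls_made=0
post_increment_status=0
((calls_made++)) || post_increment_status=$?   # expression value is the old value, 0

calls_made=0
expansion_status=0
calls_made=$((calls_made + 1)) || expansion_status=$?   # a plain assignment succeeds

calls_made=0
pre_increment_status=0
((++calls_made)) || pre_increment_status=$?    # expression value is the new value, 1

echo "post=$post_increment_status expansion=$expansion_status pre=$pre_increment_status"
```

The `|| status=$?` form is used so the sketch itself runs safely even with set -e enabled.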

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 452 - 461, The post-increment using the
arithmetic command ((calls_made++)) can return exit code 1 when the expression
value is 0 and breaks scripts running under set -e; update the increment in the
increment_call_counter function to use arithmetic expansion (e.g., set
calls_made to $((calls_made+1))) so the operation never produces a non-zero exit
status, then continue writing the new value to CALL_COUNT_FILE and echoing it as
before.
♻️ Duplicate comments (1)
ralph_loop.sh (1)

1089-1090: SC2155 fix correctly applied — matches the proposed fix from the previous review.

The split declaration/assignment resolves the shellcheck warning and now correctly propagates increment_call_counter's exit code to the caller.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 1089 - 1090, The SC2155 warning was addressed by
splitting declaration and assignment so the exit status from
increment_call_counter is preserved; keep the pattern used in ralph_loop.sh by
declaring the variable first (local calls_made) and then assigning via
calls_made=$(increment_call_counter) so that increment_call_counter's exit code
is propagated back to the caller and the shellcheck warning is avoided.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 2f0f69d and 1d8fbc3.

📒 Files selected for processing (1)
  • ralph_loop.sh

@github-actions

github-actions bot commented Feb 25, 2026

Review Update — SC2155 Follow-up Commit

The prior reviews covered the core fix thoroughly and correctly approved it. This comment addresses the latest commit (1d8fbc3), which resolves CodeRabbit's actionable SC2155 finding.


SC2155 Fix — Correct and Complete

The latest commit splits the combined local-declaration-with-assignment pattern into two separate lines, which is the correct SC2155 fix. The local builtin always returns 0, so combining declaration with a command substitution assignment would silently swallow a non-zero exit status from increment_call_counter. Splitting them allows error propagation correctly.


Test Validity After the Split

The regression test greps for calls_made=$(increment_call_counter) (without the local prefix). After the SC2155 split, the second line still matches that pattern, so both tests remain valid with no changes needed.
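
A quick way to convince oneself of this is to run the pattern against both forms. The sketch assumes a fixed-string match (grep -F); the exact grep invocation used in test_cli_modern.bats is not shown in this conversation, and the matches helper is hypothetical.

```shell
#!/usr/bin/env bash
# Checks which source lines a fixed-string grep for the assignment would match.

pattern='calls_made=$(increment_call_counter)'
combined_line='    local calls_made=$(increment_call_counter)'
split_line='    calls_made=$(increment_call_counter)'
declaration_line='    local calls_made'

# Hypothetical helper: prints yes if the line contains the pattern, otherwise no.
matches() {
    if printf '%s\n' "$1" | grep -qF -- "$pattern"; then echo yes; else echo no; fi
}

combined_match=$(matches "$combined_line")
split_match=$(matches "$split_line")
declaration_match=$(matches "$declaration_line")

echo "combined=$combined_match split=$split_match declaration=$declaration_match"
```

Because the pattern omits the local prefix, it matches the assignment line in either form and ignores the bare declaration.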


Minor Observation (Non-blocking)

The same function has other SC2155 candidates that weren't touched in this PR — for example the local timestamp=... line using a command substitution. These are out of scope for this fix and worth tracking separately. The date command is highly unlikely to fail, so the practical risk is low.


Verdict

No issues with the latest commit. The SC2155 concern raised by CodeRabbit is resolved, the tests remain valid, and the core fix from the initial PR is unchanged. Ready to merge.

@frankbria frankbria merged commit 6ff27b4 into main Feb 25, 2026
7 checks passed
@frankbria frankbria deleted the fix/issue-196-stale-call-counter branch February 25, 2026 23:01