Skip to content

Feat/docker sandbox#168

Open
Zelray wants to merge 3 commits intofrankbria:mainfrom
Zelray:feat/docker-sandbox
Open

Feat/docker sandbox#168
Zelray wants to merge 3 commits intofrankbria:mainfrom
Zelray:feat/docker-sandbox

Conversation

@Zelray
Copy link

@Zelray Zelray commented Feb 8, 2026

Implements #74 — Phase 6.1 Local Docker Sandbox Execution
Adds Docker containerization for Ralph, enabling cross-platform support (including Windows via Docker Desktop).
Files added:

Dockerfile — Ubuntu 22.04 + Node 20 + Claude Code CLI + Ralph with wrapper scripts
docker-compose.yml — Three services: interactive shell, autonomous loop, monitor
ralph-docker.ps1 — Windows PowerShell helper with 10 commands (setup, start, monitor, etc.)
lib/sandbox_docker.sh — Bash library for Docker container lifecycle management
tests/unit/test_sandbox_docker.bats — 24 BATS unit tests
docs/docker.md — Full documentation with architecture diagram
.dockerignore — Keeps image clean
.gitattributes — Forces LF line endings for shell scripts

Tested on Windows 11 + Docker Desktop:

Image builds successfully
ralph --help works inside container
CRLF line endings handled automatically via dos2unix in build
Auth persistence via ~/.claude volume mount (supports Claude Max/Pro interactive login)

Key design decisions:

Wrapper scripts instead of symlinks (fixes dirname $0 lib path resolution)
dos2unix in build step (handles Windows Git CRLF conversion transparently)
Non-root ralph user for security
Directories created as root before USER switch

Summary by CodeRabbit

New Features

  • Added Docker containerization support for running Ralph in isolated containers
  • Introduced Docker Compose configuration enabling multi-service orchestration (interactive shell, autonomous loop, monitoring dashboard)
  • Added Windows PowerShell helper script for Docker setup, image building, and container lifecycle management

Documentation

  • New Docker usage guide covering prerequisites, setup, authentication, and troubleshooting

Tests

  • Added unit tests for Docker sandbox functionality

Chores

  • Configured Git attributes and Docker ignore rules

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 8, 2026

Walkthrough

The PR introduces Docker support for Ralph, enabling containerized execution with sandboxed environments. It includes a Dockerfile for building a Ralph image, Docker Compose configuration for multi-service orchestration, Bash and PowerShell utility scripts for container lifecycle management, comprehensive documentation, and unit tests.

Changes

Cohort / File(s) Summary
Docker Infrastructure
.gitattributes, Dockerfile, docker-compose.yml, .dockerignore
Introduces Docker image build configuration with Debian base, Node.js 20, system dependencies, and three services (interactive shell, autonomous loop, monitoring dashboard) with volume mounts for project, config, and credentials.
Documentation
docs/docker.md
Comprehensive guide covering prerequisites, quick-start setup, authentication, Docker Compose usage, Windows PowerShell integration, resource/network configuration, troubleshooting, and best practices.
Unix/Linux Implementation
lib/sandbox_docker.sh
Bash utility providing Docker validation, image/container lifecycle management, volume mounting, credential handling, and orchestration for shell, loop, and monitor execution modes with comprehensive error handling.
Windows Implementation
ralph-docker.ps1
PowerShell script offering CLI commands (setup, build, shell, start, monitor, stop, status, logs, clean) with Docker availability checks, image building, container orchestration, and Windows-specific credential persistence.
Tests
tests/unit/test_sandbox_docker.bats
Unit test suite validating container naming, run-argument construction, Docker availability checks, and configuration defaults across various environment configurations.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant CLI as Bash/PowerShell<br/>CLI
    participant Docker as Docker<br/>Daemon
    participant Container as Ralph<br/>Container
    participant Volumes as Project &<br/>Config Volumes

    User->>CLI: invoke ralph-docker (setup/build/shell/start)
    CLI->>Docker: check Docker daemon status
    Docker-->>CLI: daemon ready
    CLI->>Docker: build/pull Ralph image
    Docker-->>CLI: image ready
    
    alt Interactive Shell
        CLI->>Docker: docker run --tty --interactive
        Docker->>Container: spawn shell process
        Container->>Volumes: mount project, config, credentials
        Volumes-->>Container: volumes ready
        Container-->>User: interactive bash session
    else Autonomous Loop
        CLI->>Docker: docker run --detach
        Docker->>Container: spawn ralph --live process
        Container->>Volumes: mount project, config, credentials
        Volumes-->>Container: volumes ready
        Container->>Container: continuous execution loop
    else Monitor Dashboard
        CLI->>Docker: docker run ralph-monitor
        Docker->>Container: spawn monitoring process
        Container->>Container: display metrics & status
        Container-->>User: dashboard output
    end

    User->>CLI: stop/status/logs commands
    CLI->>Docker: container management
    Docker-->>CLI: status/logs response
    CLI-->>User: formatted output
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Ralph now hops in Docker's den,

With volumes, shells, and loops again,

PowerShell scripts and Bash so fine,

Three services dance in perfect line! 🐳✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Feat/docker sandbox' is related to the main changeset, which implements Docker-based containerization for Ralph. However, it is overly broad and uses abbreviated/generic phrasing ('Feat/') typical of branch names rather than a clear, descriptive PR title that explains what the Docker sandbox feature does. Consider a more descriptive title like 'Add Docker containerization for cross-platform Ralph execution' to clearly communicate the feature's purpose to reviewers scanning PR history.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@macroscopeapp
Copy link
Contributor

macroscopeapp bot commented Feb 8, 2026

Add Docker sandbox to run Ralph via Ubuntu 22.04 image, compose services, and Windows PowerShell tooling

Introduce a Docker-based workflow with an Ubuntu 22.04 image installing Node.js 20 and @anthropic-ai/claude-code, compose services for shell/loop/monitor, shell helpers in lib for container lifecycle, a Windows PowerShell entrypoint, and unit tests for Docker args and behavior. Key files: Dockerfile, docker-compose.yml, lib/sandbox_docker.sh, ralph-docker.ps1, tests/unit/test_sandbox_docker.bats, docs/docker.md.

📍Where to Start

Start with the Docker control surface in lib/sandbox_docker.sh, focusing on sandbox_docker_build_run_args and the sandbox_docker_run_* functions, then review the image setup in Dockerfile and service wiring in docker-compose.yml.


📊 Macroscope summarized 6adfc0b. 8 files reviewed, 2 issues evaluated, 0 issues filtered, 2 comments posted. View details

# Working directory
args+=("-w" "$DOCKER_WORKSPACE")

echo "${args[@]}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 High

lib/sandbox_docker.sh:111 echo "${args[@]}" loses array boundaries, breaking paths with spaces when callers word-split the output. Consider using declare -p or printf '%s\0' with mapfile -d '', or refactor to set a global array directly instead of returning via stdout.

🚀 Want me to fix this? Reply ex: "fix it for me".

@@ -0,0 +1,15 @@
# Docker ignore for Ralph
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Critical

dockerignore:1 The file should be named .dockerignore (with leading dot) for Docker to recognize it. Without the dot, these exclusion rules won't apply and sensitive data like .git will be included in the build context.

🚀 Want me to fix this? Reply ex: "fix it for me".

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 14

🤖 Fix all issues with AI agents
In `@docker-compose.yml`:
- Line 34: The Docker Compose variable default uses a literal tilde; change any
occurrences of the pattern `${CLAUDE_CONFIG_DIR:-~/.claude}` (and the similar
occurrence at the second mount) to use the HOME environment variable instead,
e.g. `${CLAUDE_CONFIG_DIR:-${HOME}/.claude}`, so Docker Compose will expand to
the actual user home directory; update both instances found in the diff that
reference CLAUDE_CONFIG_DIR (and the matching second mount) accordingly.

In `@Dockerfile`:
- Around line 70-72: The Dockerfile currently appends "|| true" to the RUN bash
./install.sh invocation which masks failures from install.sh (which uses set -e)
and can leave $RALPH_HOME missing expected scripts used by the wrapper scripts;
remove the "|| true" so the build fails fast on any install.sh error, or
alternatively add a post-install validation step after running install.sh that
checks for the presence of the files created by install_scripts() (the scripts
copied into $RALPH_HOME that the wrapper scripts exec) and exit non-zero if any
are missing, ensuring the build fails when install.sh did not produce the
required artifacts.

In `@docs/docker.md`:
- Around line 264-267: The fenced code block showing the .gitattributes example
lacks a language identifier; update the opening fence from ``` to include a
language such as ```text or ```gitattributes so the block (containing "*.sh text
eol=lf" and "*.bats text eol=lf") is marked with a language identifier to
satisfy MD040.
- Around line 184-211: The fenced ASCII diagram in docs/docker.md is missing a
language identifier (causing MD040); open the diagram block (the large
triple-backtick ASCII art) and change the opening fence from ``` to ```text so
the diagram is marked as plain text (keep the closing ``` unchanged), which
satisfies markdownlint and preserves the diagram formatting.

In `@lib/sandbox_docker.sh`:
- Around line 183-196: The docker ps filter uses substring matching which can
stop unrelated containers; update sandbox_docker_stop_all to post-filter the
names list with an anchored grep so only exact container names or names with the
explicit delimiter are returned (e.g., pipe docker ps output into grep -E
"^${DOCKER_CONTAINER_PREFIX}($|-)") and apply the same anchored filtering
approach to the related functions (sandbox_docker_stop_all, sandbox_docker_stop,
status, cleanup_containers) so they only match intended containers by exact name
or by the chosen delimiter-prefixed convention.
- Around line 73-112: The function sandbox_docker_build_run_args currently
echoes the args array which causes word-splitting for values with spaces when
callers reconstruct via unquoted $(...), so change sandbox_docker_build_run_args
to emit null-delimited entries (use printf '%s\0' "${args[@]}") and update each
caller (sandbox_docker_run_loop, sandbox_docker_run_monitor,
sandbox_docker_run_shell) to read that output safely into an array (e.g., use
mapfile -d '' -t args < <(sandbox_docker_build_run_args "$project_dir"
"$container_name") or while IFS= read -r -d '' -n1 entry; do args+=("$entry");
done < <(sandbox_docker_build_run_args ...) ), or alternatively refactor to pass
the array via a nameref; ensure everywhere you invoke docker run you use the
reconstructed array unquoted as "${args[@]}" so spaces are preserved.

In `@ralph-docker.ps1`:
- Around line 84-93: The Test-DockerRunning function currently relies on
try/catch which doesn't catch native executable non-zero exits; run docker info
and then check $LASTEXITCODE (or use Get-Command to detect absence) to determine
success: after invoking docker info (avoid swallowing exit status with piping to
Out-Null), if $LASTEXITCODE -ne 0 write the error message and return $false,
otherwise return $true; alternatively first use Get-Command docker -ErrorAction
SilentlyContinue to handle "not on PATH" vs a stopped daemon and use
$LASTEXITCODE to detect the stopped daemon case.
- Around line 169-170: After running the docker build command in
ralph-docker.ps1, check $LASTEXITCODE immediately and avoid printing the success
message when the build failed; update the block around the docker build and the
Write-Host that prints "[OK] Image built: $ImageName" so that if $LASTEXITCODE
-ne 0 you log an error (or call throw/exit) and do not print the green success
line—this change affects the flow used by Invoke-Build (and transitively
Invoke-Setup, Invoke-Shell, Invoke-Start), so ensure the failure is propagated
(throw/exit) instead of continuing.
- Around line 144-155: After running the docker command that creates the setup
container, check the process exit status (e.g., $LASTEXITCODE) and only print
the success messages when it is zero; if non-zero, write an error (with the exit
code) and exit the script with that code. Apply the same pattern to the docker
invocations inside Invoke-Shell and Invoke-Start: after each docker run/command,
test $LASTEXITCODE, emit a clear failure message and call exit $LASTEXITCODE (or
return/throw appropriately) instead of always printing the "[OK] Setup
complete!" success text.
- Line 64: Line uses PowerShell 7+ operators (?. and ??) causing parse errors in
PS 5.1; replace the single-expression with a PS5-compatible sequence: call
Resolve-Path with -ErrorAction SilentlyContinue into a temp variable (e.g.
$resolved = Resolve-Path -Path $ProjectDir -ErrorAction SilentlyContinue), then
set $ProjectDir = $resolved.Path if $resolved is not $null, otherwise set
$ProjectDir = (Get-Location).Path; reference symbols: $ProjectDir, Resolve-Path,
-ErrorAction SilentlyContinue, $resolved, Get-Location.
- Around line 234-253: Invoke-Monitor does not verify the Docker image exists
before calling docker run, causing a cryptic failure; add the same pre-flight
image existence check used by Invoke-Shell and Invoke-Start (the function or
check used there to validate $ImageName exists) directly before the docker run
block in Invoke-Monitor, and if the image is missing emit the same warning/error
and exit with the same status (e.g., Write-Host/Write-Error and exit 1) so
behavior and messages match the other commands.
- Around line 269-271: The check uses $containers.Count which is inconsistent
across PS versions; wrap the variable in an array to normalize it and update the
condition to use @($containers).Count instead of $containers.Count (or ensure
the assignment to $containers uses @(...)); change the if condition to: if
($null -eq $containers -or @($containers).Count -le 1) { ... } so single-string
results behave like a 1-element array; update references to $containers.Count in
ralph-docker.ps1 accordingly (look for the $containers variable and the if block
containing that Count check).

In `@tests/unit/test_sandbox_docker.bats`:
- Around line 158-163: The test "sandbox_docker: default network mode is bridge"
is re-sourcing sandbox_docker.sh while DOCKER_NETWORK_MODE is already set by
setup(), so it doesn't exercise the default; fix it by unsetting
DOCKER_NETWORK_MODE before sourcing the library (e.g., call unset
DOCKER_NETWORK_MODE just prior to source
"${BATS_TEST_DIRNAME}/../../lib/sandbox_docker.sh") so the file's
${DOCKER_NETWORK_MODE:-bridge} fallback is evaluated and the assertion [[
"${DOCKER_NETWORK_MODE}" == "bridge" ]] actually verifies the default.
- Around line 66-142: Update the tests that call sandbox_docker_build_run_args:
stop capturing its output as a flattened string (e.g.
result=$(sandbox_docker_build_run_args ...)) and instead read the null-delimited
args into an array (e.g. use mapfile -d '' or equivalent to populate an array
from sandbox_docker_build_run_args's stdout). Then assert on specific array
elements (or exact elements existing) rather than substring matches so tests
like the ones checking "/my/project:/workspace", "--name test-container",
"ANTHROPIC_API_KEY=...", "--memory 4g", "--cpus 2", "--network ...", "-w
/workspace", "-it", and "--rm" target exact entries and won't break when
sandbox_docker_build_run_args switches to null-delimited output.
🧹 Nitpick comments (5)
Dockerfile (2)

36-39: NodeSource setup script is fetched and executed at build time without integrity verification.

This is standard Docker practice but worth noting: the setup_20.x script is fetched over HTTPS without checksum verification. For a security-conscious sandbox, consider using the official Node.js Docker image as a multi-stage build source instead, or at least document the trust assumption.


56-57: Pin the @anthropic-ai/claude-code version to ensure reproducible builds.

Unpinned npm packages allow docker build to pull different versions on each run, causing non-reproducible builds and potentially introducing breaking changes. Consider pinning to a specific tested version (latest stable is 1.0.100).

Proposed fix
-RUN npm install -g `@anthropic-ai/claude-code`
+RUN npm install -g `@anthropic-ai/claude-code`@1.0.100

Update the version number to your tested version.

docker-compose.yml (1)

45-63: ralph-loop service with restart: unless-stopped and --rm containers may conflict.

The Dockerfile's ENTRYPOINT is /bin/bash, and this service overrides it with ralph. The ralph wrapper runs exec ralph_loop.sh. If the loop exits (crash, completed work), restart: unless-stopped will restart it. However, the sandbox_docker_build_run_args function in lib/sandbox_docker.sh uses --rm — be aware that the Compose file and the shell library have different lifecycle semantics (Compose restarts vs. CLI removes on exit). This is fine since they're different execution paths, but worth documenting.

lib/sandbox_docker.sh (1)

37-37: Redundant 2>&1&> already redirects both stdout and stderr.

&>/dev/null 2>&1 is equivalent to &>/dev/null. The same applies to line 49.

Proposed fix
-    if ! docker info &>/dev/null 2>&1; then
+    if ! docker info &>/dev/null; then
tests/unit/test_sandbox_docker.bats (1)

148-152: The Docker-not-found mock could be more targeted.

PATH="/usr/bin/nonexistent" removes all commands from PATH, not just docker. This works here because sandbox_docker_check only uses command -v docker, but it's fragile if the function ever gains additional command calls. Consider using a wrapper function or a more selective PATH filter.

As per coding guidelines, including a brief comment explaining the mock strategy (why the entire PATH is replaced) would help future maintainers.

# Mount your project into the container
- ${RALPH_PROJECT_DIR:-.}:/workspace
# Persist Claude Code authentication across container restarts
- ${CLAUDE_CONFIG_DIR:-~/.claude}:/home/ralph/.claude
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

Docker Compose tilde expansion in volume default value variable substitution

💡 Result:

Docker Compose’s variable interpolation supports Bash-like default value syntax (for example, ${VAR:-default}), but it does not perform shell expansions like ~ (home-dir) expansion—so ~/... inside either the volume string or the default part of ${...:-...} stays a literal ~ and can fail with “mount path must be absolute”. [1]

Use $HOME/${HOME} (or an absolute path) instead of ~, including in defaults:

services:
  app:
    volumes:
      - type: bind
        source: ${DATA_DIR:-${HOME}/myapp/data}
        target: /data

Interpolation/default-value forms (:-, -, :?, ?, etc.) are documented here. [1]

If you want a reusable default, another common approach is to set DATA_DIR in a .env file (with an absolute path) and reference ${DATA_DIR} in the Compose file. [1]

Sources

  1. Docker Docs — Compose file interpolation (supported ${VAR:-default} syntax; describes what Compose interpolates and what it doesn’t). [1]

🏁 Script executed:

cat -n docker-compose.yml | sed -n '50,65p'

Repository: frankbria/ralph-claude-code

Length of output: 684


🏁 Script executed:

cat -n docker-compose.yml | sed -n '30,40p'

Repository: frankbria/ralph-claude-code

Length of output: 574


Replace tilde expansion with ${HOME} in default values for Docker Compose variable substitution.

Docker Compose does not expand ~ in variable defaults—${CLAUDE_CONFIG_DIR:-~/.claude} will create a literal ~/.claude path instead of the home directory, causing mount failures. This affects lines 34 and 57.

Proposed fix
-      - ${CLAUDE_CONFIG_DIR:-~/.claude}:/home/ralph/.claude
+      - ${CLAUDE_CONFIG_DIR:-${HOME}/.claude}:/home/ralph/.claude
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- ${CLAUDE_CONFIG_DIR:-~/.claude}:/home/ralph/.claude
- ${CLAUDE_CONFIG_DIR:-${HOME}/.claude}:/home/ralph/.claude
🤖 Prompt for AI Agents
In `@docker-compose.yml` at line 34, The Docker Compose variable default uses a
literal tilde; change any occurrences of the pattern
`${CLAUDE_CONFIG_DIR:-~/.claude}` (and the similar occurrence at the second
mount) to use the HOME environment variable instead, e.g.
`${CLAUDE_CONFIG_DIR:-${HOME}/.claude}`, so Docker Compose will expand to the
actual user home directory; update both instances found in the diff that
reference CLAUDE_CONFIG_DIR (and the matching second mount) accordingly.

Comment on lines +70 to +72
RUN chmod +x /opt/ralph/install.sh \
&& cd /opt/ralph \
&& bash ./install.sh || true \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Find install.sh in the repository
fd -t f 'install.sh' -o fd -t f 'Dockerfile'

# Check directory structure
git ls-files | head -20

Repository: frankbria/ralph-claude-code

Length of output: 668


🏁 Script executed:

#!/bin/bash

# Look for install.sh and Dockerfile
find . -name 'install.sh' -o -name 'Dockerfile' | head -10

Repository: frankbria/ralph-claude-code

Length of output: 96


🏁 Script executed:

#!/bin/bash

# List repository structure to understand layout
ls -la

Repository: frankbria/ralph-claude-code

Length of output: 2640


🏁 Script executed:

#!/bin/bash

# Read install.sh to understand what it does and where it installs scripts
wc -l install.sh && head -100 install.sh

Repository: frankbria/ralph-claude-code

Length of output: 2939


🏁 Script executed:

#!/bin/bash

# Get more of install.sh - focus on installation targets and return logic
tail -50 install.sh

Repository: frankbria/ralph-claude-code

Length of output: 2017


🏁 Script executed:

#!/bin/bash

# Read Dockerfile to see the full context around lines 70-72
cat -n Dockerfile | sed -n '60,90p'

Repository: frankbria/ralph-claude-code

Length of output: 1511


🏁 Script executed:

#!/bin/bash

# Find the main() function in install.sh to understand what it actually does
grep -n "^main()" install.sh
sed -n '100,250p' install.sh

Repository: frankbria/ralph-claude-code

Length of output: 4360


🏁 Script executed:

#!/bin/bash

# Look for the main function implementation
grep -n "main\|create_wrapper\|copy_scripts" install.sh | head -20

Repository: frankbria/ralph-claude-code

Length of output: 140


🏁 Script executed:

#!/bin/bash

# Look at the complete main() function and what calls it
sed -n '306,345p' install.sh

Repository: frankbria/ralph-claude-code

Length of output: 1464


🏁 Script executed:

#!/bin/bash

# Find all function calls to install_scripts, install_ralph_loop, install_setup
grep -n "install_scripts\|install_ralph_loop\|install_setup" install.sh

Repository: frankbria/ralph-claude-code

Length of output: 216


🏁 Script executed:

#!/bin/bash

# Check what install.sh does with set -e and what would happen if dependencies are missing
grep -n "^set -e" install.sh

# Verify that install.sh is supposed to copy scripts to RALPH_HOME
sed -n '108,230p' install.sh | grep -E "cp|RALPH_HOME|SCRIPT_DIR" | head -20

Repository: frankbria/ralph-claude-code

Length of output: 895


🏁 Script executed:

#!/bin/bash

# Check what dependencies are actually installed in Dockerfile
sed -n '20,55p' Dockerfile

Repository: frankbria/ralph-claude-code

Length of output: 1052


|| true silences install.sh errors, risking broken wrapper scripts at runtime.

The install.sh script uses set -e (line 4), meaning any failure causes immediate exit. However, the || true on line 72 swallows that exit code, allowing the build to succeed even if the installation failed. The wrapper scripts created on lines 75–87 depend on scripts being copied to $RALPH_HOME by install_scripts(). If that fails silently, the wrapper scripts will attempt to exec non-existent files.

Although the Dockerfile installs all dependencies that check_dependencies validates (Node.js, git, jq, coreutils), the pattern is dangerous—if any file operation fails (permission errors, full disk, etc.), the build will proceed undetected. Add validation that expected scripts exist in $RALPH_HOME after the install, or remove || true so the build fails fast on error.

Suggested fix
-    && bash ./install.sh || true \
+    && bash ./install.sh \
+    && test -f /home/ralph/.ralph/ralph_loop.sh || { echo "ERROR: install.sh did not create expected scripts"; exit 1; } \
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
RUN chmod +x /opt/ralph/install.sh \
&& cd /opt/ralph \
&& bash ./install.sh || true \
RUN chmod +x /opt/ralph/install.sh \
&& cd /opt/ralph \
&& bash ./install.sh \
&& test -f /home/ralph/.ralph/ralph_loop.sh || { echo "ERROR: install.sh did not create expected scripts"; exit 1; } \
🤖 Prompt for AI Agents
In `@Dockerfile` around lines 70 - 72, The Dockerfile currently appends "|| true"
to the RUN bash ./install.sh invocation which masks failures from install.sh
(which uses set -e) and can leave $RALPH_HOME missing expected scripts used by
the wrapper scripts; remove the "|| true" so the build fails fast on any
install.sh error, or alternatively add a post-install validation step after
running install.sh that checks for the presence of the files created by
install_scripts() (the scripts copied into $RALPH_HOME that the wrapper scripts
exec) and exit non-zero if any are missing, ensuring the build fails when
install.sh did not produce the required artifacts.

Comment on lines +184 to +211
```
┌─────────────────────────────────┐
│ Host Machine │
│ (Windows / macOS / Linux) │
│ │
│ ~/.claude/ ←──── Auth creds │
│ ~/my-project/ ←── Your code │
│ │
│ ┌───────────────────────────┐ │
│ │ Docker Container │ │
│ │ (Ubuntu 22.04) │ │
│ │ │ │
│ │ /home/ralph/.claude ──┐ │ │
│ │ /workspace ──────────┐│ │ │
│ │ ││ │ │
│ │ ralph ←── loop ││ │ │
│ │ ralph-monitor ←─ ui ││ │ │
│ │ claude ←── CLI ││ │ │
│ │ ││ │ │
│ │ Node.js 20 + tmux ││ │ │
│ │ + jq + git + bash 5 ││ │ │
│ └────────────────────────┘│ │ │
│ ▲ bind mounts │ │ │
│ └────────────────┘ │ │
└──────────────────────────────┘ │
Claude API ◄──────────────┘
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add a language identifier to the fenced code block.

The architecture diagram code block is missing a language identifier, flagged by markdownlint (MD040). Use ```text for ASCII art diagrams.

🧰 Tools
🪛 markdownlint-cli2 (0.20.0)

[warning] 184-184: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In `@docs/docker.md` around lines 184 - 211, The fenced ASCII diagram in
docs/docker.md is missing a language identifier (causing MD040); open the
diagram block (the large triple-backtick ASCII art) and change the opening fence
from ``` to ```text so the diagram is marked as plain text (keep the closing ```
unchanged), which satisfies markdownlint and preserves the diagram formatting.

Comment on lines +264 to +267
```
*.sh text eol=lf
*.bats text eol=lf
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add a language identifier to the fenced code block.

This .gitattributes example block is missing a language identifier (MD040). Use ```text or ```gitattributes.

🧰 Tools
🪛 markdownlint-cli2 (0.20.0)

[warning] 264-264: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In `@docs/docker.md` around lines 264 - 267, The fenced code block showing the
.gitattributes example lacks a language identifier; update the opening fence
from ``` to include a language such as ```text or ```gitattributes so the block
(containing "*.sh text eol=lf" and "*.bats text eol=lf") is marked with a
language identifier to satisfy MD040.

Comment on lines +73 to +112
sandbox_docker_build_run_args() {
local project_dir="$1"
local container_name="${2:-$(sandbox_docker_container_name)}"
local args=()

args+=("--name" "$container_name")
args+=("-it")
args+=("--rm")

# Volume mounts
args+=("-v" "${project_dir}:${DOCKER_WORKSPACE}")

# Mount Claude auth directory if it exists
local claude_dir="${CLAUDE_CONFIG_DIR:-$HOME/.claude}"
if [[ -d "$claude_dir" ]]; then
args+=("-v" "${claude_dir}:/home/ralph/.claude")
fi

# Environment variables
args+=("-e" "TERM=${TERM:-xterm-256color}")
if [[ -n "${ANTHROPIC_API_KEY:-}" ]]; then
args+=("-e" "ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}")
fi

# Resource limits
if [[ -n "$DOCKER_MEMORY_LIMIT" ]]; then
args+=("--memory" "$DOCKER_MEMORY_LIMIT")
fi
if [[ -n "$DOCKER_CPU_LIMIT" ]]; then
args+=("--cpus" "$DOCKER_CPU_LIMIT")
fi

# Network mode
args+=("--network" "$DOCKER_NETWORK_MODE")

# Working directory
args+=("-w" "$DOCKER_WORKSPACE")

echo "${args[@]}"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Word-splitting bug: echo "${args[@]}" + unquoted command substitution corrupts arguments containing spaces.

sandbox_docker_build_run_args flattens the array via echo (line 111), and callers reconstitute it via unquoted $(...) (lines 124, 142, 159). Any path or value with spaces (e.g., /my project) will be split into separate tokens, breaking the docker run invocation. This is what ShellCheck SC2207 flags.

Use printf with a delimiter and mapfile on the caller side, or pass the array via a nameref.

Proposed fix using printf/mapfile

In sandbox_docker_build_run_args, replace the echo with null-delimited output:

-    echo "${args[@]}"
+    printf '%s\0' "${args[@]}"
 }

In each caller (sandbox_docker_run_loop, sandbox_docker_run_monitor, sandbox_docker_run_shell):

-    run_args=($(sandbox_docker_build_run_args "$project_dir" "$container_name"))
+    local run_args=()
+    mapfile -d '' run_args < <(sandbox_docker_build_run_args "$project_dir" "$container_name")
🤖 Prompt for AI Agents
In `@lib/sandbox_docker.sh` around lines 73 - 112, The function
sandbox_docker_build_run_args currently echoes the args array which causes
word-splitting for values with spaces when callers reconstruct via unquoted
$(...), so change sandbox_docker_build_run_args to emit null-delimited entries
(use printf '%s\0' "${args[@]}") and update each caller
(sandbox_docker_run_loop, sandbox_docker_run_monitor, sandbox_docker_run_shell)
to read that output safely into an array (e.g., use mapfile -d '' -t args <
<(sandbox_docker_build_run_args "$project_dir" "$container_name") or while IFS=
read -r -d '' -n1 entry; do args+=("$entry"); done <
<(sandbox_docker_build_run_args ...) ), or alternatively refactor to pass the
array via a nameref; ensure everywhere you invoke docker run you use the
reconstructed array unquoted as "${args[@]}" so spaces are preserved.

Comment on lines +169 to +170
docker build -t $ImageName $ralphDir
Write-Host " [OK] Image built: $ImageName" -ForegroundColor Green
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Same missing $LASTEXITCODE check after docker build.

If the build fails, the script still prints "[OK] Image built". This is especially problematic since Invoke-Build is called transitively from Invoke-Setup, Invoke-Shell, and Invoke-Start.

Proposed fix
     docker build -t $ImageName $ralphDir
+    if ($LASTEXITCODE -ne 0) {
+        Write-Host "  [FAIL] Docker build failed." -ForegroundColor Red
+        exit 1
+    }
     Write-Host "  [OK] Image built: $ImageName" -ForegroundColor Green
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
docker build -t $ImageName $ralphDir
Write-Host " [OK] Image built: $ImageName" -ForegroundColor Green
docker build -t $ImageName $ralphDir
if ($LASTEXITCODE -ne 0) {
Write-Host " [FAIL] Docker build failed." -ForegroundColor Red
exit 1
}
Write-Host " [OK] Image built: $ImageName" -ForegroundColor Green
🤖 Prompt for AI Agents
In `@ralph-docker.ps1` around lines 169 - 170, After running the docker build
command in ralph-docker.ps1, check $LASTEXITCODE immediately and avoid printing
the success message when the build failed; update the block around the docker
build and the Write-Host that prints "[OK] Image built: $ImageName" so that if
$LASTEXITCODE -ne 0 you log an error (or call throw/exit) and do not print the
green success line—this change affects the flow used by Invoke-Build (and
transitively Invoke-Setup, Invoke-Shell, Invoke-Start), so ensure the failure is
propagated (throw/exit) instead of continuing.

Comment on lines +234 to +253
function Invoke-Monitor {
Write-RalphHeader
if (-not (Test-DockerRunning)) { exit 1 }

# Check if ralph-loop is running
$running = docker ps --filter "name=${ContainerPrefix}-loop" --format "{{.Names}}" 2>$null
if ($null -eq $running -or $running -eq "") {
Write-Host " WARNING: Ralph loop is not running." -ForegroundColor Yellow
Write-Host " Start it first with: .\ralph-docker.ps1 start" -ForegroundColor Gray
Write-Host ""
}

docker run -it --rm `
--name "${ContainerPrefix}-monitor" `
-v "${ProjectDir}:/workspace" `
-e "TERM=xterm-256color" `
-w /workspace `
--entrypoint ralph-monitor `
$ImageName
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Invoke-Monitor skips the image-existence check that Invoke-Shell and Invoke-Start perform.

If the image hasn't been built yet, docker run will fail with a cryptic error. Add the same pre-flight check for consistency.

Proposed fix
 function Invoke-Monitor {
     Write-RalphHeader
     if (-not (Test-DockerRunning)) { exit 1 }
+    if (-not (Test-ImageExists)) {
+        Write-Host "  Image not found. Building first..." -ForegroundColor Yellow
+        Invoke-Build
+    }
 
     # Check if ralph-loop is running
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function Invoke-Monitor {
Write-RalphHeader
if (-not (Test-DockerRunning)) { exit 1 }
# Check if ralph-loop is running
$running = docker ps --filter "name=${ContainerPrefix}-loop" --format "{{.Names}}" 2>$null
if ($null -eq $running -or $running -eq "") {
Write-Host " WARNING: Ralph loop is not running." -ForegroundColor Yellow
Write-Host " Start it first with: .\ralph-docker.ps1 start" -ForegroundColor Gray
Write-Host ""
}
docker run -it --rm `
--name "${ContainerPrefix}-monitor" `
-v "${ProjectDir}:/workspace" `
-e "TERM=xterm-256color" `
-w /workspace `
--entrypoint ralph-monitor `
$ImageName
}
function Invoke-Monitor {
Write-RalphHeader
if (-not (Test-DockerRunning)) { exit 1 }
if (-not (Test-ImageExists)) {
Write-Host " Image not found. Building first..." -ForegroundColor Yellow
Invoke-Build
}
# Check if ralph-loop is running
$running = docker ps --filter "name=${ContainerPrefix}-loop" --format "{{.Names}}" 2>$null
if ($null -eq $running -or $running -eq "") {
Write-Host " WARNING: Ralph loop is not running." -ForegroundColor Yellow
Write-Host " Start it first with: .\ralph-docker.ps1 start" -ForegroundColor Gray
Write-Host ""
}
docker run -it --rm `
--name "${ContainerPrefix}-monitor" `
-v "${ProjectDir}:/workspace" `
-e "TERM=xterm-256color" `
-w /workspace `
--entrypoint ralph-monitor `
$ImageName
}
🤖 Prompt for AI Agents
In `@ralph-docker.ps1` around lines 234 - 253, Invoke-Monitor does not verify the
Docker image exists before calling docker run, causing a cryptic failure; add
the same pre-flight image existence check used by Invoke-Shell and Invoke-Start
(the function or check used there to validate $ImageName exists) directly before
the docker run block in Invoke-Monitor, and if the image is missing emit the
same warning/error and exit with the same status (e.g., Write-Host/Write-Error
and exit 1) so behavior and messages match the other commands.

Comment on lines +269 to +271
if ($null -eq $containers -or $containers.Count -le 1) {
Write-Host " No Ralph containers found." -ForegroundColor Gray
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

$containers.Count behaves differently across PowerShell versions.

When docker ps --format "table ..." returns a single string (header only), .Count is $null on PowerShell 5.1 (strings don't have a .Count property) but 1 on PowerShell 7. Wrapping in @() ensures consistent array behavior:

Proposed fix
-    $containers = docker ps -a --filter "name=${ContainerPrefix}" --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}" 2>$null
-    if ($null -eq $containers -or $containers.Count -le 1) {
+    $containers = @(docker ps -a --filter "name=${ContainerPrefix}" --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}" 2>$null)
+    if ($containers.Count -le 1) {
🤖 Prompt for AI Agents
In `@ralph-docker.ps1` around lines 269 - 271, The check uses $containers.Count
which is inconsistent across PS versions; wrap the variable in an array to
normalize it and update the condition to use @($containers).Count instead of
$containers.Count (or ensure the assignment to $containers uses @(...)); change
the if condition to: if ($null -eq $containers -or @($containers).Count -le 1) {
... } so single-string results behave like a 1-element array; update references
to $containers.Count in ralph-docker.ps1 accordingly (look for the $containers
variable and the if block containing that Count check).

Comment on lines +66 to +142
@test "sandbox_docker_build_run_args: includes project dir volume" {
result=$(sandbox_docker_build_run_args "/my/project" "test-container")
[[ "$result" == *"/my/project:/workspace"* ]]
}

@test "sandbox_docker_build_run_args: includes container name" {
result=$(sandbox_docker_build_run_args "/my/project" "test-container")
[[ "$result" == *"--name test-container"* ]]
}

@test "sandbox_docker_build_run_args: includes TERM env" {
export TERM="xterm-256color"
result=$(sandbox_docker_build_run_args "/my/project" "test-container")
[[ "$result" == *"TERM="* ]]
}

@test "sandbox_docker_build_run_args: includes API key when set" {
export ANTHROPIC_API_KEY="sk-test-key"
result=$(sandbox_docker_build_run_args "/my/project" "test-container")
[[ "$result" == *"ANTHROPIC_API_KEY=sk-test-key"* ]]
unset ANTHROPIC_API_KEY
}

@test "sandbox_docker_build_run_args: excludes API key when unset" {
unset ANTHROPIC_API_KEY
result=$(sandbox_docker_build_run_args "/my/project" "test-container")
[[ "$result" != *"ANTHROPIC_API_KEY"* ]]
}

@test "sandbox_docker_build_run_args: includes memory limit when set" {
export DOCKER_MEMORY_LIMIT="4g"
result=$(sandbox_docker_build_run_args "/my/project" "test-container")
[[ "$result" == *"--memory 4g"* ]]
unset DOCKER_MEMORY_LIMIT
}

@test "sandbox_docker_build_run_args: excludes memory limit when unset" {
unset DOCKER_MEMORY_LIMIT
export DOCKER_MEMORY_LIMIT=""
result=$(sandbox_docker_build_run_args "/my/project" "test-container")
[[ "$result" != *"--memory"* ]]
}

@test "sandbox_docker_build_run_args: includes CPU limit when set" {
export DOCKER_CPU_LIMIT="2"
result=$(sandbox_docker_build_run_args "/my/project" "test-container")
[[ "$result" == *"--cpus 2"* ]]
unset DOCKER_CPU_LIMIT
}

@test "sandbox_docker_build_run_args: includes network mode" {
export DOCKER_NETWORK_MODE="host"
result=$(sandbox_docker_build_run_args "/my/project" "test-container")
[[ "$result" == *"--network host"* ]]
export DOCKER_NETWORK_MODE="bridge"
}

@test "sandbox_docker_build_run_args: defaults to bridge network" {
export DOCKER_NETWORK_MODE="bridge"
result=$(sandbox_docker_build_run_args "/my/project" "test-container")
[[ "$result" == *"--network bridge"* ]]
}

@test "sandbox_docker_build_run_args: includes working directory" {
result=$(sandbox_docker_build_run_args "/my/project" "test-container")
[[ "$result" == *"-w /workspace"* ]]
}

@test "sandbox_docker_build_run_args: includes interactive and tty flags" {
result=$(sandbox_docker_build_run_args "/my/project" "test-container")
[[ "$result" == *"-it"* ]]
}

@test "sandbox_docker_build_run_args: includes rm flag" {
result=$(sandbox_docker_build_run_args "/my/project" "test-container")
[[ "$result" == *"--rm"* ]]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Tests rely on string-flattened output from sandbox_docker_build_run_args — they'll break when the word-splitting bug is fixed.

All run-args tests capture the function's output as a flat string and check for substring matches. If the function is fixed to use null-delimited output (per the review comment on lib/sandbox_docker.sh), these tests will need to be updated to deserialize accordingly.

Consider proactively updating these tests to use mapfile -d '' to capture the args array and then assert on individual elements, which will also make the assertions more precise (no false positives from substring overlaps).

🤖 Prompt for AI Agents
In `@tests/unit/test_sandbox_docker.bats` around lines 66 - 142, Update the tests
that call sandbox_docker_build_run_args: stop capturing its output as a
flattened string (e.g. result=$(sandbox_docker_build_run_args ...)) and instead
read the null-delimited args into an array (e.g. use mapfile -d '' or equivalent
to populate an array from sandbox_docker_build_run_args's stdout). Then assert
on specific array elements (or exact elements existing) rather than substring
matches so tests like the ones checking "/my/project:/workspace", "--name
test-container", "ANTHROPIC_API_KEY=...", "--memory 4g", "--cpus 2", "--network
...", "-w /workspace", "-it", and "--rm" target exact entries and won't break
when sandbox_docker_build_run_args switches to null-delimited output.

Comment on lines +158 to +163
@test "sandbox_docker: default network mode is bridge" {
# Reset to defaults
source "${BATS_TEST_DIRNAME}/../../lib/sandbox_docker.sh"
# The default is set at source time; re-check with clean env
[[ "${DOCKER_NETWORK_MODE}" == "bridge" ]]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Default network mode test doesn't actually test the default — re-sourcing reuses the already-set variable.

The setup() function already sources the library, which sets DOCKER_NETWORK_MODE to "bridge". Re-sourcing on line 160 sees the existing value and keeps it (${DOCKER_NETWORK_MODE:-bridge}). To properly test the default, unset the variable before re-sourcing:

Proposed fix
 `@test` "sandbox_docker: default network mode is bridge" {
-    # Reset to defaults
+    # Unset to verify the library applies the correct default
+    unset DOCKER_NETWORK_MODE
     source "${BATS_TEST_DIRNAME}/../../lib/sandbox_docker.sh"
-    # The default is set at source time; re-check with clean env
     [[ "${DOCKER_NETWORK_MODE}" == "bridge" ]]
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@test "sandbox_docker: default network mode is bridge" {
# Reset to defaults
source "${BATS_TEST_DIRNAME}/../../lib/sandbox_docker.sh"
# The default is set at source time; re-check with clean env
[[ "${DOCKER_NETWORK_MODE}" == "bridge" ]]
}
`@test` "sandbox_docker: default network mode is bridge" {
# Unset to verify the library applies the correct default
unset DOCKER_NETWORK_MODE
source "${BATS_TEST_DIRNAME}/../../lib/sandbox_docker.sh"
[[ "${DOCKER_NETWORK_MODE}" == "bridge" ]]
}
🤖 Prompt for AI Agents
In `@tests/unit/test_sandbox_docker.bats` around lines 158 - 163, The test
"sandbox_docker: default network mode is bridge" is re-sourcing
sandbox_docker.sh while DOCKER_NETWORK_MODE is already set by setup(), so it
doesn't exercise the default; fix it by unsetting DOCKER_NETWORK_MODE before
sourcing the library (e.g., call unset DOCKER_NETWORK_MODE just prior to source
"${BATS_TEST_DIRNAME}/../../lib/sandbox_docker.sh") so the file's
${DOCKER_NETWORK_MODE:-bridge} fallback is evaluated and the assertion [[
"${DOCKER_NETWORK_MODE}" == "bridge" ]] actually verifies the default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant