[codex] fix(security): harden sandbox command execution#1416

Open
13ernkastel wants to merge 5 commits into NVIDIA:main from 13ernkastel:codex/followup-shellquote-sandbox-hardening

@13ernkastel 13ernkastel commented Apr 3, 2026

Summary

Hardens the open shell-quoting fix in #1392 by tightening the createSandbox() boundary and removing the remaining shell-string dependency from the follow-on command paths.
This keeps the original security fix intact, but makes the code less dependent on per-call quoting and adds behavior-level regression coverage for the override and execution paths.

Related Issue

Follow-up to #1392.

Changes

  • Re-validates sandboxNameOverride with validateName() inside createSandbox() before any command helper is called.
  • Moves the dashboard readiness probe onto the structured OpenShell helper path, replacing a raw runCapture() shell string.
  • Adds runFile() to bin/lib/runner.js so scripts can run via argv-style execution without bash -c interpolation.
  • Uses runFile() for setup-dns-proxy.sh in onboarding.
  • Expands regression coverage in test/onboard.test.js, test/runner.test.js, and test/shellquote-sandbox.test.js.
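As a rough illustration of the argv-style execution mentioned in the list above, the sketch below passes a value containing shell metacharacters as a single argument. It is not the code from bin/lib/runner.js: it calls Node's built-in child_process.spawnSync directly, and it uses the current Node binary (process.execPath) as a stand-in for the real script so that the example does not depend on bash being installed.

```javascript
const { spawnSync } = require("node:child_process");

// A value with shell metacharacters, standing in for untrusted input.
const hostile = "demo; echo INJECTED";

// argv-style execution: each array element reaches the child as one argument,
// and no shell ever parses the value. The child simply echoes its first argument.
const result = spawnSync(
  process.execPath,
  ["-e", "console.log(process.argv[1])", hostile],
  { encoding: "utf8", stdio: ["ignore", "pipe", "pipe"] },
);

// What the child actually received, with the trailing newline removed.
const received = result.stdout.trim();
console.log(received === hostile);
```

With a `bash -c` string built by interpolation, the same value would be parsed by the shell, so the `;` would end the first command and start a second one.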

Type of Change

  • Code change for a new feature, bug fix, or refactor.
  • Code change with doc updates.
  • Doc only. Prose changes without code sample modifications.
  • Doc only. Includes code sample changes.

Testing

  • npx prek run --all-files passes (or equivalently make check).
  • npm test passes.
  • make docs builds without warnings. (for doc-only changes)

Targeted checks run:

  • npx vitest run test/runner.test.js test/onboard.test.js test/shellquote-sandbox.test.js
  • npx prek run --files bin/lib/onboard.js bin/lib/runner.js test/onboard.test.js test/runner.test.js test/shellquote-sandbox.test.js
    This reaches an existing unrelated failure in test/usage-notice.test.js during the repo-wide test-cli hook.
  • npx vitest run test/usage-notice.test.js
    Reproduces the same unrelated existing failure: renders url lines as terminal hyperlinks when tty output is available.

Checklist

General

Code Changes

  • Formatters applied — npx prek run --files ... auto-fixed formatting for the touched files before validation.
  • Tests added or updated for new or changed behavior.
  • No secrets, API keys, or credentials committed.
  • Doc pages updated for any user-facing behavior changes (new commands, changed defaults, new features, bug fixes that contradict existing docs).

Doc Changes

  • Follows the style guide. Try running the update-docs agent skill to draft changes while complying with the style guide. For example, prompt your agent with "/update-docs catch up the docs for the new changes I made in this PR."
  • New pages include SPDX license header and frontmatter, if creating a new page.
  • Cross-references and links verified.

Signed-off-by: 13ernkastel <LennonCMJ@live.com>

Summary by CodeRabbit

  • Bug Fixes

    • Stricter sandbox name validation with clear error messages and re-prompting to prevent invalid names.
    • More reliable dashboard readiness detection for faster startup feedback.
    • Safer DNS proxy setup invocation to reduce command-injection risk.
  • Tests

    • Expanded tests covering name validation, readiness checks, DNS setup, and command-invocation safety.

latenighthackathon and others added 3 commits April 3, 2026 21:05
sandboxName and GATEWAY_NAME are interpolated into shell command
strings passed to run()/runCapture() without shellQuote(), which
is inconsistent with other user-controlled values in the same file
(lines 460, 1438, 3005).

Wrap both values with shellQuote() to prevent shell metacharacter
interpretation. Add regression test that scans onboard.js for any
unquoted sandboxName in run/runCapture calls.

Closes NVIDIA#1391

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>

Line-by-line scanning misses multiline run()/runCapture() calls.
Switch to a regex that matches the full call including the template
literal, so violations spanning multiple lines are caught.

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>

Builds on NVIDIA#1392 by validating sandboxName overrides at the createSandbox boundary, moving the dashboard readiness check onto the structured OpenShell helper path, and running the DNS proxy setup through argv-style execution instead of bash -c interpolation.

Adds regression coverage for the new runner helper, invalid sandboxNameOverride rejection, and the createSandbox command paths.

Co-authored-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
Signed-off-by: 13ernkastel <LennonCMJ@live.com>
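
The quoting that the first commit above relies on can be illustrated with a minimal sketch. `shellQuoteSketch` is a hypothetical stand-in that assumes POSIX single-quote escaping; the repository's actual shellQuote() may be implemented differently.

```javascript
// Hypothetical stand-in for shellQuote(); assumes POSIX sh quoting rules.
// Wraps the value in single quotes and rewrites each embedded single quote
// as '\'' (close the quote, emit an escaped quote, reopen the quote).
function shellQuoteSketch(value) {
  return "'" + String(value).replace(/'/g, "'\\''") + "'";
}

const sandboxName = "demo'; rm -rf /tmp/x #";

// Unquoted interpolation: a shell would interpret the metacharacters.
const unsafe = `echo ${sandboxName}`;

// Quoted interpolation: the whole value stays one literal word.
const safe = `echo ${shellQuoteSketch(sandboxName)}`;

console.log(unsafe);
console.log(safe);
```

Quoting at every call site works only if no call site is missed, which is why the later commit also validates the name once at the createSandbox boundary.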

coderabbitai bot commented Apr 3, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4ed886a6-e597-40e1-abb1-dcfb809805a5

📥 Commits

Reviewing files that changed from the base of the PR and between 1f1d6d5 and 99ee3a0.

📒 Files selected for processing (2)
  • bin/lib/onboard.js
  • test/onboard.test.js
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/onboard.test.js
  • bin/lib/onboard.js

📝 Walkthrough

Walkthrough

Adds a new runner helper runFile and refactors onboarding to validate sandbox names at boundaries, stop forcing lowercase, use argument-array execution for scripts, and change readiness probing to a direct curl check; updates and expands tests to assert the new behaviors.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Onboard CLI (bin/lib/onboard.js) | Removed forced .toLowerCase() for prompted sandbox names; centralized validateName(..., "sandbox name") for overrides and prompts with try/catch and error logging; replaced sentinel readiness probe with runCaptureOpenshell invoking sandbox exec ... curl -sf http://localhost:${CONTROL_UI_PORT}/; switched DNS setup to call the setup script via runFile("bash", [...setup-dns-proxy.sh, GATEWAY_NAME, sandboxName], { ignoreError: true }). |
| Runner utility (bin/lib/runner.js) | Added exported runFile(file, args = [], opts = {}) that calls child_process.spawnSync with argv array (no bash -c), normalizes args, defaults stdio to ["ignore","pipe","pipe"], merges env with process.env, writes redacted output, and respects opts.ignoreError. |
| Onboard tests (test/onboard.test.js) | Reworked mocks to record runFile calls and structured runner entries; tightened assertions to expect argv-quoted OpenShell readiness curl and DNS setup via runFile; added tests ensuring invalid sandboxNameOverride values are rejected pre-execution and that re-prompting uses validated retry values. |
| Runner unit tests (test/runner.test.js) | Added tests that mock spawnSync to assert runFile("bash", [...]) invokes spawn with direct args (no bash -c) and that provided env merges with process.env while preserving PATH. |
| Static safety checks (test/shellquote-sandbox.test.js) | New Vitest file that scans bin/lib/onboard.js for required validateName(...) at creation boundary, asserts DNS proxy invocation uses argv-style runFile("bash",[...]), and detects any run/runCapture template literals embedding ${sandboxName} without shellQuote(sandboxName). |
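
The static check described in the last row could look roughly like the sketch below. It is a simplified guess rather than the contents of test/shellquote-sandbox.test.js: `findUnquotedSandboxName` is a hypothetical name, the sample commands are made up, and the regex assumes the template literal is the first argument and contains no nested backticks.

```javascript
// Hypothetical scanner: reports run()/runCapture() calls whose template
// literal embeds ${sandboxName} directly instead of ${shellQuote(sandboxName)}.
function findUnquotedSandboxName(source) {
  // [\s\S]*? spans newlines, so calls split across several lines are matched too.
  const callPattern = /\brun(?:Capture)?\(\s*`([\s\S]*?)`/g;
  const violations = [];
  for (const match of source.matchAll(callPattern)) {
    if (/\$\{\s*sandboxName\s*\}/.test(match[1])) {
      violations.push(match[0]);
    }
  }
  return violations;
}

// Made-up source text: one quoted single-line call, one unquoted multiline call.
const sample = [
  "run(`some-cli delete ${shellQuote(sandboxName)}`);",
  "runCapture(",
  "  `some-cli get",
  "     ${sandboxName}`",
  ");",
].join("\n");

const violations = findUnquotedSandboxName(sample);
console.log(violations.length);
```

Matching the whole call rather than scanning line by line is what lets the second, multiline call be caught.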

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through args and names so neat,
No sneaky shells now split my feet.
I validate, then pass each part,
Curl the dashboard, start the heart.
Tests cheer — a safer, quieter beat. 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly and concisely describes the main security hardening changes to sandbox command execution, directly matching the primary objective of tightening sandbox boundaries and reducing shell injection risks. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@13ernkastel 13ernkastel marked this pull request as ready for review April 3, 2026 13:17

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
bin/lib/runner.js (1)

60-75: Collapse the spawn helpers into one internal path.

run(), runInteractive(), and runFile() now repeat the same spawnSync/redaction/exit handling. A shared helper would make future hardening changes much harder to miss in one branch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/runner.js` around lines 60 - 75, The three functions run(),
runInteractive(), and runFile() duplicate spawnSync/stdio/env/redaction/exit
handling; extract that logic into a single internal helper (e.g., spawnAndHandle
or _spawnSyncWithRedaction) that takes (fileOrCmd, args, opts, stdio) and
performs spawnSync with cwd ROOT, merged env, calls writeRedactedResult(result,
stdio), logs the redacted rendered command on non-zero exit and process.exit,
and returns result; then refactor run(), runInteractive(), and runFile() to call
this helper with the appropriate stdio and ignoreError behavior, removing the
duplicated spawnSync and exit handling from each function.
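
One possible shape for the shared helper suggested above is sketched here. Every name is hypothetical, the stdio choices for the `bash -c` wrappers are assumptions, and the sketch throws on failure instead of redacting output and calling process.exit, so it shows only the structure and not the behavior of bin/lib/runner.js.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical single internal path: every public helper funnels through here,
// so env merging and exit handling live in exactly one place.
function spawnAndHandle(file, args, opts = {}, stdio = ["ignore", "pipe", "pipe"]) {
  const result = spawnSync(file, args, {
    stdio,
    env: { ...process.env, ...opts.env },
    encoding: "utf8",
  });
  // Simplification: throw rather than log a redacted command and exit.
  if (result.status !== 0 && !opts.ignoreError) {
    throw new Error(`command failed with status ${result.status}`);
  }
  return result;
}

// Thin wrappers that differ only in how the command is built and in stdio.
const runSketch = (cmd, opts) =>
  spawnAndHandle("bash", ["-c", cmd], opts, ["ignore", "inherit", "inherit"]);
const runInteractiveSketch = (cmd, opts) =>
  spawnAndHandle("bash", ["-c", cmd], opts, "inherit");
const runFileSketch = (file, args = [], opts) =>
  spawnAndHandle(file, args.map(String), opts);
```

With this layout, a future change such as stricter redaction would be made once in `spawnAndHandle` and would apply to all three entry points.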
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 2095-2098: The prompt validation and the final boundary check are
out of sync: replace the `sandboxNameOverride || (await
promptValidatedSandboxName())` expression with `sandboxNameOverride ?? (await
promptValidatedSandboxName())` to prevent empty string falling through, and
modify `promptValidatedSandboxName()` to call and return `validateName(...)`
(instead of using the RFC-1123 regex directly) so the interactive retry loop
enforces the same 63-character/lowercase rules as `validateName`; ensure
`validateName` is used for both override and prompted values so failures
re-prompt rather than abort.
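
The difference between `||` and `??` in this comment can be seen in a small sketch. `validateNameSketch` and `resolveName` are hypothetical, the validator assumes RFC 1123 label rules (lowercase alphanumerics and hyphens, 1 to 63 characters, no leading or trailing hyphen), and the prompt is replaced by a synchronous stub to keep the example short.

```javascript
// Hypothetical validator; the real validateName() may apply different rules.
function validateNameSketch(name, label = "name") {
  const rfc1123Label = /^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/;
  if (typeof name !== "string" || !rfc1123Label.test(name)) {
    throw new Error(`Invalid ${label}: ${JSON.stringify(name)}`);
  }
  return name;
}

// `??` falls back only for null/undefined, so an empty-string override is
// kept and then rejected by validation instead of silently triggering the prompt.
function resolveName(override, promptFn) {
  const candidate = override ?? promptFn();
  return validateNameSketch(candidate, "sandbox name");
}

const prompted = resolveName(undefined, () => "my-sandbox");

let emptyOverrideRejected = false;
try {
  resolveName("", () => "my-sandbox");
} catch {
  emptyOverrideRejected = true;
}

console.log(prompted, emptyOverrideRejected);
```

With `||` in place of `??`, the empty override would be treated as absent and the prompted value would be used without any error.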

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4277283d-a5ab-4474-9d2b-b9e92316bc0b

📥 Commits

Reviewing files that changed from the base of the PR and between 494ecde and 1f1d6d5.

📒 Files selected for processing (5)
  • bin/lib/onboard.js
  • bin/lib/runner.js
  • test/onboard.test.js
  • test/runner.test.js
  • test/shellquote-sandbox.test.js

Signed-off-by: 13ernkastel <LennonCMJ@live.com>
@wscurran wscurran added labels Apr 3, 2026: security (Something isn't secure); priority: high (Important issue that should be resolved in the next release); enhancement: feature (Use this label to identify requests for new capabilities in NemoClaw.); OpenShell (Support for OpenShell, a safe, private runtime for autonomous AI agents)

wscurran commented Apr 3, 2026

✨ Thanks for submitting this pull request, which proposes a way to improve security by hardening sandbox command execution in OpenShell.

3 participants