
Add pre-execution shell command analysis#534

Open
jamiepine wants to merge 1 commit into main from command-analysis

Conversation

Member

@jamiepine jamiepine commented Apr 4, 2026

Summary

  • Adds a shell_analysis module that parses and categorizes shell commands before execution, assessing risk level, detecting dangerous patterns, and emitting UX metadata (duration hints, collapsed_by_default, expects_no_output)
  • Integrates the analyzer into the shell tool — risky commands are blocked pending confirmation, and ShellOutput now includes an analysis field for downstream UI/worker logic
  • Updates the shell tool prompt description and documents the mitigation for the verbose-output production failure case
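As a rough mental model of the metadata the summary describes, a hypothetical sketch follows. The field names (collapsed_by_default, expects_no_output, duration hints, risk level) are taken from the summary above, but the branch's actual type definitions may differ:

```rust
// Hypothetical sketch of the emitted analysis metadata; not the branch's
// actual definition, just the shape implied by the PR summary.
#[derive(Debug, PartialEq)]
enum RiskLevel {
    Safe,
    Caution,
    Dangerous,
}

#[derive(Debug)]
struct CommandAnalysis {
    risk_level: RiskLevel,
    duration_hint: Option<String>, // e.g. "seconds", "minutes"
    collapsed_by_default: bool,    // UI may collapse verbose output
    expects_no_output: bool,       // render "Done" instead of "[No output]"
    requires_confirmation: bool,   // risky commands block until confirmed
}

fn main() {
    let analysis = CommandAnalysis {
        risk_level: RiskLevel::Safe,
        duration_hint: None,
        collapsed_by_default: false,
        expects_no_output: true,
        requires_confirmation: false,
    };
    println!("{analysis:?}");
}
```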

Test plan

  • Verify safe commands (e.g. ls, cargo build) execute normally with correct analysis metadata
  • Verify risky/destructive commands (e.g. rm -rf /) are blocked with a confirmation error
  • Verify expects_no_output commands show Done instead of [No output] on success
  • Verify existing shell tool tests still pass
  • Verify that ShellOutput.analysis fields are populated correctly for various command categories

🤖 Generated with Claude Code

Introduces a command analysis layer that categorizes shell commands,
assesses risk, detects patterns, and emits metadata (category, risk
level, duration hint, UX flags) before execution. Risky commands are
blocked pending confirmation. Downstream UI can use the analysis to
collapse verbose output and render silent successes cleanly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

coderabbitai bot commented Apr 4, 2026

Walkthrough

This PR introduces pre-execution shell command analysis by adding a new ShellAnalyzer that categorizes commands, detects security patterns, assesses risk levels, and computes execution metadata before running sandboxed shell operations. Analysis results are returned in ShellOutput to enable downstream UI handling of command visibility and output behavior.

Changes

Cohort / File(s) Summary
Documentation
docs/design-docs/production-worker-failures.md, prompts/en/tools/shell_description.md.j2
Updated design docs and shell tool description to describe command pre-execution analysis behavior, including metadata emission and command validation before execution.
Module Structure
src/tools.rs, src/tools/shell_analysis.rs
Added shell_analysis module with exports of CommandAnalysis, CommandCategory, DetectedPattern, DurationHint, PatternType, and RiskLevel types for external use.
Analysis Types & Parsing
src/tools/shell_analysis/types.rs, src/tools/shell_analysis/parser.rs
Defined serializable data models for analysis results (CommandAnalysis, CommandCategory, RiskLevel, DurationHint, PatternType, DetectedPattern) and implemented quote-aware shell command parsing with operator recognition, word splitting, and path normalization.
Analysis Detectors
src/tools/shell_analysis/categorizer.rs, src/tools/shell_analysis/security.rs
Implemented command categorization based on command type (search/read/list/write/destructive/network/silent) with output redirection awareness; added security pattern detection for command substitution, obfuscation, git commit injection, IFS injection, newlines, environment access, and exfiltration.
Analyzer Orchestration
src/tools/shell_analysis/analyzer.rs
Introduced ShellAnalyzer that chains parsing, categorization, and security detection; computes derived fields including risk level, duration hints, path traversal detection, and confirmation requirements for destructive/suspicious commands.
Shell Tool Integration
src/tools/shell.rs
Added ShellAnalyzer field to ShellTool, extended ShellOutput with analysis field, modified execution flow to analyze commands before running and return error if confirmation is required, updated output formatting to show "Done" for silent successful commands.
Formatting Adjustments
src/agent/channel.rs, src/config/load.rs
Restructured string concatenation and macro invocation formatting without logic changes.
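As a rough illustration of the quote-aware word splitting attributed to parser.rs above, a minimal version might look like the following. The function name and exact semantics are assumptions for illustration, not the crate's API:

```rust
// Minimal sketch of quote-aware word splitting: whitespace separates words
// only when we are outside single and double quotes.
fn split_words(input: &str) -> Vec<String> {
    let mut words = Vec::new();
    let mut current = String::new();
    let (mut in_single, mut in_double) = (false, false);
    for ch in input.chars() {
        match ch {
            '\'' if !in_double => in_single = !in_single,
            '"' if !in_single => in_double = !in_double,
            c if c.is_whitespace() && !in_single && !in_double => {
                if !current.is_empty() {
                    words.push(std::mem::take(&mut current));
                }
            }
            c => current.push(c),
        }
    }
    if !current.is_empty() {
        words.push(current);
    }
    words
}

fn main() {
    println!("{:?}", split_words(r#"grep "a b" file.txt"#));
}
```

A real parser would also handle backslash escapes and operators; this sketch only shows the quote-state idea.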

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage — ⚠️ Warning: docstring coverage is 35.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Add pre-execution shell command analysis' directly and clearly summarizes the main change: introducing a new shell_analysis module for analyzing commands before execution.
Description check ✅ Passed The PR description clearly describes the main changes: adding shell_analysis module for command parsing/categorization, integrating it into the shell tool with risk assessment and UX metadata, and updating documentation.


}
}

let analysis = self.analyzer.analyze(&args.command, &working_dir);

Heads-up: analysis.requires_confirmation currently hard-blocks execution, but ShellArgs doesn't include any confirm/ack flag. If the intention is “ask user then re-run”, we probably need an explicit arg (default false) or a structured “needs_confirmation” error that carries analysis so the caller can display details and retry.
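The gating the reviewer proposes could look roughly like this. `ShellArgs` exists in the PR, but the `confirm` field and this exact shape are the reviewer's suggestion, not code from the branch:

```rust
// Illustrative sketch of a confirm flag on ShellArgs: block only when the
// analyzer asks for confirmation AND the caller has not opted in.
struct ShellArgs {
    command: String,
    confirm: bool, // hypothetical field, defaulting to false when omitted
}

struct Analysis {
    requires_confirmation: bool,
}

fn should_block(analysis: &Analysis, args: &ShellArgs) -> bool {
    analysis.requires_confirmation && !args.confirm
}

fn main() {
    let analysis = Analysis { requires_confirmation: true };
    let args = ShellArgs { command: "rm -rf target".into(), confirm: false };
    println!("blocked: {}", should_block(&analysis, &args));
}
```

The alternative the comment mentions — a structured "needs_confirmation" error carrying the analysis — would let the caller display details and retry instead of re-sending with a flag.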

continue;
}

if character == '\r' && !in_double_quote {

This flags carriage returns inside single quotes as risky (the condition only checks !in_double_quote). Probably want to treat \r like \n and ignore it in both quote modes.

Suggested change
-if character == '\r' && !in_double_quote {
+if character == '\r' && !in_single_quote && !in_double_quote {
     patterns.push(pattern(
         PatternType::CarriageReturn,
         "Command contains a carriage return outside quotes.",
     ));
     continue;
 }

}

fn base_command(words: &[String]) -> Option<String> {
let command_word = command_words(words).first()?;

Potential bypass: base_command is derived from the first command word, so wrappers like sudo rm -rf … / env rm -rf … get categorized as sudo/env (likely Other) and won't trigger destructive confirmation. Might be worth skipping common wrapper commands (+ their flags) to recover the underlying command for analysis.
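One way to sketch the suggested wrapper-skipping: walk the words past known wrapper executables, their flags, and env assignments until a real command token is found. The wrapper list and helper name here are assumptions for illustration:

```rust
// Sketch of recovering the underlying command behind common wrappers, so
// e.g. "sudo rm -rf target" analyzes as "rm" rather than "sudo".
fn effective_command(words: &[&str]) -> Option<String> {
    const WRAPPERS: [&str; 4] = ["sudo", "env", "nice", "nohup"];
    words
        .iter()
        .copied()
        .find(|w| {
            !WRAPPERS.contains(w)      // skip wrapper executables
                && !w.starts_with('-') // skip their leading flags
                && !w.contains('=')    // skip env assignments like FOO=1
        })
        .map(str::to_string)
}

fn main() {
    println!("{:?}", effective_command(&["sudo", "rm", "-rf", "target"]));
}
```

A production version would need per-wrapper flag handling (e.g. flags that take a value), but even this naive pass catches the `sudo rm` / `env rm` cases the comment describes.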

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/tools/shell_analysis/categorizer.rs`:
- Around line 31-46: DESTRUCTIVE_COMMANDS currently misses "rmdir" so rmdir is
classified as SILENT and skips confirmation; update the DESTRUCTIVE_COMMANDS
LazyLock initialization to include "rmdir" in the HashSet and also remove
"rmdir" from the SILENT_COMMANDS HashSet to avoid conflicting classification;
modify the static DESTRUCTIVE_COMMANDS and SILENT_COMMANDS definitions (refer to
symbols DESTRUCTIVE_COMMANDS and SILENT_COMMANDS in categorizer.rs) accordingly
so rmdir is treated as CommandCategory::Destructive.
- Around line 166-223: segment_semantics incorrectly assumes the subcommand is
words.get(1) and misses cases where wrappers (e.g., "env") or command-global
options (e.g., "-C", "--context") appear; add a helper (e.g.,
normalize_command_words) that takes the output of command_words(&segment.words)
and strips leading wrapper executables ("env", etc.), environment assignments,
and any leading global-style options (short/long flags and known global flags
like -C/--context) until the real subcommand token is reached; then replace the
current subcommand derivation (words.get(1).map(String::as_str)) with one based
on the normalized list and use that normalized list for checks in the "git",
"docker", "chmod" branches (and any flag helpers like
recursive_flag_present/force_flag_present/long_flag_present as needed).

In `@src/tools/shell.rs`:
- Around line 241-253: The analyzer is blocking "risky" commands with no way to
proceed; add a confirmation pathway by adding a confirm: bool field to ShellArgs
(or equivalent caller payload) and update the check in the function that calls
self.analyzer.analyze(&args.command, &working_dir) to only return the ShellError
when requires_confirmation is true AND args.confirm is false; if args.confirm is
true, allow execution to continue (or set a clear pending state), and ensure any
user-facing messages reflect that confirmation was used (use ShellError only for
unconfirmed attempts).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: caaa9a9b-ab61-425c-9be5-e819a6838369

📥 Commits

Reviewing files that changed from the base of the PR and between fb5c0f3 and 0268b9b.

📒 Files selected for processing (12)
  • docs/design-docs/production-worker-failures.md
  • prompts/en/tools/shell_description.md.j2
  • src/agent/channel.rs
  • src/config/load.rs
  • src/tools.rs
  • src/tools/shell.rs
  • src/tools/shell_analysis.rs
  • src/tools/shell_analysis/analyzer.rs
  • src/tools/shell_analysis/categorizer.rs
  • src/tools/shell_analysis/parser.rs
  • src/tools/shell_analysis/security.rs
  • src/tools/shell_analysis/types.rs

Comment on lines +49 to +87
fn detect_outside_workspace_paths(
&self,
parsed: &ParsedCommand,
working_dir: &Path,
) -> Vec<DetectedPattern> {
for segment in parsed.executable_segments() {
let words = command_words(&segment.words);
for word in words.iter().skip(1) {
if let Some(path) = resolve_candidate_path(working_dir, word)
&& !path.starts_with(&self.workspace)
{
return vec![DetectedPattern {
pattern_type: PatternType::OutsideWorkspacePath,
description: format!(
"Command references a path outside the workspace: {word}"
),
position: None,
}];
}
}
}

for segment in parsed.redirect_targets() {
for word in &segment.words {
if let Some(path) = resolve_candidate_path(working_dir, word)
&& !path.starts_with(&self.workspace)
{
return vec![DetectedPattern {
pattern_type: PatternType::OutsideWorkspacePath,
description: format!(
"Command redirects to a path outside the workspace: {word}"
),
position: None,
}];
}
}
}

Vec::new()

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, check the structure of the shell_analysis module
fd -t f "analyzer.rs" src/tools/shell_analysis/

Repository: spacedriveapp/spacebot

Length of output: 102


🏁 Script executed:

# Read the analyzer.rs file to understand resolve_candidate_path and the full implementation
cat -n src/tools/shell_analysis/analyzer.rs | head -300

Repository: spacedriveapp/spacebot

Length of output: 12190


🏁 Script executed:

# Also check if there are related files that might contain resolve_candidate_path
rg "resolve_candidate_path" src/tools/shell_analysis/ -A 10 -B 2

Repository: spacedriveapp/spacebot

Length of output: 3109


🏁 Script executed:

# Check lines 235-263 specifically mentioned in the comment
sed -n '235,263p' src/tools/shell_analysis/analyzer.rs

Repository: spacedriveapp/spacebot

Length of output: 796


Special-case device sinks in the outside-workspace guard; treat unresolved home/env expansions in write positions as confirmation-worthy.

resolve_candidate_path() rejects ~ and $-prefixed paths, but accepts absolute paths like /dev/null. This causes benign redirects such as 2>/dev/null to be flagged as dangerous off-workspace writes and require confirmation. Meanwhile, real off-workspace writes using ~/backup.txt or $HOME/backup.txt are silently ignored because they don't resolve. The guard should whitelist safe device sinks (/dev/null, /dev/stdout, /dev/stderr) and instead treat unresolved expansions in write/redirect positions as confirmation-worthy rather than ignoring them.
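The whitelist part of this suggestion could be sketched as a check that runs before the workspace-prefix test. The function names are illustrative, not the PR's actual code:

```rust
use std::path::Path;

// Sketch: treat standard device sinks as always-safe so redirects like
// 2>/dev/null are not flagged as off-workspace writes.
fn is_safe_device_sink(path: &Path) -> bool {
    matches!(
        path.to_str(),
        Some("/dev/null" | "/dev/stdout" | "/dev/stderr")
    )
}

fn flags_outside_workspace(path: &Path, workspace: &Path) -> bool {
    !is_safe_device_sink(path) && !path.starts_with(workspace)
}

fn main() {
    let ws = Path::new("/workspace");
    println!("{}", flags_outside_workspace(Path::new("/dev/null"), ws));
}
```

The other half of the suggestion — treating unresolved `~`/`$VAR` expansions in write positions as confirmation-worthy — would need a separate signal from the resolver, since those paths never reach the prefix check at all.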

Comment on lines +31 to +46
static DESTRUCTIVE_COMMANDS: LazyLock<HashSet<&'static str>> =
LazyLock::new(|| HashSet::from(["dd", "mkfs", "rm", "shred"]));

static NETWORK_COMMANDS: LazyLock<HashSet<&'static str>> = LazyLock::new(|| {
HashSet::from([
"apt", "apt-get", "brew", "bun", "curl", "ftp", "npm", "pip", "pip3", "pnpm", "rsync",
"scp", "sftp", "ssh", "telnet", "wget", "yarn",
])
});

static SILENT_COMMANDS: LazyLock<HashSet<&'static str>> = LazyLock::new(|| {
HashSet::from([
"cd", "chmod", "chgrp", "chown", "cp", "export", "ln", "mkdir", "mv", "rm", "rmdir",
"touch", "unset", "wait",
])
});

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check the file structure and locate the categorizer
find . -name "categorizer.rs" -type f

Repository: spacedriveapp/spacebot

Length of output: 107


🏁 Script executed:

# Read the categorizer.rs file to understand the current implementation
wc -l src/tools/shell_analysis/categorizer.rs

Repository: spacedriveapp/spacebot

Length of output: 109


🏁 Script executed:

# Get the full content of the categorizer file to understand the structure
cat -n src/tools/shell_analysis/categorizer.rs | head -250

Repository: spacedriveapp/spacebot

Length of output: 9657


🏁 Script executed:

# Search for RiskLevel and CommandCategory definitions
rg "enum RiskLevel|enum CommandCategory" -A 10

Repository: spacedriveapp/spacebot

Length of output: 1196


🏁 Script executed:

# Check if there's any logic that handles commands appearing in multiple sets
rg "SILENT_COMMANDS|DESTRUCTIVE_COMMANDS" -B 2 -A 2

Repository: spacedriveapp/spacebot

Length of output: 1467


Add rmdir to DESTRUCTIVE_COMMANDS.

rmdir currently classifies as CommandCategory::Silent because it only appears in SILENT_COMMANDS, not DESTRUCTIVE_COMMANDS. This causes it to bypass the confirmation gate for destructive operations despite permanently removing filesystem state. The fix is straightforward:

Fix
 static DESTRUCTIVE_COMMANDS: LazyLock<HashSet<&'static str>> =
-    LazyLock::new(|| HashSet::from(["dd", "mkfs", "rm", "shred"]));
+    LazyLock::new(|| HashSet::from(["dd", "mkfs", "rm", "rmdir", "shred"]));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 static DESTRUCTIVE_COMMANDS: LazyLock<HashSet<&'static str>> =
-    LazyLock::new(|| HashSet::from(["dd", "mkfs", "rm", "shred"]));
+    LazyLock::new(|| HashSet::from(["dd", "mkfs", "rm", "rmdir", "shred"]));
 static NETWORK_COMMANDS: LazyLock<HashSet<&'static str>> = LazyLock::new(|| {
     HashSet::from([
         "apt", "apt-get", "brew", "bun", "curl", "ftp", "npm", "pip", "pip3", "pnpm", "rsync",
         "scp", "sftp", "ssh", "telnet", "wget", "yarn",
     ])
 });
 static SILENT_COMMANDS: LazyLock<HashSet<&'static str>> = LazyLock::new(|| {
     HashSet::from([
         "cd", "chmod", "chgrp", "chown", "cp", "export", "ln", "mkdir", "mv", "rm", "rmdir",
         "touch", "unset", "wait",
     ])
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/shell_analysis/categorizer.rs` around lines 31 - 46,
DESTRUCTIVE_COMMANDS currently misses "rmdir" so rmdir is classified as SILENT
and skips confirmation; update the DESTRUCTIVE_COMMANDS LazyLock initialization
to include "rmdir" in the HashSet and also remove "rmdir" from the
SILENT_COMMANDS HashSet to avoid conflicting classification; modify the static
DESTRUCTIVE_COMMANDS and SILENT_COMMANDS definitions (refer to symbols
DESTRUCTIVE_COMMANDS and SILENT_COMMANDS in categorizer.rs) accordingly so rmdir
is treated as CommandCategory::Destructive.

Comment on lines +166 to +223
fn segment_semantics(
segment: &crate::tools::shell_analysis::parser::ParsedSegment,
) -> CommandSemantics {
let mut semantics = CommandSemantics::default();
let Some(base_command) = segment.base_command.as_deref() else {
return semantics;
};

if SEMANTIC_NEUTRAL_COMMANDS.contains(base_command) {
semantics.is_neutral = true;
return semantics;
}

semantics.is_search = SEARCH_COMMANDS.contains(base_command);
semantics.is_read = READ_COMMANDS.contains(base_command);
semantics.is_list = LIST_COMMANDS.contains(base_command);
semantics.is_write = WRITE_COMMANDS.contains(base_command);
semantics.is_destructive = DESTRUCTIVE_COMMANDS.contains(base_command);
semantics.is_network = NETWORK_COMMANDS.contains(base_command);
semantics.is_silent = SILENT_COMMANDS.contains(base_command);

let words = command_words(&segment.words);
let subcommand = words.get(1).map(String::as_str);

match base_command {
"chmod" => {
semantics.is_write = true;
semantics.is_silent = true;
if recursive_flag_present(words) {
semantics.is_destructive = true;
}
}
"chgrp" | "chown" => {
semantics.is_write = true;
semantics.is_silent = true;
}
"docker" => {
if matches!(subcommand, Some("build" | "compose" | "pull" | "push")) {
semantics.is_network = true;
}
}
"git" => match subcommand {
Some("checkout" | "switch") => {
semantics.is_silent = true;
}
Some("clean") if force_flag_present(words) => {
semantics.is_destructive = true;
semantics.is_silent = true;
}
Some("clone" | "fetch" | "pull" | "push" | "submodule") => {
semantics.is_network = true;
}
Some("reset") if long_flag_present(words, "--hard") => {
semantics.is_destructive = true;
semantics.is_silent = true;
}
_ => {}
},

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the categorizer.rs file to understand the full context
wc -l src/tools/shell_analysis/categorizer.rs

Repository: spacedriveapp/spacebot

Length of output: 109


🏁 Script executed:

# Read the segment_semantics function and surrounding context
head -230 src/tools/shell_analysis/categorizer.rs | tail -100

Repository: spacedriveapp/spacebot

Length of output: 3433


🏁 Script executed:

# Let's also look at the beginning to understand the constants
head -100 src/tools/shell_analysis/categorizer.rs

Repository: spacedriveapp/spacebot

Length of output: 3323


🏁 Script executed:

# Let's look at the parser.rs file to understand how ParsedSegment is structured
fd parser.rs src/tools/shell_analysis/

Repository: spacedriveapp/spacebot

Length of output: 100


🏁 Script executed:

# Check the parser file
wc -l src/tools/shell_analysis/parser.rs

Repository: spacedriveapp/spacebot

Length of output: 104


🏁 Script executed:

# Look at the ParsedSegment definition and base_command extraction
rg -A 10 "struct ParsedSegment" src/tools/shell_analysis/parser.rs

Repository: spacedriveapp/spacebot

Length of output: 401


🏁 Script executed:

# Search for base_command assignment in parser
rg "base_command" src/tools/shell_analysis/parser.rs -B 3 -A 3

Repository: spacedriveapp/spacebot

Length of output: 654


🏁 Script executed:

# Get the full base_command function
rg -A 20 "^fn base_command" src/tools/shell_analysis/parser.rs

Repository: spacedriveapp/spacebot

Length of output: 735


🏁 Script executed:

# Get the command_words function
rg -B 5 -A 15 "^pub\(crate\) fn command_words" src/tools/shell_analysis/parser.rs

Repository: spacedriveapp/spacebot

Length of output: 663


🏁 Script executed:

# Also check if there's any handling for common wrappers
rg -i "env|wrapper|global.option" src/tools/shell_analysis/parser.rs

Repository: spacedriveapp/spacebot

Length of output: 178


🏁 Script executed:

# Let's check if there's usage of CategorizationResult for confirmation/gating
rg "CategorizationResult" src/ -B 2 -A 5 | head -100

Repository: spacedriveapp/spacebot

Length of output: 4023


🏁 Script executed:

# Look for confirmation or gate-related logic
rg -i "confirm|gate|check.*destructive|check.*network" src/tools/shell_analysis/ -B 2 -A 2

Repository: spacedriveapp/spacebot

Length of output: 3917


🏁 Script executed:

# Let's see the full command_words definition and any wrapper handling
sed -n '1,100p' src/tools/shell_analysis/parser.rs | grep -A 30 "command_words"

Repository: spacedriveapp/spacebot

Length of output: 48


🏁 Script executed:

# Let's test the examples by tracing through the logic
# First, let's see what command_words does with our examples

cat > /tmp/test_parsing.py << 'EOF'
# Simulate command_words behavior

def is_env_assignment(word):
    if '=' not in word:
        return False
    name, _ = word.split('=', 1)
    if not name or '/' in name:
        return False
    if not name[0].isalpha() and name[0] not in ('_',):
        return False
    return all(c.isalnum() or c == '_' for c in name)

def first_command_word_index(words):
    for i, word in enumerate(words):
        if not is_env_assignment(word):
            return i
    return None

def command_words(words):
    idx = first_command_word_index(words)
    return words[idx:] if idx is not None else []

# Test examples
examples = [
    ["env", "FOO=1", "rm", "-rf", "target"],
    ["git", "-C", "repo", "reset", "--hard", "HEAD~1"],
    ["docker", "--context", "prod", "compose", "pull"],
]

for ex in examples:
    cw = command_words(ex)
    base = cw[0] if cw else None
    subcommand = cw[1] if len(cw) > 1 else None
    print(f"Command: {' '.join(ex)}")
    print(f"  base_command: {base}")
    print(f"  subcommand (words[1]): {subcommand}")
    print()
EOF
python /tmp/test_parsing.py

Repository: spacedriveapp/spacebot

Length of output: 343


🏁 Script executed:

# Let's also check what happens with "env" - is it in the base_command filtering?
# Let's check if "env" is handled as a wrapper
rg "env" src/tools/shell_analysis/ -i -B 2 -A 2 | head -50

Repository: spacedriveapp/spacebot

Length of output: 2881


🏁 Script executed:

# Check the full segment_semantics for git and docker handling
sed -n '166,324p' src/tools/shell_analysis/categorizer.rs

Repository: spacedriveapp/spacebot

Length of output: 5168


🏁 Script executed:

# Let's trace through what happens with git -C repo reset --hard
cat > /tmp/test_git.py << 'EOF'
# Simulate the categorizer logic for git -C repo reset --hard

def command_words(words):
    # Only filters env assignments, not flags
    result = []
    for word in words:
        if '=' not in word:
            result.append(word)
        else:
            # Check if it's an assignment
            name, _ = word.split('=', 1)
            if not name or '/' in name:
                result.append(word)
            else:
                continue
    return result if result else []

words = ["git", "-C", "repo", "reset", "--hard", "HEAD~1"]
cw = command_words(words)
base_command = cw[0] if cw else None
subcommand = cw[1] if len(cw) > 1 else None

print(f"Input: {' '.join(words)}")
print(f"command_words result: {cw}")
print(f"base_command: {base_command}")
print(f"subcommand (words[1]): {subcommand}")
print()
print("Expected to match 'git' with subcommand 'reset':")
print(f"  base_command == 'git': {base_command == 'git'}")
print(f"  subcommand == 'reset': {subcommand == 'reset'}")
print()
print("What actually happens:")
print(f"  Match fails because subcommand is '{subcommand}' not 'reset'")
EOF
python /tmp/test_git.py

Repository: spacedriveapp/spacebot

Length of output: 409


🏁 Script executed:

# Check if there are any tests that show the expected behavior with global options
rg "git.*-C\|docker.*--context\|env.*rm" src/tools/shell_analysis/ --type rs

Repository: spacedriveapp/spacebot

Length of output: 92


🏁 Script executed:

# Let's check if there's any wrapper or global option handling elsewhere
rg -B 5 -A 10 "skip.*option\|skip.*flag\|wrapper.*exec\|normalize.*command" src/tools/shell_analysis/

Repository: spacedriveapp/spacebot

Length of output: 48


🏁 Script executed:

# Let's verify the confirmation logic actually uses has_destructive and has_network
sed -n '1,200p' src/tools/shell_analysis/analyzer.rs | grep -A 20 "confirmation_reason"

Repository: spacedriveapp/spacebot

Length of output: 1903


🏁 Script executed:

# Test with an actual example - check test to see if it covers these cases
rg "test.*git\|test.*docker\|test.*env" src/tools/shell_analysis/categorizer.rs -A 5

Repository: spacedriveapp/spacebot

Length of output: 48


🏁 Script executed:

# Let's check if there are existing tests that would catch this bug
cat > /tmp/verify_bug.py << 'EOF'
# Test the three examples from the review to confirm they bypass detection

examples = {
    "env FOO=1 rm -rf target": {
        "expected": "Destructive",
        "actual_base": "env",
        "actual_subcommand": "FOO=1",
        "will_match": False,
        "reason": "env is not a destructive command and is in SEMANTIC_NEUTRAL_COMMANDS or not matched"
    },
    "git -C repo reset --hard HEAD~1": {
        "expected": "Destructive (via git reset --hard)",
        "actual_base": "git",
        "actual_subcommand": "-C",
        "will_match": False,
        "reason": "subcommand is -C not reset, so pattern doesn't match"
    },
    "docker --context prod compose pull": {
        "expected": "Network (via docker compose pull)",
        "actual_base": "docker",
        "actual_subcommand": "--context",
        "will_match": False,
        "reason": "subcommand is --context not compose, so pattern doesn't match"
    }
}

for cmd, info in examples.items():
    print(f"Command: {cmd}")
    print(f"  Expected: {info['expected']}")
    print(f"  Actual base_command: {info['actual_base']}")
    print(f"  Actual subcommand: {info['actual_subcommand']}")
    print(f"  Will trigger confirmation: {info['will_match']}")
    print(f"  Reason: {info['reason']}")
    print()
EOF
python /tmp/verify_bug.py

Repository: spacedriveapp/spacebot

Length of output: 803


🏁 Script executed:

# Check if "env" is in SEMANTIC_NEUTRAL_COMMANDS
rg "SEMANTIC_NEUTRAL_COMMANDS" src/tools/shell_analysis/categorizer.rs -A 1

Repository: spacedriveapp/spacebot

Length of output: 310


🏁 Script executed:

# Let's verify what category "env FOO=1 rm -rf target" actually gets categorized as
cat > /tmp/test_final.py << 'EOF'
# Simulate the full categorizer for "env FOO=1 rm -rf target"

# First segment would be just "env" with words ["env", "FOO=1", "rm", "-rf", "target"]
# OR it could be parsed differently

# Let's check: command_words skips env assignments
def is_env_assignment(word):
    if '=' not in word:
        return False
    name, _ = word.split('=', 1)
    if not name or '/' in name:
        return False
    return all(c.isalnum() or c == '_' for c in name)

def command_words(words):
    for i, word in enumerate(words):
        if not is_env_assignment(word):
            return words[i:]
    return []

# Test case 1: "env FOO=1 rm -rf target" as single segment
words1 = ["env", "FOO=1", "rm", "-rf", "target"]
cw1 = command_words(words1)
print("Case 1: env FOO=1 rm -rf target (single segment)")
print(f"  Input words: {words1}")
print(f"  command_words result: {cw1}")
print(f"  base_command: {cw1[0] if cw1 else None}")
print()

# But wait - let's check the actual code again
# base_command gets the FIRST command_word via Path::file_name
# So for ["env", "FOO=1", "rm", ...], command_words gives ["env", "FOO=1", "rm", ...]
# And base_command takes the first element "env"
EOF
python /tmp/test_final.py

Repository: spacedriveapp/spacebot

Length of output: 254


🏁 Script executed:

# Let's see exactly how the segments are created - are they split by pipes/redirects?
rg -B 5 -A 10 "executable_segments\|ParsedPart" src/tools/shell_analysis/parser.rs | head -80

Repository: spacedriveapp/spacebot

Length of output: 48


🏁 Script executed:

# Check the test to see if there are any tests for wrapped commands
rg "test_" src/tools/shell_analysis/categorizer.rs | head -20

Repository: spacedriveapp/spacebot

Length of output: 48


🏁 Script executed:

# Search for executable_segments definition
rg "executable_segments" src/tools/shell_analysis/ -B 2 -A 10

Repository: spacedriveapp/spacebot

Length of output: 9639


🏁 Script executed:

# Let's look at the full parser to understand how segments are split
sed -n '100,200p' src/tools/shell_analysis/parser.rs

Repository: spacedriveapp/spacebot

Length of output: 3628


🏁 Script executed:

# Check ParsedCommand structure and executable_segments
rg "impl.*ParsedCommand" src/tools/shell_analysis/parser.rs -A 20

Repository: spacedriveapp/spacebot

Length of output: 737


🏁 Script executed:

```shell
# Let's verify the final categorization for each example
cat > /tmp/final_verification.py << 'EOF'
# Full trace through categorization logic

SEMANTIC_NEUTRAL = {":", "echo", "false", "printf", "true"}
DESTRUCTIVE_COMMANDS = {"dd", "mkfs", "rm", "shred"}
NETWORK_COMMANDS = {"apt", "apt-get", "brew", "bun", "curl", "ftp", "npm", "pip", "pip3", "pnpm", "rsync", "scp", "sftp", "ssh", "telnet", "wget", "yarn", "docker"}

def categorize(base_cmd, subcommand):
    semantics = {
        'is_destructive': False,
        'is_network': False,
        'is_neutral': False,
    }

    if base_cmd in SEMANTIC_NEUTRAL:
        semantics['is_neutral'] = True
        return semantics

    semantics['is_destructive'] = base_cmd in DESTRUCTIVE_COMMANDS
    semantics['is_network'] = base_cmd in NETWORK_COMMANDS

    # Special handling for git
    if base_cmd == 'git':
        if subcommand == 'reset':  # Simplified - in reality needs --hard check
            semantics['is_destructive'] = True

    # Special handling for docker
    if base_cmd == 'docker':
        if subcommand in ['build', 'compose', 'pull', 'push']:
            semantics['is_network'] = True

    return semantics

examples = [
    ("env", "FOO=1", "env FOO=1 rm -rf target"),
    ("git", "-C", "git -C repo reset --hard HEAD~1"),
    ("docker", "--context", "docker --context prod compose pull"),
]

for base, sub, cmd in examples:
    sem = categorize(base, sub)
    print(f"Command: {cmd}")
    print(f"  base_command: {base}")
    print(f"  subcommand (words[1]): {sub}")
    print(f"  is_destructive: {sem['is_destructive']}")
    print(f"  is_network: {sem['is_network']}")
    if sem['is_destructive']:
        print(f"  → Requires confirmation (destructive)")
    elif sem['is_network']:
        print(f"  → May require confirmation (network)")
    else:
        print(f"  → NO confirmation (safe/other)")
    print()
EOF
python /tmp/final_verification.py
```

Repository: spacedriveapp/spacebot

Length of output: 579


🏁 Script executed:

```shell
# Check if network commands actually require confirmation
grep -A 30 "fn confirmation_reason" src/tools/shell_analysis/analyzer.rs
```

Repository: spacedriveapp/spacebot

Length of output: 995


🏁 Script executed:

```shell
# Final check: does network require confirmation?
grep -B 5 -A 20 "pattern_requires_confirmation" src/tools/shell_analysis/analyzer.rs | head -50
```

Repository: spacedriveapp/spacebot

Length of output: 1549


🏁 Script executed:

```shell
# Check if network alone requires confirmation or only destructive
rg "has_network" src/tools/shell_analysis/analyzer.rs
```

Repository: spacedriveapp/spacebot

Length of output: 103


🏁 Script executed:

```shell
# Get more context around has_network usage
rg "has_network" src/tools/shell_analysis/analyzer.rs -B 5 -A 5
```

Repository: spacedriveapp/spacebot

Length of output: 381


🏁 Script executed:

```shell
# Let's create a comprehensive test to confirm the bug with all three examples
cat > /tmp/comprehensive_test.py << 'EOF'
# Comprehensive verification of the review comment

# Based on code analysis:
# 1. confirmation_reason() only adds to reasons if has_destructive OR pattern-specific
# 2. has_network affects risk_level (Caution) but NOT confirmation requirement
# 3. Confirmation is ONLY required for destructive or certain patterns

print("=" * 70)
print("VERIFICATION OF REVIEW COMMENT")
print("=" * 70)
print()

examples = [
    {
        "cmd": "env FOO=1 rm -rf target",
        "expected": "Should require confirmation (destructive)",
        "base_command": "env",
        "subcommand": "FOO=1",
        "in_destructive_set": False,
        "has_special_handling": False,
        "result": "NO confirmation required",
        "bug": True,
    },
    {
        "cmd": "git -C repo reset --hard HEAD~1",
        "expected": "Should require confirmation (destructive git reset)",
        "base_command": "git",
        "subcommand": "-C",
        "subcommand_match": "reset",
        "subcommand_matches": False,
        "result": "NO confirmation required",
        "bug": True,
    },
    {
        "cmd": "docker --context prod compose pull",
        "expected": "Should have network risk",
        "base_command": "docker",
        "subcommand": "--context",
        "in_network_set": True,
        "subcommand_match": "compose",
        "subcommand_matches": False,
        "result": "Risk=Caution, but NO confirmation required (network alone doesn't confirm)",
        "bug": True,
        "note": "Network risk detected but confirmation not triggered",
    }
]

for i, ex in enumerate(examples, 1):
    print(f"{i}. {ex['cmd']}")
    print(f"   Expected: {ex['expected']}")
    print(f"   Result:   {ex['result']}")
    print(f"   Bug:      {'YES' if ex['bug'] else 'NO'}")
    if 'note' in ex:
        print(f"   Note:     {ex['note']}")
    print()

print("=" * 70)
print("CONCLUSION: Review comment is CORRECT")
print("=" * 70)
print()
print("Root cause: The categorizer assumes:")
print("  • base_command is always the effective executable")
print("  • subcommand is always words[1]")
print()
print("This fails when:")
print("  • Wrapper executables like 'env' are used")
print("  • Global options like '-C', '--context' precede the actual subcommand")
print()
print("Impact: Confirmation gate can be bypassed with common shell patterns")
EOF
python /tmp/comprehensive_test.py
```

Repository: spacedriveapp/spacebot

Length of output: 1289


Handle command-global options and wrapper executables in subcommand matching.

The matcher assumes `words[1]` always contains the subcommand, but common patterns bypass detection:

  • `env FOO=1 rm -rf target` → base=`"env"`, subcommand=`"FOO=1"` (no confirmation despite a destructive operation)
  • `git -C repo reset --hard` → base=`"git"`, subcommand=`"-C"` (fails to match `"reset"`, no confirmation)
  • `docker --context prod compose pull` → base=`"docker"`, subcommand=`"--context"` (fails to match `"compose"`)

The `command_words()` function correctly skips environment-variable assignments but does not skip command-global options (e.g., `-C`, `--context`), so subcommand matching fails and destructive commands bypass the confirmation gate. A helper that normalizes the argument list by skipping leading command-global options before extracting the actual subcommand would close this gap.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/shell_analysis/categorizer.rs` around lines 166 - 223,
segment_semantics incorrectly assumes the subcommand is words.get(1) and misses
cases where wrappers (e.g., "env") or command-global options (e.g., "-C",
"--context") appear; add a helper (e.g., normalize_command_words) that takes the
output of command_words(&segment.words) and strips leading wrapper executables
("env", etc.), environment assignments, and any leading global-style options
(short/long flags and known global flags like -C/--context) until the real
subcommand token is reached; then replace the current subcommand derivation
(words.get(1).map(String::as_str)) with one based on the normalized list and use
that normalized list for checks in the "git", "docker", "chmod" branches (and
any flag helpers like
recursive_flag_present/force_flag_present/long_flag_present as needed).
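The normalization the prompt describes can be sketched in miniature. All names here (`normalize_command_words`, `effective_subcommand`) and the wrapper/flag tables are illustrative assumptions, not the crate's actual constants or signatures:

```rust
// Hypothetical tables; the real categorizer would maintain its own lists.
const WRAPPERS: &[&str] = &["env", "nice", "nohup", "time"];
const GLOBAL_OPTS_WITH_VALUE: &[&str] = &["-C", "--context"]; // flags that consume a value

/// Strip leading env-style assignments (FOO=1) and wrapper executables so the
/// first remaining word is the effective base command.
fn normalize_command_words(words: &[String]) -> Vec<&str> {
    let mut rest: Vec<&str> = words.iter().map(String::as_str).collect();
    while let Some(&first) = rest.first() {
        let is_assignment = first.contains('=') && !first.starts_with('-');
        if is_assignment || WRAPPERS.contains(&first) {
            rest.remove(0);
        } else {
            break;
        }
    }
    rest
}

/// Skip leading global-style options after the base command and return the
/// first positional word as the subcommand.
fn effective_subcommand<'a>(words: &[&'a str]) -> Option<&'a str> {
    let mut i = 1;
    while i < words.len() {
        let w = words[i];
        if GLOBAL_OPTS_WITH_VALUE.contains(&w) {
            i += 2; // skip the flag and the value it consumes
        } else if w.starts_with('-') {
            i += 1; // skip a bare flag
        } else {
            return Some(w);
        }
    }
    None
}

fn main() {
    let cmd: Vec<String> = ["env", "FOO=1", "rm", "-rf", "target"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    let words = normalize_command_words(&cmd);
    println!("base = {:?}", words.first()); // Some("rm"), not Some("env")
    println!("sub  = {:?}", effective_subcommand(&words));
}
```

With this normalization, all three bypass examples resolve as intended: the `env` wrapper yields base `rm` (destructive), and the `-C`/`--context` flags no longer hide the `reset` and `compose` subcommands.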

Comment on lines +241 to +253
```rust
        let analysis = self.analyzer.analyze(&args.command, &working_dir);
        if analysis.requires_confirmation {
            return Err(ShellError {
                message: format!(
                    "Command requires confirmation: {}",
                    analysis
                        .confirmation_reason
                        .as_deref()
                        .unwrap_or("the command was flagged as risky before execution")
                ),
                exit_code: -1,
            });
        }
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Check if there's any confirmation bypass mechanism elsewhere in the codebase
rg -n "requires_confirmation|confirmation_reason" --type=rs -C3
```

Repository: spacedriveapp/spacebot

Length of output: 92


🏁 Script executed:

```shell
#!/bin/bash
# Search for confirmation-related code in Rust files
rg -n "requires_confirmation|confirmation_reason" --glob="*.rs" -C3
```

Repository: spacedriveapp/spacebot

Length of output: 5534


🏁 Script executed:

```shell
#!/bin/bash
# Also check the PR description or comments about the intended behavior
git log --oneline --all -n 20 | head -20
```

Repository: spacedriveapp/spacebot

Length of output: 1270


🏁 Script executed:

```shell
#!/bin/bash
# Check if there's a bypass flag or confirmation approval mechanism
rg -n "bypass|approval|confirm.*true|force" --glob="*.rs" src/tools/shell.rs -C2
```

Repository: spacedriveapp/spacebot

Length of output: 592


🏁 Script executed:

```shell
#!/bin/bash
# Check the function signature and args for the shell tool
rg -n "struct.*Args|pub.*confirm|confirmed.*execution" --glob="*.rs" src/tools/shell.rs -C5
```

Repository: spacedriveapp/spacebot

Length of output: 444


🏁 Script executed:

```shell
#!/bin/bash
# Check if there's a separate execution path or function for confirmed commands
rg -n "async fn.*execute|async fn.*call" --glob="*.rs" src/tools/shell.rs -B2 -A10
```

Repository: spacedriveapp/spacebot

Length of output: 718


🏁 Script executed:

```shell
#!/bin/bash
# Look for the full commit message of the most recent pre-execution analysis commit
git show --format=fuller 0268b9b | head -100
```

Repository: spacedriveapp/spacebot

Length of output: 3816


🏁 Script executed:

```shell
#!/bin/bash
# Check if there are any tests that show confirmation being handled
rg -n "requires_confirmation.*true|blocked.*confirmation" --glob="*.rs" -A10 -B2
```

Repository: spacedriveapp/spacebot

Length of output: 48


🏁 Script executed:

```shell
#!/bin/bash
# Look at more complete context around the error return to see if there's any other path
sed -n '230,260p' src/tools/shell.rs
```

Repository: spacedriveapp/spacebot

Length of output: 1243


🏁 Script executed:

```shell
#!/bin/bash
# Check if there's a bypass or force flag in any of the args or config
rg -n "force|bypass|override.*confirm|skip.*check" --glob="*.rs" src/tools/
```

Repository: spacedriveapp/spacebot

Length of output: 1835


Hard-blocked commands have no recovery pathway despite the commit message promising "pending confirmation."

The PR description states "Risky commands are blocked pending confirmation," but `ShellArgs` contains no confirmation flag, and the tool returns `ShellError` immediately with no mechanism to re-execute after approval. The LLM cannot proceed even if the user intends to run the flagged command.

Either add a `confirm: bool` parameter to allow bypassing the check on confirmed calls, or clarify that this is an intentional hard block and update the commit message accordingly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/shell.rs` around lines 241 - 253, The analyzer is blocking "risky"
commands with no way to proceed; add a confirmation pathway by adding a confirm:
bool field to ShellArgs (or equivalent caller payload) and update the check in
the function that calls self.analyzer.analyze(&args.command, &working_dir) to
only return the ShellError when requires_confirmation is true AND args.confirm
is false; if args.confirm is true, allow execution to continue (or set a clear
pending state), and ensure any user-facing messages reflect that confirmation
was used (use ShellError only for unconfirmed attempts).
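The gating change the prompt describes can be shown with a hedged mock: a `confirm` field on `ShellArgs` (an assumed addition) plus simplified stand-ins for the real `Analysis`/`ShellError` types, so only the gate logic is illustrated:

```rust
// Simplified stand-ins for the real types; only the gating logic matters here.
struct ShellArgs {
    command: String,
    confirm: bool, // assumed field: set true on a retried, user-approved call
}

struct Analysis {
    requires_confirmation: bool,
    confirmation_reason: Option<String>,
}

#[derive(Debug)]
struct ShellError {
    message: String,
    exit_code: i32,
}

/// Block only when the analyzer flags the command AND the caller has not
/// already confirmed; a confirmed retry proceeds.
fn gate(args: &ShellArgs, analysis: &Analysis) -> Result<(), ShellError> {
    if analysis.requires_confirmation && !args.confirm {
        return Err(ShellError {
            message: format!(
                "Command requires confirmation: {}",
                analysis
                    .confirmation_reason
                    .as_deref()
                    .unwrap_or("the command was flagged as risky before execution")
            ),
            exit_code: -1,
        });
    }
    Ok(())
}

fn main() {
    let analysis = Analysis {
        requires_confirmation: true,
        confirmation_reason: Some("destructive command".into()),
    };
    let first = ShellArgs { command: "rm -rf target".into(), confirm: false };
    let retry = ShellArgs { command: "rm -rf target".into(), confirm: true };
    println!("first attempt blocked: {}", gate(&first, &analysis).is_err()); // true
    println!("confirmed retry runs:  {}", gate(&retry, &analysis).is_ok());  // true
}
```

The first unconfirmed call surfaces the analyzer's reason to the LLM; the retried call with `confirm: true` passes the gate, which matches the "pending confirmation" behavior the commit message promises.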
