3 changes: 3 additions & 0 deletions docs/design-docs/production-worker-failures.md
Original file line number Diff line number Diff line change
@@ -72,6 +72,9 @@ Shell `find` command traversed node_modules directory. Returned 5,000+ entries (
**Impact:**
Single tool call consumed ~8,000 tokens. Multiple such calls in sequence rapidly approached context limit.

**Current Mitigation:**
The shell tool now emits pre-execution `analysis` metadata with command category, risk level, duration hint, and UX flags like `collapsed_by_default` and `expects_no_output`. That lets downstream UI code collapse search/read/list output and render silent successes as `Done` without re-parsing the raw command string.
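As a rough illustration of how downstream UI code might consume those flags, here is a minimal sketch. Only the flag names `collapsed_by_default` and `expects_no_output` come from the paragraph above; the `AnalysisFlags` struct, the `Render` enum, and `choose_render` are hypothetical names invented for this example and are not part of the codebase.

```rust
/// Hypothetical, trimmed-down view of the `analysis` metadata. Only the two
/// flag names are taken from the design doc; the struct itself is assumed.
struct AnalysisFlags {
    collapsed_by_default: bool,
    expects_no_output: bool,
}

/// Illustrative rendering decision; not a type from the codebase.
#[derive(Debug, PartialEq, Eq)]
enum Render {
    /// Silent success on a command that was not expected to print anything.
    Done,
    /// Successful search/read/list output, folded behind a line count.
    Collapsed { line_count: usize },
    /// Everything else, including failures, is shown in full.
    Expanded,
}

/// Decide how to render a finished command from the flags alone,
/// without looking at the raw command string.
fn choose_render(flags: &AnalysisFlags, exit_code: i32, stdout: &str, stderr: &str) -> Render {
    let silent = stdout.is_empty() && stderr.is_empty();
    if silent && exit_code == 0 && flags.expects_no_output {
        return Render::Done;
    }
    if flags.collapsed_by_default && exit_code == 0 {
        return Render::Collapsed {
            line_count: stdout.lines().count(),
        };
    }
    Render::Expanded
}

fn main() {
    let listing = AnalysisFlags { collapsed_by_default: true, expects_no_output: false };
    let mkdir = AnalysisFlags { collapsed_by_default: false, expects_no_output: true };

    println!("{:?}", choose_render(&listing, 0, "a\nb\nc\n", ""));
    println!("{:?}", choose_render(&mkdir, 0, "", ""));
    println!("{:?}", choose_render(&listing, 2, "", "find: permission denied"));
}
```

Keeping failures expanded is a choice made for this sketch; the design doc does not say how failed commands are rendered.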

---

### Working Directory Mismatch
4 changes: 2 additions & 2 deletions prompts/en/tools/shell_description.md.j2
@@ -1,5 +1,5 @@
Execute a shell command. Use this for file operations, running scripts, building projects, git commands, running subprocesses, and any system-level operations. Be careful with destructive operations. The command runs with a 60 second timeout by default.
Execute a shell command. Use this for file operations, running scripts, building projects, git commands, running subprocesses, and any system-level operations. Commands are analyzed before execution, and destructive or suspicious patterns may be rejected pending confirmation. The command runs with a 60 second timeout by default.

Use the optional `env` parameter to set per-command environment variables (e.g. `[{"key": "RUST_LOG", "value": "debug"}]`). Dangerous variables that enable library injection (LD_PRELOAD, NODE_OPTIONS, etc.) are blocked.

To install tools that persist across restarts, place binaries in the persistent tools directory at $SPACEBOT_DIR/tools/bin (already on PATH). For example: `curl -fsSL https://example.com/tool -o $SPACEBOT_DIR/tools/bin/tool && chmod +x $SPACEBOT_DIR/tools/bin/tool`
To install tools that persist across restarts, place binaries in the persistent tools directory at $SPACEBOT_DIR/tools/bin (already on PATH). For example: `curl -fsSL https://example.com/tool -o $SPACEBOT_DIR/tools/bin/tool && chmod +x $SPACEBOT_DIR/tools/bin/tool`
9 changes: 6 additions & 3 deletions src/agent/channel.rs
@@ -944,9 +944,11 @@ impl Channel {
"/quiet" | "/observe" => {
self.set_response_mode(ResponseMode::Observe).await;
self.send_builtin_text(
"observe mode enabled. i'll learn from this conversation but won't respond.".to_string(),
"observe mode enabled. i'll learn from this conversation but won't respond."
.to_string(),
"observe",
).await;
)
.await;
return Ok(true);
}
"/active" => {
@@ -976,7 +978,8 @@ impl Channel {
"- /tasks: ready task list".to_string(),
"- /digest: one-shot day digest (00:00 -> now)".to_string(),
"- /observe: learn from conversation, never respond".to_string(),
"- /mention-only: only respond when @mentioned, replied to, or given a command".to_string(),
"- /mention-only: only respond when @mentioned, replied to, or given a command"
.to_string(),
"- /active: normal reply mode".to_string(),
"- /agent-id: runtime agent id".to_string(),
];
4 changes: 3 additions & 1 deletion src/config/load.rs
@@ -137,7 +137,9 @@ fn parse_response_mode(
// Backwards compat: listen_only_mode maps to response_mode
match listen_only_mode {
Some(true) => {
tracing::warn!("listen_only_mode is deprecated, use response_mode = \"observe\" instead");
tracing::warn!(
"listen_only_mode is deprecated, use response_mode = \"observe\" instead"
);
Some(ResponseMode::Observe)
}
Some(false) => Some(ResponseMode::Active),
4 changes: 4 additions & 0 deletions src/tools.rs
@@ -54,6 +54,7 @@ pub mod send_file;
pub mod send_message_to_another_channel;
pub mod set_status;
pub mod shell;
pub mod shell_analysis;
pub mod skills_search;
pub mod skip;
pub mod spacebot_docs;
@@ -128,6 +129,9 @@ pub use send_message_to_another_channel::{
};
pub use set_status::{SetStatusArgs, SetStatusError, SetStatusOutput, SetStatusTool, StatusKind};
pub use shell::{EnvVar, ShellArgs, ShellError, ShellOutput, ShellResult, ShellTool};
pub use shell_analysis::{
CommandAnalysis, CommandCategory, DetectedPattern, DurationHint, PatternType, RiskLevel,
};
pub use skills_search::{
SkillsSearchArgs, SkillsSearchError, SkillsSearchOutput, SkillsSearchTool,
};
49 changes: 42 additions & 7 deletions src/tools/shell.rs
@@ -1,10 +1,12 @@
//! Shell tool for executing shell commands and subprocesses (task workers only).
//!
//! This is the unified execution tool — it replaces the previous `shell` + `exec`
//! split. Commands run through `sh -c` with optional per-command environment
//! variables. Dangerous env vars that enable library injection are blocked.
//! split. Commands are analyzed before execution, then run through `sh -c` with
//! optional per-command environment variables. Dangerous env vars that enable
//! library injection are blocked.

use crate::sandbox::Sandbox;
use crate::tools::shell_analysis::{CommandAnalysis, ShellAnalyzer};
use rig::completion::ToolDefinition;
use rig::tool::Tool;
use schemars::JsonSchema;
@@ -37,12 +39,19 @@ const DANGEROUS_ENV_VARS: &[&str] = &[
pub struct ShellTool {
workspace: PathBuf,
sandbox: Arc<Sandbox>,
analyzer: ShellAnalyzer,
}

impl ShellTool {
/// Create a new shell tool with sandbox containment.
pub fn new(workspace: PathBuf, sandbox: Arc<Sandbox>) -> Self {
Self { workspace, sandbox }
let analyzer = ShellAnalyzer::new(workspace.clone());

Self {
workspace,
sandbox,
analyzer,
}
}
}

@@ -98,6 +107,8 @@ pub struct ShellOutput {
pub stderr: String,
/// Formatted summary for LLM consumption.
pub summary: String,
/// Pre-execution analysis metadata for UI and worker logic.
pub analysis: CommandAnalysis,
}

impl Tool for ShellTool {
@@ -227,6 +238,20 @@ impl Tool for ShellTool {
}
}

let analysis = self.analyzer.analyze(&args.command, &working_dir);

Review comment (Contributor):

Heads-up: analysis.requires_confirmation currently hard-blocks execution, but ShellArgs doesn't include any confirm/ack flag. If the intention is “ask user then re-run”, we probably need an explicit arg (default false) or a structured “needs_confirmation” error that carries analysis so the caller can display details and retry.
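
The second option in this comment, a structured error that carries the analysis, could look roughly like the sketch below. It is an assumption-laden illustration: `ShellFailure`, `gate`, and `describe` are hypothetical names, and `CommandAnalysis` here is a stand-in holding only the two fields visible in the diff (`requires_confirmation`, `confirmation_reason`), not the real type.

```rust
/// Stand-in for the real `CommandAnalysis`. Only these two fields appear in
/// the diff; the real type has more.
#[derive(Debug, Clone, PartialEq)]
struct CommandAnalysis {
    requires_confirmation: bool,
    confirmation_reason: Option<String>,
}

/// Hypothetical error shape: the confirmation case keeps the full analysis
/// so a caller can show the details and decide whether to retry.
#[derive(Debug, PartialEq)]
enum ShellFailure {
    NeedsConfirmation { analysis: CommandAnalysis },
    Failed { message: String, exit_code: i32 },
}

/// Gate a command on its pre-execution analysis.
fn gate(analysis: CommandAnalysis) -> Result<CommandAnalysis, ShellFailure> {
    if analysis.requires_confirmation {
        Err(ShellFailure::NeedsConfirmation { analysis })
    } else {
        Ok(analysis)
    }
}

/// Turn a failure into a short message for the caller.
fn describe(failure: &ShellFailure) -> String {
    match failure {
        ShellFailure::NeedsConfirmation { analysis } => format!(
            "needs confirmation: {}",
            analysis
                .confirmation_reason
                .as_deref()
                .unwrap_or("flagged as risky")
        ),
        ShellFailure::Failed { message, exit_code } => {
            format!("failed ({}): {}", exit_code, message)
        }
    }
}

fn main() {
    let risky = CommandAnalysis {
        requires_confirmation: true,
        confirmation_reason: Some("recursive delete".to_string()),
    };
    if let Err(failure) = gate(risky) {
        println!("{}", describe(&failure));
    }
}
```

Compared with a plain message string, a dedicated variant lets the caller tell "blocked pending confirmation" apart from an ordinary failure without parsing text.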

if analysis.requires_confirmation {
return Err(ShellError {
message: format!(
"Command requires confirmation: {}",
analysis
.confirmation_reason
.as_deref()
.unwrap_or("the command was flagged as risky before execution")
),
exit_code: -1,
});
}
Comment on lines +241 to +253
Review comment (Contributor):

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if there's any confirmation bypass mechanism elsewhere in the codebase
rg -n "requires_confirmation|confirmation_reason" --type=rs -C3

Repository: spacedriveapp/spacebot

Length of output: 92


🏁 Script executed:

#!/bin/bash
# Search for confirmation-related code in Rust files
rg -n "requires_confirmation|confirmation_reason" --glob="*.rs" -C3

Repository: spacedriveapp/spacebot

Length of output: 5534


🏁 Script executed:

#!/bin/bash
# Also check the PR description or comments about the intended behavior
git log --oneline --all -n 20 | head -20

Repository: spacedriveapp/spacebot

Length of output: 1270


🏁 Script executed:

#!/bin/bash
# Check if there's a bypass flag or confirmation approval mechanism
rg -n "bypass|approval|confirm.*true|force" --glob="*.rs" src/tools/shell.rs -C2

Repository: spacedriveapp/spacebot

Length of output: 592


🏁 Script executed:

#!/bin/bash
# Check the function signature and args for the shell tool
rg -n "struct.*Args|pub.*confirm|confirmed.*execution" --glob="*.rs" src/tools/shell.rs -C5

Repository: spacedriveapp/spacebot

Length of output: 444


🏁 Script executed:

#!/bin/bash
# Check if there's a separate execution path or function for confirmed commands
rg -n "async fn.*execute|async fn.*call" --glob="*.rs" src/tools/shell.rs -B2 -A10

Repository: spacedriveapp/spacebot

Length of output: 718


🏁 Script executed:

#!/bin/bash
# Look for the full commit message of the most recent pre-execution analysis commit
git show --format=fuller 0268b9b | head -100

Repository: spacedriveapp/spacebot

Length of output: 3816


🏁 Script executed:

#!/bin/bash
# Check if there are any tests that show confirmation being handled
rg -n "requires_confirmation.*true|blocked.*confirmation" --glob="*.rs" -A10 -B2

Repository: spacedriveapp/spacebot

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Look at more complete context around the error return to see if there's any other path
sed -n '230,260p' src/tools/shell.rs

Repository: spacedriveapp/spacebot

Length of output: 1243


🏁 Script executed:

#!/bin/bash
# Check if there's a bypass or force flag in any of the args or config
rg -n "force|bypass|override.*confirm|skip.*check" --glob="*.rs" src/tools/

Repository: spacedriveapp/spacebot

Length of output: 1835


Hard-blocked commands have no recovery pathway despite commit message promising "pending confirmation."

The PR description states "Risky commands are blocked pending confirmation," but ShellArgs contains no confirmation flag, and the tool returns ShellError immediately with no mechanism to re-execute after approval. The LLM cannot proceed even if the user intends to run the flagged command.

Either add a confirm: bool parameter to allow bypassing the check on confirmed calls, or clarify that this is an intentional hard block and update the commit message accordingly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/shell.rs` around lines 241 - 253, The analyzer is blocking "risky"
commands with no way to proceed; add a confirmation pathway by adding a confirm:
bool field to ShellArgs (or equivalent caller payload) and update the check in
the function that calls self.analyzer.analyze(&args.command, &working_dir) to
only return the ShellError when requires_confirmation is true AND args.confirm
is false; if args.confirm is true, allow execution to continue (or set a clear
pending state), and ensure any user-facing messages reflect that confirmation
was used (use ShellError only for unconfirmed attempts).
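
The `confirm` flag suggested above can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the `confirm` field does not exist in the real `ShellArgs`, `check_confirmation` is a hypothetical helper, and both structs are reduced stand-ins. The error text and fallback reason are copied from the diff.

```rust
/// Reduced stand-in for `ShellArgs`. The `confirm` field is the hypothetical
/// addition proposed in the review; the real struct has other fields.
struct ShellArgs {
    command: String,
    confirm: bool,
}

/// Reduced stand-in for `CommandAnalysis` with the two fields used by the gate.
struct CommandAnalysis {
    requires_confirmation: bool,
    confirmation_reason: Option<String>,
}

/// Reject only unconfirmed risky commands; confirmed ones fall through.
fn check_confirmation(args: &ShellArgs, analysis: &CommandAnalysis) -> Result<(), String> {
    if analysis.requires_confirmation && !args.confirm {
        return Err(format!(
            "Command requires confirmation: {}",
            analysis
                .confirmation_reason
                .as_deref()
                .unwrap_or("the command was flagged as risky before execution")
        ));
    }
    Ok(())
}

fn main() {
    let analysis = CommandAnalysis {
        requires_confirmation: true,
        confirmation_reason: None,
    };

    // First attempt: blocked, the caller is expected to ask the user.
    let first = ShellArgs { command: "rm -rf build".to_string(), confirm: false };
    println!("{} -> {:?}", first.command, check_confirmation(&first, &analysis));

    // Retry after approval: allowed to proceed.
    let retry = ShellArgs { command: "rm -rf build".to_string(), confirm: true };
    println!("{} -> {:?}", retry.command, check_confirmation(&retry, &analysis));
}
```

One design caveat: if the model can set `confirm` itself, the flag is only a speed bump. Whether the caller or the user should own that flag is not settled in this thread.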


// Build per-command env map for sandbox-aware injection. The sandbox
// injects these via --setenv (bubblewrap) or .env() (other backends),
// so they always reach the inner sandboxed process.
@@ -270,20 +295,26 @@ impl Tool for ShellTool {
let exit_code = output.status.code().unwrap_or(-1);
let success = output.status.success();

let summary = format_shell_output(exit_code, &stdout, &stderr);
let summary = format_shell_output(exit_code, &stdout, &stderr, analysis.expects_no_output);

Ok(ShellOutput {
success,
exit_code,
stdout,
stderr,
summary,
analysis,
})
}
}

/// Format shell output for display.
fn format_shell_output(exit_code: i32, stdout: &str, stderr: &str) -> String {
fn format_shell_output(
exit_code: i32,
stdout: &str,
stderr: &str,
expects_no_output: bool,
) -> String {
let mut output = String::new();

output.push_str(&format!("Exit code: {}\n", exit_code));
@@ -299,7 +330,11 @@ fn format_shell_output(exit_code: i32, stdout: &str, stderr: &str) -> String {
}

if stdout.is_empty() && stderr.is_empty() {
output.push_str("\n[No output]\n");
if exit_code == 0 && expects_no_output {
output.push_str("\nDone\n");
} else {
output.push_str("\n[No output]\n");
}
}

output
@@ -354,6 +389,6 @@ pub struct ShellResult {
impl ShellResult {
/// Format as a readable string for LLM consumption.
pub fn format(&self) -> String {
format_shell_output(self.exit_code, &self.stdout, &self.stderr)
format_shell_output(self.exit_code, &self.stdout, &self.stderr, false)
}
}
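
To make the new empty-output branch easier to check, the sketch below isolates just that decision. `empty_output_marker` is a hypothetical helper written for this example; the real `format_shell_output` also emits the exit code and any stdout/stderr, parts of which are elided in this diff and are not reproduced here.

```rust
/// Hypothetical extraction of the empty-output branch only: which marker is
/// appended when both stdout and stderr are empty.
fn empty_output_marker(exit_code: i32, expects_no_output: bool) -> &'static str {
    if exit_code == 0 && expects_no_output {
        "Done"
    } else {
        "[No output]"
    }
}

fn main() {
    // Walk the four combinations of success and the `expects_no_output` flag.
    for (exit_code, expects_no_output) in [(0, true), (0, false), (1, true), (1, false)] {
        println!(
            "exit_code={} expects_no_output={} -> {}",
            exit_code,
            expects_no_output,
            empty_output_marker(exit_code, expects_no_output)
        );
    }
}
```

Because `ShellResult::format` passes `false`, that path keeps reporting `[No output]` even for silent successes; only the `ShellTool` call path, which forwards `analysis.expects_no_output`, can produce `Done`.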
12 changes: 12 additions & 0 deletions src/tools/shell_analysis.rs
@@ -0,0 +1,12 @@
//! Pre-execution analysis for shell commands.

mod analyzer;
mod categorizer;
mod parser;
mod security;
mod types;

pub(crate) use analyzer::ShellAnalyzer;
pub use types::{
CommandAnalysis, CommandCategory, DetectedPattern, DurationHint, PatternType, RiskLevel,
};