Security Audit: Critical findings in shell scripts, sandboxing, and data handling #13

@Azarmyr

Security Audit Report

I performed a comprehensive security audit of this repository, covering every file (source, scripts, configs, docs). Here are the findings.


🔴 Critical

1. Command injection in scripts/mac/interact-element.sh
Dynamically constructs JXA code from unsanitized shell arguments via osascript -l JavaScript. Any untrusted input passed as arguments could lead to arbitrary code execution with the current user's permissions.
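
One common way to close this class of bug is to keep the JXA source constant and pass untrusted values purely as data. Below is a minimal sketch in TypeScript, assuming the script is launched from Node without a shell. `JXA_SCRIPT`, `buildOsascriptArgs`, and the click logic are hypothetical and not taken from the repository. The sketch relies on osascript handing trailing command-line arguments to the script's `run(argv)` handler.

```typescript
// Hypothetical sketch: names and the click logic are illustrative only.

// The JXA source is a constant string. Untrusted values never become part of it;
// they arrive as a single JSON-encoded argument and are parsed inside the script.
const JXA_SCRIPT = `
function run(argv) {
  const input = JSON.parse(argv[0]);
  const proc = Application("System Events").processes.byName(input.appName);
  return proc.windows[0].buttons.byName(input.elementName).click();
}`;

/** Builds the argument vector for osascript; intended for execFile, so no shell parses it. */
function buildOsascriptArgs(appName: string, elementName: string): string[] {
  // JSON output starts with "{", so it cannot be mistaken for a command-line option.
  const payload = JSON.stringify({ appName, elementName });
  return ["-l", "JavaScript", "-e", JXA_SCRIPT, payload];
}
```

The returned array could then be passed to `execFile("osascript", args, callback)` from `node:child_process`. A hostile element name stays an inert string inside the JSON payload and is never evaluated as code.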

2. Unsandboxed native desktop control (src/native-desktop.ts)
Uses @nut-tree-fork/nut-js for raw mouse/keyboard control with zero sandboxing. The agent runs with the same permissions as the user: it can click, type, and drag anywhere on screen, in any application.

3. TOCTOU vulnerability in src/index.ts
The CLI task command writes dynamically generated scripts to os.tmpdir() (typically /tmp) before executing them. This creates a Time-of-Check to Time-of-Use race condition where the script could be swapped between write and execution. Additionally, PowerShell is launched with -ExecutionPolicy Bypass on Windows.
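
A possible mitigation is to write the script into a freshly created private directory rather than directly into the shared temp directory. The sketch below is a hypothetical helper (`writePrivateScript` is not a function in the repository) and assumes a POSIX system, where `fs.mkdtempSync` creates the directory with owner-only permissions.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper, shown only to illustrate the idea.
function writePrivateScript(contents: string, fileName = "task.sh"): string {
  // mkdtempSync creates a uniquely named directory; on POSIX it is mode 0700,
  // so other local users cannot create or replace files inside it.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "task-"));
  const scriptPath = path.join(dir, fileName);
  // "wx" makes the write fail if the path already exists instead of following it.
  fs.writeFileSync(scriptPath, contents, { flag: "wx", mode: 0o700 });
  return scriptPath;
}
```

This narrows the window for a swap by another user but does not protect against processes running as the same user. Feeding the script to the interpreter over stdin would avoid the temporary file altogether.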

4. Keystroke injection surface (interact-element.ps1 + interact-element.sh)
The SendKeys/keystroke actions allow typing arbitrary text into any application. If the controlling LLM is prompt-injected (e.g., via a malicious webpage captured in a screenshot), it could instruct the agent to type and execute shell commands in a terminal.
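
One way to shrink this surface is a confirmation gate in front of the typing action. The sketch below is an assumption about how such a gate might look: `TypeRequest`, `reviewKeystrokes`, and the list of shell-like applications are all hypothetical and do not come from the repository.

```typescript
// Hypothetical confirmation gate for keystroke actions.
type TypeRequest = { targetApp: string; text: string };
type Decision = "allow" | "confirm";

// Illustrative list only; a real deployment would need its own inventory.
const SHELL_LIKE_APPS = new Set([
  "terminal", "iterm2", "windows terminal", "powershell", "cmd",
]);

function reviewKeystrokes(req: TypeRequest): Decision {
  const app = req.targetApp.trim().toLowerCase();
  // A newline or carriage return would submit whatever was typed.
  const submitsInput = /[\r\n]/.test(req.text);
  if (SHELL_LIKE_APPS.has(app) || submitsInput) {
    return "confirm"; // ask the human before sending these keystrokes
  }
  return "allow";
}
```

A gate like this does not stop prompt injection itself. It only puts a human in the loop for the keystrokes most likely to execute something.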


🟡 Suspicious

5. Screen data sent to third-party APIs
src/ai-brain.ts, src/computer-use.ts, and src/a11y-reasoner.ts send screenshots and accessibility trees to remote LLM providers. Any sensitive information visible on screen (passwords, messages, code) is transmitted. This is expected behavior for the tool's purpose, but represents a significant privacy surface with no granular controls.
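
As an illustration of what a granular control could look like, the sketch below redacts the values of sensitive fields in an accessibility tree before it leaves the machine. The `A11yNode` shape, the role names, and `redactTree` are assumptions made for this example and are not the repository's actual types.

```typescript
// Hypothetical node shape; the real accessibility tree format may differ.
interface A11yNode {
  role: string;
  name?: string;
  value?: string;
  children?: A11yNode[];
}

// Illustrative role names that would mark a field as sensitive.
const SENSITIVE_ROLES = new Set(["AXSecureTextField", "password"]);

/** Returns a copy of the tree with values of sensitive nodes replaced. */
function redactTree(node: A11yNode): A11yNode {
  const copy: A11yNode = { ...node };
  if (SENSITIVE_ROLES.has(node.role) && copy.value !== undefined) {
    copy.value = "[REDACTED]";
  }
  if (node.children) {
    copy.children = node.children.map(redactTree);
  }
  return copy;
}
```

This only covers structured data. Screenshots would need a separate mechanism, such as masking regions or excluding specific applications.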

6. Full browser control via CDP (src/browser-layer.ts, src/cdp-driver.ts)
Connects to Chrome via 127.0.0.1:9222 (remote debugging port). CDP grants complete browser control: navigation, page content reading, arbitrary JavaScript execution via page.evaluate(), bypassing Same-Origin Policy.

7. Dynamic JXA execution in scripts/mac/get-ui-tree.sh
Same pattern as interact-element.sh: constructs JXA dynamically from shell arguments without input sanitization.

8. SKILL.md autonomy directives
The skill file instructs the AI to operate with very high autonomy: "If a human can do it on a screen, you can too", "Be independent." While safety rails exist for destructive actions, the default stance encourages proactive operation without confirmation.

9. safety.ts relies on simple regex
The safety module uses basic regex pattern matching against LLM-generated output. This is easily bypassed and provides limited real-world protection.
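
To make the bypass concern concrete, the sketch below contrasts a denylist regex with an allowlist of actions. The pattern and the action names are invented for illustration and are not copied from safety.ts.

```typescript
// Hypothetical denylist pattern, invented for this example.
const DENY = /rm\s+-rf\s+\//;

/** Denylist check: anything the pattern does not recognize is let through. */
function denylistAllows(command: string): boolean {
  return !DENY.test(command);
}

// Hypothetical allowlist of structured action names.
const ALLOWED_ACTIONS = new Set(["click", "type", "scroll"]);

/** Allowlist check: anything not explicitly listed is rejected. */
function allowlistAllows(action: string): boolean {
  return ALLOWED_ACTIONS.has(action);
}
```

With this pattern, `rm -rf /` is blocked, but the equivalent `rm -r -f /` and `rm -fr /` pass the denylist unchanged. The allowlist fails closed, so an unexpected action such as `exec` is rejected by default.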


🟢 Clean

  • All config files (package.json, tsconfig, .env.example, .gitignore, LICENSE): no issues
  • No postinstall hooks or suspicious dependencies
  • All pure JXA files: standard UI automation, no injection vectors
  • Docs, HTML, SVG, CNAME, perf patches: no prompt injection, no hidden content
  • No hardcoded API keys or secrets in source code

Methodology

  • Full file-by-file review of all 59 files in the repository
  • Automated and manual analysis of all TypeScript source, PowerShell scripts, JXA/Bash scripts, configs, and documentation
  • Checked for: obfuscation, network calls, code execution, data exfiltration, credential handling, crypto operations, supply chain risks, prompt injection, hidden content

Great project concept, just flagging these for awareness. Happy to discuss any of these findings.
