WASM shell commands spawn a new Node.js process + JIT compile per command #1440

@atsushi-ishibashi

Description

Summary

Every WASM shell command (ls, cat, pwd, etc.) spawns a new Node.js child process and runs WebAssembly.compile() on the full module binary from scratch. On environments with low single-thread performance or high I/O latency (e.g., ECS Fargate with EFS mounts), this per-command overhead reaches ~5 seconds — causing a 15-command agent session to hit the 120-second ACP timeout.

The existing openShell() API already provides a persistent shell mechanism, but exec() does not use it. Routing exec through a persistent shell would reduce overhead to a single spawn at session start.

Per-command execution flow

Each call to exec("ls /workspace") follows this path:

  1. NativeKernelProxy.exec() → resolveExecCommand() wraps as sh -c 'ls /workspace' (native-kernel-proxy.ts:357-406)
  2. NativeKernelProxy.spawn() → sidecar RPC (native-kernel-proxy.ts:408)
  3. WasmExecutionEngine::start_execution() (wasm.rs:355-450)
  4. create_node_child() → Command::new(node_binary()).spawn() → new OS process (wasm.rs:486-524)
  5. fs.readFile(modulePath) — reads multi-MB WASM binary from disk (node_import_cache.rs:7886)
  6. WebAssembly.compile(moduleBytes) — full JIT compile (node_import_cache.rs:7889)
  7. WASI init → execute → process exits — compiled module discarded

Steps 4–7 repeat identically for every ls, cat, pwd. There is no process or module reuse.

Why existing caches don't help

| Mechanism | What it caches | Helps with WASM compile? | Helps with process spawn? |
|---|---|---|---|
| Prewarm (wasm.rs:527) | Marker file to skip re-prewarm | No — compiled module discarded on process.exit(0) (node_import_cache.rs:7891) | No |
| Node.js compile cache (runtime_support.rs:33) | JS source → V8 bytecode | No — JS only, not WASM | No |
| Import cache (NodeImportCache) | Runner/loader files on disk | No | No |
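The first row is the key constraint: a compiled WebAssembly.Module cannot outlive the process that compiled it, so no marker-file cache can skip the JIT step as long as each command gets a fresh process. A persistent process avoids it structurally — compile once, instantiate per command. A minimal self-contained Node.js illustration (not the project's code):

```typescript
// Illustration only: in a long-lived process, the expensive compile happens
// once and every subsequent "command" reuses the compiled module. A
// short-lived child process must redo the compile each time.

// Smallest valid WASM binary: magic "\0asm" + version 1, no sections.
const wasmBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Expensive step — paid once per process lifetime.
const compiled = new WebAssembly.Module(wasmBytes);

// Cheap step — paid once per command; reuses the already-compiled code.
function runCommand(): WebAssembly.Instance {
  return new WebAssembly.Instance(compiled);
}

for (let i = 0; i < 3; i++) runCommand(); // no recompilation across calls
```

This is exactly the property the persistent-shell proposal below exploits: keep the process (and thus the compiled module) alive across commands.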

Impact in resource-constrained environments

On macOS (Apple Silicon), the per-command overhead is ~19ms — invisible in practice. But on environments with weaker single-thread CPU or network-attached storage, the fixed costs scale dramatically.

Example: ECS Fargate with EFS workspace (from production logs):

| Operation | Time |
|---|---|
| ls /workspace | ~5.3s |
| mkdir -p /workspace | ~5.5s |
| pwd | ~4.3s |
| 15 commands total | ~75s → ACP timeout at 120s |

Contributing factors on such environments:

  • WASM binary read from network filesystem: ~200-500ms (vs ~3ms local SSD)
  • WebAssembly.compile() on lower-clock vCPU: ~500-2000ms (vs ~5-10ms Apple Silicon)
  • Node.js process spawn overhead: ~100-500ms (vs ~5ms)

Proposed fix: route exec through a persistent shell

openShell() (native-kernel-proxy.ts:499-544) already spawns a persistent sh process with stdin write and stdout callbacks. AgentOs exposes it publicly (agent-os.ts:1814-1849). The coreutils WASM binary is a multicall (BusyBox-style) binary, so ls, cat, etc. execute as builtins within the persistent sh without additional WASM spawns.

Current:  [Node spawn → WASM read → JIT compile → WASI init → sh -c "ls"] × N
Proposed: [Node spawn → WASM read → JIT compile → WASI init → sh]          × 1
          [stdin "ls\n" → stdout]                                           × N

Implementation options (broadest-impact first):

  1. NativeKernelProxy.exec() (native-kernel-proxy.ts:357) — intercept sh -c commands and route through a lazily-initialized persistent shell
  2. child_process polyfill bridge (node_import_cache.rs:3693) — benefits guest code calling child_process.spawn("sh", ...) (agent SDKs like Claude CLI use this path)
  3. AgentOs.exec() (agent-os.ts) — simplest, benefits direct callers only

Design considerations:

  • Output boundary detection: delimiter pattern (e.g., printf "___DELIM_%d___\n" "$?") to mark command end and capture exit code
  • Fallback: commands needing streaming onStdout/onStderr callbacks bypass the persistent shell
  • Initialization sync: echo a known token after spawn, wait for it before accepting commands
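The delimiter idea from the first bullet can be sketched as a standalone parser (hypothetical helper, not from the codebase; the marker format here assumes a per-command id appended to the pattern shown above). After writing the command to the shell's stdin, the caller also writes `printf "___DELIM_<id>_%d___\n" "$?"` and scans the accumulated output for the marker:

```typescript
// Scan the shell's accumulated stdout for the end-of-command marker.
// Returns null while the command is still running; otherwise splits off
// the captured output and the exit code embedded in the marker.
function parseShellOutput(
  buffer: string,
  delimId: string,
): { stdout: string; exitCode: number } | null {
  const m = buffer.match(new RegExp(`___DELIM_${delimId}_(\\d+)___`));
  if (!m || m.index === undefined) return null; // marker not seen yet
  return {
    stdout: buffer.slice(0, m.index).replace(/\n$/, ""),
    exitCode: parseInt(m[1], 10),
  };
}
```

A unique id per command guards against a stale marker from a previous command matching the current scan; the PoC patch below uses the same approach with a counter-based id.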

Verification

Note: The benchmarks below were run on a local macOS (Apple Silicon) machine — not on Fargate or similar resource-constrained environments. To validate the persistent shell approach without access to such environments, the simulated-latency benchmark injects an artificial delay into kernel.exec() to approximate the per-spawn overhead observed in production logs. The absolute numbers differ from real Fargate, but the relative improvement (spawn bypass) accurately reflects the mechanism.

Benchmark: baseline overhead per layer

The following script measures per-layer overhead of WASM command execution. Install @rivet-dev/agent-os-core and @rivet-dev/agent-os-common, then run with node --import tsx bench-wasm-exec.ts.

bench-wasm-exec.ts
import { AgentOs } from "@rivet-dev/agent-os-core";
import type { SoftwareInput } from "@rivet-dev/agent-os-core";
import common from "@rivet-dev/agent-os-common";

const software: SoftwareInput[] = [common];

interface BenchResult {
  label: string;
  durationMs: number;
  stdout?: string;
  stderr?: string;
}

async function bench(
  label: string,
  fn: () => Promise<{ stdout?: string; stderr?: string } | void>,
): Promise<BenchResult> {
  const start = performance.now();
  const result = await fn();
  const durationMs = performance.now() - start;
  return { label, durationMs, stdout: result?.stdout, stderr: result?.stderr };
}

function printResult(r: BenchResult) {
  const ms = r.durationMs.toFixed(1);
  console.log(`  ${r.label.padEnd(45)} ${ms.padStart(8)} ms`);
}

async function main() {
  console.log("=== WASM command execution benchmark ===\n");
  console.log(`Platform: ${process.platform} ${process.arch}`);
  console.log(`Node.js:  ${process.version}\n`);

  // Phase 1: VM creation
  console.log("--- Phase 1: VM creation ---");
  const vmStart = performance.now();
  const vm = await AgentOs.create({ software });
  console.log(`  AgentOs.create() ${(performance.now() - vmStart).toFixed(1).padStart(30)} ms`);
  await vm.mkdir("/workspace", { recursive: true });

  // Phase 2: Kernel API baseline (non-WASM)
  console.log("\n--- Phase 2: Kernel API baseline ---");
  const kernelResults: BenchResult[] = [];
  await vm.writeFile("/workspace/test.txt", "Hello, World!\n".repeat(7));
  kernelResults.push(await bench("kernel.readFile() (100B)", () => vm.readFile("/workspace/test.txt").then(() => {})));
  kernelResults.push(await bench("kernel.stat()", () => vm.stat("/workspace/test.txt").then(() => {})));
  kernelResults.push(await bench("kernel.readdir()", () => vm.readdir("/workspace").then(() => {})));
  kernelResults.push(await bench("kernel.exists()", () => vm.exists("/workspace/test.txt").then(() => {})));
  for (const r of kernelResults) printResult(r);

  // Phase 3: WASM cold start
  console.log("\n--- Phase 3: WASM cold start ---");
  const cold = await bench("exec('echo hello') — cold", () => vm.exec("echo hello"));
  printResult(cold);

  // Phase 4: WASM warm (repeated)
  console.log("\n--- Phase 4: WASM warm (×5) ---");
  const warmResults: BenchResult[] = [];
  for (let i = 0; i < 5; i++) {
    warmResults.push(await bench(`exec('echo hello') — warm #${i + 1}`, () => vm.exec("echo hello")));
  }
  for (const r of warmResults) printResult(r);

  // Phase 5: Various commands
  console.log("\n--- Phase 5: Various commands (warm) ---");
  for (const cmd of ["ls /workspace", "cat /workspace/test.txt", "pwd", "echo test", "wc -c /workspace/test.txt"]) {
    printResult(await bench(`exec('${cmd}')`, () => vm.exec(cmd)));
  }

  // Phase 6: Kernel vs WASM comparison
  console.log("\n--- Phase 6: Kernel API vs WASM ---");
  const kr = await bench("kernel.readFile()", () => vm.readFile("/workspace/test.txt").then(() => {}));
  const wr = await bench("exec('cat ...')", () => vm.exec("cat /workspace/test.txt"));
  printResult(kr);
  printResult(wr);
  console.log(`  → WASM overhead: ${(wr.durationMs / kr.durationMs).toFixed(0)}x\n`);

  // Phase 7: Batch throughput (15 commands)
  console.log("--- Phase 7: 15-command batch ---");
  const times: number[] = [];
  for (let i = 0; i < 15; i++) {
    const s = performance.now();
    await vm.exec(`echo iteration_${i}`);
    times.push(performance.now() - s);
  }
  const total = times.reduce((a, b) => a + b, 0);
  console.log(`  Total:   ${total.toFixed(1)} ms`);
  console.log(`  Average: ${(total / times.length).toFixed(1)} ms/cmd`);
  console.log(`  Min:     ${Math.min(...times).toFixed(1)} ms`);
  console.log(`  Max:     ${Math.max(...times).toFixed(1)} ms`);
  console.log(`  120s timeout reached: ${total > 120000 ? "YES" : "NO"}`);

  await vm.dispose();
}

main().catch((err) => { console.error(err); process.exit(1); });

Results (macOS Apple Silicon, Node.js v24.14.1):

Kernel API (readFile, stat):  ~0.1 ms
WASM exec warm (echo):        ~19 ms
WASM exec warm (ls):          ~48 ms
15× echo batch:               283 ms (18.9 ms/cmd)

Benchmark: simulated spawn latency (before/after persistent shell)

To validate the persistent shell approach without a Fargate environment, this benchmark injects an artificial 500ms delay into kernel.exec() to simulate the per-spawn overhead. The persistent shell patch bypasses kernel.exec() after the first shell initialization, so only the initial spawn incurs the delay.

Run without patch for "before", apply patch then re-run for "after".

bench-simulated-latency.ts
import { AgentOs } from "@rivet-dev/agent-os-core";
import type { AgentOs as AgentOsType, SoftwareInput } from "@rivet-dev/agent-os-core";
import common from "@rivet-dev/agent-os-common";

const software: SoftwareInput[] = [common];
const SPAWN_DELAY_MS = parseInt(process.env.SIMULATED_SPAWN_DELAY_MS ?? "500", 10);

function injectSpawnDelay(vm: AgentOsType, delayMs: number) {
  const kernel = (vm as any).kernel;
  if (!kernel) { console.warn("[warn] kernel not accessible"); return; }
  const originalKernelExec = kernel.exec.bind(kernel);
  kernel.exec = async (command: string, options?: any) => {
    await new Promise((r) => setTimeout(r, delayMs));
    return originalKernelExec(command, options);
  };
}

async function main() {
  console.log("=== Simulated spawn-latency before/after benchmark ===\n");
  console.log(`Platform:        ${process.platform} ${process.arch}`);
  console.log(`Node.js:         ${process.version}`);
  console.log(`Simulated delay: ${SPAWN_DELAY_MS} ms/spawn\n`);

  const vm = await AgentOs.create({ software });
  await vm.mkdir("/workspace", { recursive: true });
  await vm.writeFile("/workspace/test.txt", "Hello World\n".repeat(10));

  // Detect persistent shell patch
  await vm.exec("echo detect");
  const isPatched = !!(vm as any).__persistentShell;
  const mode = isPatched ? "AFTER (persistent shell)" : "BEFORE (original exec)";
  console.log(`Mode: ${mode}\n`);

  // Inject delay AFTER patch detection (so init shell spawn is unaffected)
  injectSpawnDelay(vm, SPAWN_DELAY_MS);

  // echo ×15
  console.log("--- echo ×15 ---");
  const warmupStart = performance.now();
  await vm.exec("echo warmup");
  console.log(`  Warmup:  ${(performance.now() - warmupStart).toFixed(1)} ms`);

  const times: number[] = [];
  for (let i = 0; i < 15; i++) {
    const s = performance.now();
    await vm.exec(`echo iteration_${i}`);
    times.push(performance.now() - s);
  }
  const total = times.reduce((a, b) => a + b, 0);
  console.log(`  Total:   ${total.toFixed(1)} ms`);
  console.log(`  Average: ${(total / times.length).toFixed(1)} ms/cmd`);

  // Various commands
  console.log("\n--- Various commands ---");
  for (const cmd of ["ls /workspace", "cat /workspace/test.txt", "pwd", "echo hello", "wc -c /workspace/test.txt"]) {
    const s = performance.now();
    await vm.exec(cmd);
    console.log(`  ${cmd.padEnd(35)} ${(performance.now() - s).toFixed(1).padStart(8)} ms`);
  }

  // Summary
  console.log(`\n=== Summary ===`);
  console.log(`Mode:    ${mode}`);
  console.log(`echo×15: ${total.toFixed(1)} ms (${(total / 15).toFixed(1)} ms/cmd)`);
  if (!isPatched) {
    console.log(`→ ${((SPAWN_DELAY_MS * 15 / total) * 100).toFixed(0)}% of time is spawn overhead`);
  }

  await vm.dispose();
}

main().catch((err) => { console.error(err); process.exit(1); });

PoC patch: persistent shell for AgentOs.exec()

This monkey-patch replaces AgentOs.exec() to lazily initialize a persistent WASM shell via openShell() and route subsequent commands through stdin with delimiter-based output boundary detection.

apply-patch.cjs — patches the compiled agent-os-core dist
#!/usr/bin/env node
const fs = require("fs");

const file = process.argv[2];
if (!file) { console.error("Usage: node apply-patch.cjs <agent-os.js path>"); process.exit(1); }

let src = fs.readFileSync(file, "utf8");
if (src.includes("__persistentShell")) { console.log("[patch] already patched"); process.exit(0); }

const originalExec = `    async exec(command, options) {
        return this.kernel.exec(command, options);
    }`;

const patchedExec = `    async exec(command, options) {
        // --- Persistent shell patch ---
        if (!this.__persistentShell) {
            this.__persistentShell = this._initPersistentShell();
        }
        const shell = await this.__persistentShell;
        if (shell) {
            return shell.exec(command, options);
        }
        return this.kernel.exec(command, options);
    }
    _initPersistentShell() {
        const self = this;
        return new Promise((resolveInit) => {
            try {
                const { shellId } = self.openShell();
                let initialized = false;
                let buffer = '';
                let cmdCounter = 0;
                let pendingResolve = null;
                let pendingDelimId = '';
                const decoder = new TextDecoder();

                const initDelim = '__AOSINIT_' + Date.now() + '__';
                let initBuffer = '';

                self.onShellData(shellId, (data) => {
                    const chunk = decoder.decode(data);

                    if (!initialized) {
                        initBuffer += chunk;
                        if (initBuffer.includes(initDelim)) {
                            initialized = true;
                            initBuffer = '';
                            resolveInit(shellApi);
                        }
                        return;
                    }

                    if (!pendingResolve) return;
                    buffer += chunk;

                    const delimRe = new RegExp('___AOSDELIM_' + pendingDelimId + '_(\\\\d+)___');
                    const m = buffer.match(delimRe);
                    if (m) {
                        const exitCode = parseInt(m[1], 10);
                        const raw = buffer.substring(0, m.index);
                        const lines = raw.split('\\n');
                        const cleaned = lines.filter(l =>
                            !l.includes('___AOSDELIM_') &&
                            !l.includes('printf ') &&
                            !l.includes('\\\\x00')
                        );
                        if (cleaned.length > 0 && cleaned[0].trim().startsWith('{')) {
                            cleaned.shift();
                        }
                        const stdout = cleaned.join('\\n')
                            .replace(/^\\n+/, '')
                            .replace(/\\n+$/, '');

                        buffer = buffer.substring(m.index + m[0].length);
                        const resolve = pendingResolve;
                        pendingResolve = null;
                        pendingDelimId = '';
                        resolve({
                            exitCode,
                            stdout: stdout ? stdout + '\\n' : '',
                            stderr: '',
                        });
                    }
                });

                const shellApi = {
                    exec: (cmd, opts) => {
                        if (opts && (opts.onStdout || opts.onStderr)) {
                            return self.kernel.exec(cmd, opts);
                        }
                        return new Promise((resolve) => {
                            cmdCounter++;
                            const delimId = 'c' + cmdCounter;
                            buffer = '';
                            pendingResolve = resolve;
                            pendingDelimId = delimId;
                            const payload = cmd + '\\nprintf "___AOSDELIM_' + delimId + '_%d___\\\\n" "$?"\\n';
                            self.writeShell(shellId, payload);
                        });
                    },
                };

                setTimeout(() => {
                    self.writeShell(shellId, 'echo ' + initDelim + '\\n');
                }, 50);

                setTimeout(() => {
                    if (!initialized) { initialized = true; resolveInit(null); }
                }, 15000);

            } catch (_e) { resolveInit(null); }
        });
    }`;

if (src.includes(originalExec)) {
  src = src.replace(originalExec, patchedExec);
} else {
  const pat = /    async exec\(command, options\) \{\n        return this\.kernel\.exec\(command, options\);\n    \}/;
  if (pat.test(src)) { src = src.replace(pat, patchedExec); }
  else { console.error("[patch] Could not find exec() to patch"); process.exit(1); }
}

fs.writeFileSync(file, src);
console.log("[patch] Patched exec() → persistent shell");
patch-exec-persistent-shell.sh — applies the patch to installed node_modules
#!/usr/bin/env bash
set -euo pipefail

DIST_FILE="$(find node_modules -path '*/@rivet-dev/agent-os-core/dist/agent-os.js' -type f | head -1)"

if [ -z "$DIST_FILE" ]; then
  echo "[patch] agent-os-core dist not found, skipping"; exit 0
fi
if grep -q '__persistentShell' "$DIST_FILE"; then
  echo "[patch] already patched, skipping"; exit 0
fi

echo "[patch] Patching $DIST_FILE ..."
node scripts/apply-patch.cjs "$DIST_FILE"

Results with 500ms simulated spawn delay:

| Metric | Before (original) | After (persistent shell) | Speedup |
|---|---|---|---|
| echo ×15 | 8,194 ms | 269 ms | 30× |
| ls /workspace | 581 ms | 45 ms | 13× |
| pwd | 548 ms | 18 ms | 30× |

The persistent shell bypasses kernel.exec() after initialization, so the injected delay only affects the first spawn. This models environments where the bottleneck is in the spawn path (process creation + WASM binary I/O + JIT compile).

Note on the PoC patch

This patch is a proof-of-concept to validate the approach. Known limitations:

  • Output parsing heuristics (filtering lines containing printf, stripping lines starting with {) are fragile and could mishandle legitimate output
  • No concurrent command support (single pendingResolve)
  • stderr is always empty (not captured separately from the persistent shell)
  • A production implementation should live in the kernel layer, not as a monkey-patch
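On the stderr limitation specifically: one possible direction (a sketch only — it assumes the WASM image ships sed and that the shell supports standard POSIX fd duplication, neither of which is verified here) is to tag stderr lines inside the shell with the idiom `{ cmd 2>&1 1>&3 | sed 's/^/__ERR__ /'; } 3>&1`, then demultiplex the merged stream on the host side:

```typescript
// Hypothetical helper: split a merged stream back into stdout/stderr,
// assuming the shell prefixed every stderr line with "__ERR__ ".
function demuxOutput(raw: string): { stdout: string; stderr: string } {
  const TAG = "__ERR__ ";
  const out: string[] = [];
  const err: string[] = [];
  for (const line of raw.split("\n")) {
    if (line.startsWith(TAG)) err.push(line.slice(TAG.length));
    else out.push(line);
  }
  return { stdout: out.join("\n"), stderr: err.join("\n") };
}
```

Note the trade-off: relative ordering between stdout and stderr lines is not preserved once the streams are merged and re-split, so this only addresses capture, not interleaving.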
