- primary accent color: #38BDF8 (bright sky blue), light variant: #7DD3FC
- secondary accent color: #CC0000 (red), light variant: #FF3333
- website: https://secureexec.dev
- docs: https://secureexec.dev/docs
- GitHub: https://github.com/rivet-dev/secure-exec
- GitHub org is `rivet-dev`; NEVER use `anthropics` or any other org in GitHub URLs for this repo
- the docs slug for Node.js compatibility is `nodejs-compatibility` (not `node-compatability` or other variants)
- every publishable package must include a `README.md` with the standard format: title, tagline, and links to website, docs, and GitHub
- if `package.json` has a `"files"` array, `"README.md"` must be listed in it
- no hardcoded monorepo/pnpm paths: NEVER resolve dependencies at runtime using hardcoded relative paths into `node_modules/.pnpm/` or monorepo-relative `../../../node_modules/` walks; use `createRequire(import.meta.url).resolve("pkg/path")` or standard Node module resolution instead
- no phantom transitive dependencies: if published runtime code calls `require.resolve("foo")` or `import("foo")`, `foo` MUST be declared in that package's `dependencies` (not just available transitively in the monorepo)
- `files` array must cover all runtime references: if compiled `dist/` code resolves paths outside `dist/` at runtime (e.g., `../src/polyfills/`), those directories MUST be listed in the `"files"` array; verify with `pnpm pack --json` or `npm pack --dry-run` before publishing
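
The packaging rules above can be checked mechanically. The following is a minimal sketch, not the repo's actual tooling: `PackageManifest`, `checkManifest`, and the `runtimeSpecifiers` input (the bare specifiers that published runtime code resolves) are hypothetical names introduced only for illustration.

```typescript
// Hypothetical manifest shape; only the fields this check reads.
interface PackageManifest {
  name: string;
  files?: string[];
  dependencies?: Record<string, string>;
}

// Returns human-readable problems; an empty array means the manifest passes.
// `runtimeSpecifiers` is an assumed input: specifiers that published runtime
// code passes to require.resolve() or import().
function checkManifest(pkg: PackageManifest, runtimeSpecifiers: string[]): string[] {
  const problems: string[] = [];

  // If a "files" allowlist exists, README.md has to be in it.
  if (pkg.files !== undefined && !pkg.files.includes("README.md")) {
    problems.push(`${pkg.name}: "files" is missing "README.md"`);
  }

  // Every runtime specifier must map to a declared dependency.
  const declared = new Set(Object.keys(pkg.dependencies ?? {}));
  for (const specifier of runtimeSpecifiers) {
    const parts = specifier.split("/");
    // Scoped packages keep two segments ("@scope/name"); others keep one.
    const packageName = specifier.startsWith("@")
      ? parts.slice(0, 2).join("/")
      : parts[0] ?? specifier;
    if (!declared.has(packageName)) {
      problems.push(`${pkg.name}: "${packageName}" is used at runtime but not in dependencies`);
    }
  }
  return problems;
}
```

A check like this only covers the manifest; `pnpm pack --json` or `npm pack --dry-run` is still needed to confirm what actually lands in the tarball.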
- NEVER mock external services in tests — use real implementations (Docker containers for databases/services, real HTTP servers for network tests, real binaries for CLI tool tests)
- tests that validate sandbox behavior MUST run code through the secure-exec sandbox (NodeRuntime/proc.exec()), never directly on the host
- CLI tool tests (Pi, Claude Code, OpenCode) must execute inside the sandbox: Pi runs as JS in the VM, Claude Code and OpenCode spawn their binaries via the sandbox's child_process.spawn bridge
- for host-binary CLI/SDK regressions (Claude Code, OpenCode), pair the sandbox `child_process.spawn()` probe with a direct `kernel.spawn()` control for the same binary command; if direct kernel command routing works but sandboxed spawn hangs, the blocker is in the Node child_process bridge path rather than the tool binary, provider config, or HostBinaryDriver mount
- sandbox `child_process.spawn()` does not yet honor `stdio` option semantics for host-binary commands, so headless CLI tests that need EOF on stdin should explicitly call `child.stdin.end()` instead of assuming `stdio: ['ignore', ...]` will close it
- real-provider CLI/SDK tool-integration tests must stay opt-in via an explicit env flag and load credentials at runtime from exported env vars or `~/misc/env.txt`; never commit secrets or replace the live provider path with a mock redirect when the story requires real traffic
- real-provider NodeRuntime CLI/tool tests that need a mutable temp worktree must pair `moduleAccess` with a real host-backed base filesystem such as `new NodeFileSystem()`; `moduleAccess` alone makes projected packages readable but leaves sandbox tools unable to touch `/tmp` working files
- e2e-docker fixtures connect to real Docker containers (Postgres, MySQL, Redis, SSH/SFTP); skip gracefully via `skipUnlessDocker()` when Docker is unavailable
- interactive/PTY tests must use `kernel.openShell()` with `@xterm/headless`, not host PTY via `script -qefc`
- before fixing a reported runtime, CLI, SDK, or PTY bug, first reproduce the broken state and capture the exact visible output (stdout, stderr, event payloads, or terminal screen) in a regression or work note; do not start by guessing at the fix
- terminal-output and PTY-rendering bugs must use snapshot-style assertions against exact strings or exact screen contents under fixed rows/cols, not loose substring checks
- if expected terminal behavior is unclear, run the same flow on the host as a control and compare the sandbox transcript/screen against that host output before deciding what to fix
- be liberal with structured debug logging for complex interactive or long-running sessions so later manual repros can be diagnosed from artifacts instead of memory
- debug logging for complex sessions should go to a separate sink that does not contaminate stdout/stderr protocol output; prefer structured `pino` logs with enough context to reconstruct process lifecycle, PTY events, command routing, and failures, while redacting secrets
- kernel blocking-I/O regressions should be proven through `packages/core/test/kernel/kernel-integration.test.ts` using real process-owned FDs via `KernelInterface` (`fdWrite`, `flock`, `fdPollWait`) rather than only manager-level unit tests
- inode-lifetime/deferred-unlink kernel integration tests must use `InMemoryFileSystem` (or another inode-aware VFS) and await the kernel's POSIX-dir bootstrap; the default `createTestKernel()` `TestFileSystem` does not exercise inode-backed FD lifetime semantics
- kernel signal-handler regressions should use a real spawned PID plus `KernelInterface.processTable` / `KernelInterface.socketTable`; unit `ProcessTable` coverage alone does not prove pending delivery or `SA_RESTART` behavior through the live kernel
- socket-table unit tests that call `listen()` or other host-visible network operations must provide an explicit `networkCheck` fixture; bare `new SocketTable()` now models deny-by-default networking and will reject listener setup with `EACCES`
- kernel UDP transport stories should include a real `packages/secure-exec/tests/kernel/` case that builds a `createKernel()` instance with `createNodeHostNetworkAdapter()` and real `node:dgram` peers; socket-table unit tests alone do not prove host-backed datagram routing
- socket option/flag stories should pair `packages/core/test/kernel/` coverage with a real `packages/secure-exec/tests/kernel/` case across TCP, AF_UNIX, and UDP; when proving host-backed option replay, wrap `createNodeHostNetworkAdapter()` and record `HostSocket.setOption()` calls instead of relying on public `@secure-exec/core` exports for `SOL_SOCKET`/`TCP_NODELAY`/`MSG_*` constants
- `/proc/self` coverage must run through a process-scoped runtime such as `kernel.spawn('node', ...)` or `createProcessScopedFileSystem`; raw kernel `vfs` calls have no caller PID context and cannot prove live `/proc/self` behavior
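
The rule about snapshot-style terminal assertions can be illustrated with a small sketch. `normalizeScreen` and `assertScreenEquals` are hypothetical helpers, not part of `@xterm/headless` or this repo; they assume the screen has already been read out as an array of row strings.

```typescript
// Pads or truncates rows to a fixed rows x cols grid so comparisons are exact.
function normalizeScreen(lines: string[], rows: number, cols: number): string {
  const grid: string[] = [];
  for (let row = 0; row < rows; row++) {
    const line = lines[row] ?? "";
    grid.push(line.slice(0, cols).padEnd(cols, " "));
  }
  return grid.join("\n");
}

// Throws with both grids in the message when the screens differ anywhere.
function assertScreenEquals(
  actual: string[],
  expected: string[],
  rows: number,
  cols: number,
): void {
  const a = normalizeScreen(actual, rows, cols);
  const e = normalizeScreen(expected, rows, cols);
  if (a !== e) {
    throw new Error(`screen mismatch\n--- expected\n${e}\n--- actual\n${a}`);
  }
}

// A doubled echo is a rendering bug that a loose substring check lets through.
const expectedScreen = ["$ echo hi", "hi"];
const brokenScreen = ["$ echo hi", "hihi"];
const looseCheckPasses = brokenScreen.join("\n").includes("hi");
```

Here `looseCheckPasses` is `true` for the broken screen, while `assertScreenEquals(brokenScreen, expectedScreen, 4, 20)` throws, which is the behavior the rule asks for.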
- no test-only workarounds: if a C override fixes broken libc behavior (fcntl, realloc, strfmon, etc.), it MUST go in the patched sysroot in `~/agent-os-registry/` so all WASM programs get the fix; never link overrides only into test binaries, because that inflates conformance numbers while real users still hit the bug
- never replace upstream test source files: if an os-test `.c` file fails due to a platform difference (e.g. `sizeof(long)`), exclude it via `os-test-exclusions.json` with the real reason; do not swap in a rewritten version that changes what the test validates
- kernel behavior belongs in the kernel, not the test runner: if a test requires runtime state (POSIX directories like `/tmp`, `/usr`, device nodes, etc.), implement it in the kernel/device-layer so all users get it; the test runner should not create kernel state that real users won't have
- no suite-specific VFS special-casing: the test runner must not branch on suite name to inject different filesystem state; if a test needs files to exist, either the kernel should provide them or the test should be excluded
- categorize exclusions honestly: if a failure is fixable with a patch or build flag, it's `implementation-gap`, not `wasm-limitation`; reserve `wasm-limitation` for things genuinely impossible in wasm32-wasip1 (no 80-bit long double, no fork, no mmap)
- conformance tests live in `packages/secure-exec/tests/node-conformance/`; they are vendored upstream Node.js v22.14.0 test/parallel/ tests run through the sandbox
- vendored Node conformance helper shims live in `packages/secure-exec/tests/node-conformance/common/`; if a WPT-derived vendored test fails on a missing `../common/*` helper, add the minimal harness/shim there instead of rewriting the vendored test file
- `docs-internal/nodejs-compat-roadmap.md` tracks every non-passing test with its fix category and resolution
- when implementing bridge/polyfill features where both sides go through our code (e.g., loopback HTTP server + client), prevent overfitting:
  - wire-level snapshot tests: capture raw protocol bytes and compare against known-good captures from real Node.js
  - project-matrix cross-validation: add a project-matrix fixture (`tests/projects/`) using a real npm package that exercises the feature; the matrix compares sandbox output to host Node.js
  - real-server control tests: for network features, maintain tests that hit real external endpoints (not loopback) to validate the client independently of the server
  - mismatch-preserving verification: if the control path currently fails, keep the host-vs-sandbox check and assert the concrete mismatch (`stderr`, exit code, missing bytes) instead of deleting the test or replacing it with another same-code-path loopback check
  - known-test-vector validation: for crypto, validate against NIST/RFC test vectors, not just round-trip verification
  - error object snapshot testing: for ERR_* codes, snapshot-test full error objects (code, message, constructor) against Node.js, not just check that `.code` exists
  - host-side assertion verification: periodically run assert-heavy conformance tests through host Node.js to verify the assert polyfill isn't masking failures
- for kernel-consolidation stories, tests that instantiate `createDefaultNetworkAdapter()` or `useDefaultNetwork` are legacy compatibility coverage only; completion claims must be backed by `createNodeRuntime()` mounted into a real `Kernel`
- never inflate conformance numbers: if a test self-skips (exits 0 without testing anything), mark it `vacuous-skip` in expectations.json, not as a real pass
- reserve `category: "vacuous-skip"` for `expected: "pass"` self-skips only; if a vendored file stays `expected: "skip"` because behavior is still broken, keep a real failure category like `implementation-gap` so report category totals stay honest
- every entry in `expectations.json` must have a specific, verifiable reason; no vague "fails in sandbox" reasons
- every non-pass conformance expectation must also resolve to exactly one implementation-intent bucket (`implementable`, `will-not-implement`, or `cannot-implement`) via the shared classifier in `packages/secure-exec/tests/node-conformance/expectation-utils.ts`; keep the generated report aligned with that breakdown
- when rerunning a single expected-fail conformance file through `runner.test.ts`, a green Vitest result only means the expectation still matches; only the explicit `now passes! Remove its expectation` failure proves the vendored test itself now passes and the entry is stale
- before deleting explicit `pass` overrides behind a negated glob, rerun the exact promoted vendored files through a direct `createTestNodeRuntime()` harness or another no-expectation path; broad module cleanup can still hide stale passes
- after changing expectations.json or adding/removing test files, regenerate both the JSON report and docs page: `pnpm tsx scripts/generate-node-conformance-report.ts`
- the script produces `packages/secure-exec/tests/node-conformance/conformance-report.json` (machine-readable) and `docs/nodejs-conformance-report.mdx` (docs page); commit both
- to run the actual conformance suite: `pnpm vitest run packages/secure-exec/tests/node-conformance/runner.test.ts`
- raw `net.connect()` traffic to sandbox `http.createServer()` is implemented entirely in `packages/nodejs/src/bridge/network.ts`; when fixing loopback HTTP behavior, re-run the vendored pipeline/transfer files (`test-http-get-pipeline-problem.js`, `test-http-pipeline-requests-connection-leak.js`, `test-http-transfer-encoding-*.js`, `test-http-chunked-304.js`) because they all exercise the same parser/serializer path
- for callback-style `fs` bridge methods, do Node-style argument validation before entering the callback/error-delivery wrapper; otherwise invalid args that should throw synchronously get converted into callback errors or Promise returns and vendored fs validation coverage goes red
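
The last rule above (validate before entering the callback wrapper) can be sketched as follows. This is an illustration under stated assumptions, not the bridge's real code: `bridgeReadFile` is a hypothetical stand-in for the host round trip, the error message is simplified, and real Node also accepts `Buffer` and `URL` paths.

```typescript
type Callback<T> = (err: Error | null, value?: T) => void;

// Simplified stand-in for Node's ERR_INVALID_ARG_TYPE error shape.
function invalidArgType(name: string, expected: string, actual: unknown) {
  return Object.assign(
    new TypeError(`The "${name}" argument must be of type ${expected}. Received ${typeof actual}`),
    { code: "ERR_INVALID_ARG_TYPE" },
  );
}

// Hypothetical bridge call standing in for the real host round trip.
async function bridgeReadFile(path: string): Promise<string> {
  if (path === "/missing") {
    throw Object.assign(new Error("ENOENT: no such file or directory"), { code: "ENOENT" });
  }
  return `contents of ${path}`;
}

function readFile(path: unknown, callback: unknown): void {
  // 1. Synchronous validation: these throw to the caller and never reach the callback.
  if (typeof callback !== "function") throw invalidArgType("cb", "function", callback);
  if (typeof path !== "string") throw invalidArgType("path", "string", path);

  // 2. Only operational errors travel through the callback/error-delivery wrapper.
  const cb = callback as Callback<string>;
  bridgeReadFile(path).then(
    (value) => cb(null, value),
    (err: Error) => cb(err),
  );
}
```

With this ordering, `readFile(123, cb)` throws a `TypeError` immediately, while `readFile("/missing", cb)` reports `ENOENT` through `cb`.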
- use pnpm, vitest, and tsc for type checks
- use turbo for builds
- after changing `packages/core/isolate-runtime/src/inject/require-setup.ts` or Node bridge code that regenerates the isolate bundle, rebuild in this order: `pnpm --filter @secure-exec/nodejs build` then `pnpm --filter @secure-exec/core build`; the conformance runner executes built `dist` output, not just source files
- keep timeouts under 1 minute and avoid running full test suites unless necessary
- use one-line Conventional Commit messages; never add any co-authors (including agents)
- never mark work complete until typechecks pass and all tests pass in the current turn; if they fail, report the failing command and first concrete error
- always add or update tests that cover plausible exploit/abuse paths introduced by each feature or behavior change
- treat host memory buildup and CPU amplification as critical risks; avoid unbounded buffering/work (for example, default in-memory log buffering)
- check GitHub Actions test/typecheck status per commit to identify when a failure first appeared
- do not use `contract` in test filenames; use names like `suite`, `behavior`, `parity`, `integration`, or `policy` instead
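
As one way to read the rule about unbounded buffering, a log buffer can cap its own size and evict old entries. `BoundedLogBuffer` is a hypothetical class written for illustration; it counts UTF-16 code units rather than bytes to stay dependency-free.

```typescript
// Keeps at most `maxChars` characters of log lines, evicting the oldest first.
class BoundedLogBuffer {
  private readonly lines: string[] = [];
  private readonly maxChars: number;
  private chars = 0;
  private droppedCount = 0;

  constructor(maxChars: number) {
    this.maxChars = maxChars;
  }

  push(line: string): void {
    // A single oversized line is truncated instead of being allowed past the cap.
    const entry = line.length > this.maxChars ? line.slice(0, this.maxChars) : line;
    this.lines.push(entry);
    this.chars += entry.length;
    // Evict oldest entries until the buffer fits again.
    while (this.chars > this.maxChars) {
      const oldest = this.lines.shift();
      if (oldest === undefined) break;
      this.chars -= oldest.length;
      this.droppedCount++;
    }
  }

  snapshot(): string[] {
    return [...this.lines];
  }

  get dropped(): number {
    return this.droppedCount;
  }
}
```

Tracking `dropped` keeps the loss visible, so a test can assert that eviction happened instead of memory growing without limit.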
- `packages/dev-shell/` is the canonical interactive sandbox for manual validation of the runtime surface
- VERY IMPORTANT: the dev shell must never use host-backed command overrides or host-binary fallbacks for manual validation; if a tool is present there, it must run through the sandbox-native runtime path
- if a tested tool does not yet have a real sandbox-native path, leave it unavailable in the dev shell and track the gap instead of silently routing to the host
- when adding a new tested CLI tool or runtime surface, update `packages/dev-shell/` in the same change so developers can reproduce and inspect it interactively inside the sandbox
- keep the dev shell honest with focused end-to-end coverage, including at least one interactive PTY/TUI path that runs entirely inside the sandbox
- when fixing a bug or implementation gap tracked by a GitHub issue, close the issue in the same PR using `gh issue close <number> --comment "Fixed in <commit-hash>"`
- when removing a test from `os-test-exclusions.json` or `libc-test-exclusions.json` because the fix landed, close the linked issue
- do not leave resolved issues open; verify with `gh issue view <number>` if unsure
- NEVER implement a from-scratch reimplementation of a tool when the PRD specifies using an existing upstream project (e.g., codex, curl, git, make)
- always fork, vendor, or depend on the real upstream source — do not build a "stub" or "demo" binary that fakes the tool's behavior
- if the upstream cannot compile or link for the target, document the specific blockers and leave the story as failing — do not mark it passing with a placeholder
- the PRD and story notes define which upstream project to use; follow them exactly unless explicitly told otherwise
- NEVER commit third-party C library source code directly into this repo
- modified libraries (e.g., libcurl with WASI patches) must live in a fork under the `rivet-dev` GitHub org (e.g., `rivet-dev/secure-exec-curl`)
- all WASM command source code, C programs, and sysroot builds now live in `~/agent-os-registry/` (GitHub: `rivet-dev/agent-os-registry`)
- existing forks: `rivet-dev/secure-exec-curl` (libcurl with `wasi_tls.c` and `wasi_stubs.c`)
- the goal for WasmVM is full POSIX compliance 1:1 — every command, syscall, and shell behavior should match a real Linux system exactly
- WasmVM and Python are experimental surfaces in this repo
- all docs for WasmVM, Python, or other experimental runtime features must live under the `Experimental` section of the docs navigation, not the main getting-started/reference sections
- the `native/wasmvm/` directory has been deleted from this repo. All WASM command source code (Rust crates, C programs, WASI host import definitions, patches, and the C sysroot build) now lives in `~/agent-os-registry/` (GitHub: `rivet-dev/agent-os-registry`). Build from the registry: `cd ~/agent-os-registry && make build-wasm`.
- the WasmVM runtime driver (`packages/wasmvm/`) still lives in this repo. It loads and executes WASM binaries but does not contain command source code.
- tests gated behind `skipIf(!hasWasmBinaries)` or `skipUnlessWasmBuilt()` will skip locally if binaries aren't built
- the WASM command source code (including `wasi-ext`, C programs, patches, and Makefiles) now lives in `~/agent-os-registry/` (GitHub: `rivet-dev/agent-os-registry`)
- every function in the `host_process` and `host_user` import modules (declared in `wasi-ext` in the registry) must have at least one C parity test exercising it through libc
- when adding a new host import, add a matching test case to the registry's syscall_coverage.c and its parity test in `packages/wasmvm/test/c-parity.test.ts`
- the canonical source of truth for import signatures is `wasi-ext/src/lib.rs` in the registry. C patches and JS host implementations must match exactly.
- C patches in the registry's `patches/wasi-libc/` must be kept in sync with wasi-ext. ABI drift between C, Rust, and JS is a P0 bug.
- permission tier enforcement must cover ALL write/spawn/kill/pipe/dup operations; audit `packages/wasmvm/src/kernel-worker.ts` when adding new syscalls
- `PATCHED_PROGRAMS` in the registry's C Makefile must include all programs that use `host_process` or `host_user` imports (programs linking the patched sysroot)
- WasmVM `host_net` socket option payloads cross the worker RPC boundary as little-endian byte buffers; decode/encode them in `packages/wasmvm/src/driver.ts` and keep `packages/wasmvm/src/kernel-worker.ts` as a thin memory marshal layer
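
The little-endian payload rule in the last item can be sketched with `DataView`. This assumes the option value is a 32-bit signed integer (as with a C `int` passed to `setsockopt`); `encodeInt32LE` and `decodeInt32LE` are hypothetical helper names, not functions from `driver.ts`.

```typescript
// Encodes a 32-bit signed integer option value as a 4-byte little-endian payload.
function encodeInt32LE(value: number): Uint8Array {
  const bytes = new Uint8Array(4);
  new DataView(bytes.buffer).setInt32(0, value, true); // true = little-endian
  return bytes;
}

// Decodes the first 4 bytes of a payload as a little-endian signed integer.
function decodeInt32LE(payload: Uint8Array): number {
  if (payload.byteLength < 4) {
    throw new RangeError(`expected at least 4 bytes, got ${payload.byteLength}`);
  }
  // Respect byteOffset: the payload may be a view into a larger buffer.
  const view = new DataView(payload.buffer, payload.byteOffset, payload.byteLength);
  return view.getInt32(0, true);
}
```

Passing the explicit `true` flag on both sides keeps the wire format independent of host endianness, and honoring `byteOffset` matters when the payload is a subarray of worker memory.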
- use `docs-internal/glossary.md` for canonical definitions of isolate, runtime, bridge, and driver
- read `docs-internal/arch/overview.md` for the component map (NodeRuntime, RuntimeDriver, NodeDriver, NodeExecutionDriver, ModuleAccessFileSystem, Permissions)
- keep it up to date when adding, removing, or significantly changing components
- keep host bootstrap polyfills in `packages/nodejs/src/execution-driver.ts` aligned with isolate bootstrap polyfills in `packages/core/isolate-runtime/src/inject/require-setup.ts`; drift in shared globals like `AbortController` causes sandbox-only behavior gaps that source-level tests can miss
- WHATWG globals that sandbox code touches before any bridge module loads (`TextDecoder`, `TextEncoder`, `Event`, `CustomEvent`, `EventTarget`) must be fixed in both bootstrap layers and `packages/nodejs/src/bridge/polyfills.ts`; bridge-only fixes do not change the globals seen by direct `runtime.run()`/`runtime.exec()` code
- bridged `fetch()` request serialization must normalize `Headers` instances before crossing the JSON bridge; passing the host a raw `Headers` object silently drops auth and SDK-specific headers because it stringifies to `{}`
- sandbox stdout/stderr write bridges must preserve Node's callback semantics even for empty writes like `process.stdout.write('', cb)`; headless CLI tools use that zero-byte callback as a flush barrier before clean exit
- exec-mode scripts that depend on bridge-delivered child-process/stdio callbacks must keep the same `Execute` alive on `_waitForActiveHandles()`; once the native V8 session returns from `Execute`, later `StreamEvent` messages sent to that idle session thread are ignored
- when a builtin or `internal/*` module needs sandbox-specific behavior but still has to work through CommonJS `require()`, add it under `packages/nodejs/src/polyfills/` and register it in `packages/nodejs/src/polyfills.ts` `CUSTOM_POLYFILL_ENTRY_POINTS`; that keeps esbuild bundling it to CJS instead of letting the isolate loader choke on raw ESM `export` syntax
- vendored fs abort tests deep-freeze option bags via `common.mustNotMutateObjectDeep()`, so sandbox `AbortSignal` state must live outside writable instance properties; freezing `{ signal }` must not break later `controller.abort()`
- vendored `common.mustNotMutateObjectDeep()` helpers must skip populated typed-array/DataView instances; `Object.freeze(new Uint8Array([1]))` throws before the runtime under test executes, which turns option-bag immutability coverage into a harness failure
- when adding bridge globals that the sandbox calls with `.apply(..., { result: { promise: true } })`, register them in the native V8 async bridge list in `native/v8-runtime/src/session.rs`; otherwise the `_loadPolyfill` shim can turn a supposed async wait into a synchronous deadlock
- bridged `net.Server.listen()` must make `server.address()` readable immediately after `listen()` returns, even before the `'listening'` callback, because vendored Node tests read ephemeral ports synchronously
- bridged Unix path sockets (`server.listen(path)`, `net.connect(path)`) must route through kernel `AF_UNIX`, not TCP validation, and `readableAll`/`writableAll` listener options must update the VFS socket-file mode bits that `fs.statSync()` observes
- bridged `net.Socket.setTimeout()` must match Node validation codes (`ERR_INVALID_ARG_TYPE`, `ERR_OUT_OF_RANGE`) and any timeout timer created for an unrefed socket must also be unrefed so it cannot keep the runtime alive by itself
- standalone `NodeExecutionDriver` should always provision an internal `SocketTable` for loopback routing, but it must only attach `createNodeHostNetworkAdapter()` when `SystemDriver.network` is explicitly configured; omitted network capability must not silently re-enable host TCP access
- bridged `dgram.Socket` loopback semantics depend on both layers: the isolate bridge must implicitly bind unbound sender sockets before `send()`, and the kernel UDP path must rewrite wildcard local addresses (`0.0.0.0`/`::`) to concrete loopback source addresses so `rinfo.address` matches Node on self-send/echo tests
- bridged `dgram.Socket` buffer-size options must be cached until `bind()` completes; Node expects unbound `get*BufferSize()`/`set*BufferSize()` calls to throw `ERR_SOCKET_BUFFER_SIZE` with `EBADF`, so eager pre-bind application hides the real error path
- `packages/wasmvm/src/driver.ts` prefers `packages/wasmvm/dist/kernel-worker.js` when no sibling `src/kernel-worker.js` exists, so edits to `packages/wasmvm/src/kernel-worker.ts` are not authoritative until `pnpm --filter @secure-exec/wasmvm build` refreshes the worker bundle
- bridged `http2` server streams must start paused on the host and only resume when sandbox code opts into flow (`req.on('data')`, `req.resume()`, or `stream.resume()`); otherwise the host consumes DATA frames too early, sends WINDOW_UPDATE unexpectedly, and hides paused flow-control / pipeline regressions
- vendored `http2` nghttp2 error-path tests patch `internal/test/binding` `Http2Stream.prototype.respond`; keep that shim wired to the same bridge-facing `Http2Stream` / `internal/http2/util.NghttpError` constructors the runtime uses, or the tests stop exercising the real wrapper logic
- bridge exports that userland constructs with `new` must be assigned as constructable function properties, not object-literal method shorthands; shorthand methods like `createReadStream() {}` are not constructable and vendored fs coverage calls `new fs.createReadStream(...)`
- `/proc/sys/kernel/hostname` conformance hits both kernel-backed and standalone NodeRuntime paths; a procfs fix that only lands in the kernel layer still leaves `createTestNodeRuntime()` fs/FileHandle coverage red
- require-transformed ESM must not rely on the CommonJS wrapper's `__filename`/`__dirname` parameter names; keep wrapper internals on private names, synthesize local CJS bindings only for plain CommonJS sources, and compute transformed `import.meta.url` from `pathToFileURL(__secureExecFilename).href`
- `ModuleAccessFileSystem` must treat host-absolute package asset paths derived from `import.meta.url`, `__filename`, or `realpath()` as part of the same read-only projected `node_modules` closure when they canonicalize inside the configured overlay; Pi and similar SDKs walk to sibling `package.json`/README/theme assets that way
- `ModuleAccessFileSystem` also has to include pnpm virtual-store dependency symlink targets reachable from projected packages; package-internal `imports` like Chalk's `#ansi-styles` resolve into those sibling `.pnpm/*/node_modules/*` targets rather than staying under the top-level package root
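
The `Headers` rule above is easy to reproduce outside the sandbox. The sketch below uses only the standard WHATWG `Headers` global (available in Node 18+); `normalizeHeaders` is a hypothetical helper name, not the bridge's actual function.

```typescript
type HeaderInput = Headers | Record<string, string> | [string, string][] | undefined;

// Flattens any accepted header input into a plain object that survives JSON.stringify.
function normalizeHeaders(input: HeaderInput): Record<string, string> {
  const out: Record<string, string> = {};
  // The Headers constructor accepts all three shapes and lowercases names.
  new Headers(input).forEach((value, name) => {
    out[name] = value;
  });
  return out;
}

const rawHeaders = new Headers({ Authorization: "Bearer token", "X-Sdk-Version": "1" });

// A raw Headers instance has no enumerable own properties, so the data is lost.
const lossy = JSON.stringify({ headers: rawHeaders });

// The normalized form keeps every header across the JSON boundary.
const safe = JSON.stringify({ headers: normalizeHeaders(rawHeaders) });
```

Normalizing on the sandbox side, before serialization, keeps the host bridge format a plain JSON object regardless of how the caller built its headers.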
- all sandbox I/O routes through the virtual kernel — user code never touches the host OS directly
- the kernel provides: VFS (virtual file system), process table (spawn/signals/exit), network stack (TCP/HTTP/DNS/UDP), and a deny-by-default permissions engine
- network calls are kernel-mediated: `http.createServer()` registers a virtual listener in the kernel's network stack; `http.request()` to localhost routes through the kernel without real TCP, with the kernel connecting virtual server to virtual client directly; external requests go through the host adapter after permission checks
- kernel network deny-by-default is enforced in `packages/core/src/kernel/socket-table.ts`, so `KernelImpl` must pass `options.permissions?.network` into `SocketTable` and external socket paths must call `checkNetworkPermission()` unconditionally; loopback exemptions belong in the routing branch, not in global permission bypasses
- `AF_UNIX` sockets are local IPC, not host networking: `SocketTable` bind/listen/connect for path sockets must stay fully in-kernel, bypass `permissions.network`, and only use the VFS/listener registry for reachability and socket-file state
- kernel-owned `SocketTable` instances must validate owner PIDs against the shared process table at allocation time; only standalone/internal socket tables should omit that validator
- when kernel `bind()` assigns an internal ephemeral port for `port: 0`, preserve that original ephemeral intent on the socket so external host-backed listeners can still call the host adapter with `port: 0` and then rewrite `localAddr` to the real host-assigned port
- the VFS is not the host file system: files written by sandbox code live in the VFS (ChunkedVFS by default); host filesystem is accessible only through explicit read-only overlays (e.g., `node_modules`) configured by the embedder
- the default in-memory VFS is `ChunkedVFS(InMemoryMetadataStore + InMemoryBlockStore)` created via `createInMemoryFileSystem()` from `@secure-exec/core`. The old monolithic `InMemoryFileSystem` class was removed.
- deferred unlink must stay inode-backed: once a pathname is removed, new path lookups must fail immediately, but existing FDs must keep working through `FileDescription.inode` until the last reference closes
- `KernelInterface.fdOpen()` is synchronous, so open-time file semantics (`O_CREAT`, `O_EXCL`, `O_TRUNC`) must go through sync-capable VFS hooks threaded through the device and permission wrappers; do not move those checks into async read/write paths
- embedders provide host adapters that implement actual I/O: a Node.js embedder provides real `fs` and `net`; a browser embedder provides `fetch`-based networking and no file system; sandbox code doesn't know which adapter backs the kernel
- when implementing new I/O features (e.g., UDP, TCP servers, fs.watch), they MUST route through the kernel; never bypass it to hit the host directly
- see `docs/nodejs-compatibility.mdx` for the architecture diagram
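
The deferred-unlink rule in the list above follows the usual POSIX inode model. The toy model below is only a sketch of that model; `TinyVfs` and its methods are hypothetical and much simpler than the kernel's `FileDescription` handling.

```typescript
interface Inode {
  data: string;
  linkCount: number; // number of path names pointing at the inode
  openCount: number; // number of open file descriptors
}

class TinyVfs {
  private readonly paths = new Map<string, number>(); // path -> ino
  private readonly inodes = new Map<number, Inode>(); // ino -> inode
  private readonly fds = new Map<number, number>(); // fd -> ino
  private nextIno = 1;
  private nextFd = 3;

  create(path: string, data: string): void {
    const ino = this.nextIno++;
    this.inodes.set(ino, { data, linkCount: 1, openCount: 0 });
    this.paths.set(path, ino);
  }

  open(path: string): number {
    const ino = this.paths.get(path);
    const inode = ino === undefined ? undefined : this.inodes.get(ino);
    if (ino === undefined || inode === undefined) throw new Error(`ENOENT: ${path}`);
    inode.openCount++;
    const fd = this.nextFd++;
    this.fds.set(fd, ino);
    return fd;
  }

  unlink(path: string): void {
    const ino = this.paths.get(path);
    const inode = ino === undefined ? undefined : this.inodes.get(ino);
    if (ino === undefined || inode === undefined) throw new Error(`ENOENT: ${path}`);
    this.paths.delete(path); // new lookups fail immediately
    inode.linkCount--;
    this.reclaim(ino); // data survives while any FD is still open
  }

  read(fd: number): string {
    const ino = this.fds.get(fd);
    const inode = ino === undefined ? undefined : this.inodes.get(ino);
    if (inode === undefined) throw new Error(`EBADF: ${fd}`);
    return inode.data;
  }

  close(fd: number): void {
    const ino = this.fds.get(fd);
    const inode = ino === undefined ? undefined : this.inodes.get(ino);
    if (ino === undefined || inode === undefined) throw new Error(`EBADF: ${fd}`);
    this.fds.delete(fd);
    inode.openCount--;
    this.reclaim(ino);
  }

  inodeCount(): number {
    return this.inodes.size;
  }

  // Frees the inode only when no names and no open descriptors remain.
  private reclaim(ino: number): void {
    const inode = this.inodes.get(ino);
    if (inode !== undefined && inode.linkCount === 0 && inode.openCount === 0) {
      this.inodes.delete(ino);
    }
  }
}
```

In this model, opening a file, unlinking its path, and then reading through the descriptor still returns the data, while a second `open()` on the same path fails; the inode disappears only after `close()`.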
The VFS uses a layered chunked architecture: VirtualFileSystem (kernel interface) is implemented by ChunkedVFS, which composes FsMetadataStore (directory tree, inodes, chunk mapping) + FsBlockStore (dumb key-value blob store).
- ChunkedVFS (`packages/core/src/vfs/chunked-vfs.ts`): composes a metadata store and block store into a full `VirtualFileSystem`. Created via `createChunkedVfs(options)`.
- Tiered storage: files <= `inlineThreshold` (default 64 KB) are stored inline in the metadata store. Larger files are split into fixed-size chunks (default 4 MB) in the block store. Automatic promotion/demotion when files cross the threshold.
- Per-inode async mutex: prevents interleaved read-modify-write corruption on concurrent async operations (pwrite, writeFile, truncate, removeFile, rename). Read-only ops (pread, readFile, stat) do not acquire the mutex.
- Optional write buffering: when `writeBuffering: true`, pwrite buffers dirty chunks in memory and flushes on fsync or auto-flush interval. Reads always see buffered data.
- Optional versioning: when `versioning: true`, block keys include a random ID to avoid overwrites. Exposes `createVersion`, `listVersions`, `restoreVersion`, `pruneVersions` API.
- Block key format: `{ino}/{chunkIndex}` (or `{ino}/{chunkIndex}/{randomId}` with versioning).
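
The chunk layout and block key format described above imply some simple arithmetic. The sketch below shows one way to map a byte range onto chunks and keys; `blockKey`, `chunkSlices`, and `ChunkSlice` are hypothetical names and may not match the real `chunked-vfs.ts` internals.

```typescript
const DEFAULT_CHUNK_SIZE = 4 * 1024 * 1024; // 4 MB default chunk size from the list above

// Builds a block key in the documented format, with an optional versioning suffix.
function blockKey(ino: number, chunkIndex: number, randomId?: string): string {
  return randomId === undefined
    ? `${ino}/${chunkIndex}`
    : `${ino}/${chunkIndex}/${randomId}`;
}

interface ChunkSlice {
  chunkIndex: number;
  offsetInChunk: number;
  length: number;
}

// Splits the byte range [offset, offset + length) into per-chunk slices.
function chunkSlices(
  offset: number,
  length: number,
  chunkSize: number = DEFAULT_CHUNK_SIZE,
): ChunkSlice[] {
  const slices: ChunkSlice[] = [];
  const end = offset + length;
  let position = offset;
  while (position < end) {
    const chunkIndex = Math.floor(position / chunkSize);
    const offsetInChunk = position % chunkSize;
    // Take up to the end of the current chunk or the end of the range.
    const take = Math.min(chunkSize - offsetInChunk, end - position);
    slices.push({ chunkIndex, offsetInChunk, length: take });
    position += take;
  }
  return slices;
}
```

For example, with a 1024-byte chunk size, a 100-byte write at offset 1000 touches the last 24 bytes of chunk 0 and the first 76 bytes of chunk 1, which is the kind of boundary case small test thresholds are meant to exercise.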
- InMemoryMetadataStore (`packages/core/src/vfs/memory-metadata.ts`): pure JS Map-based. For ephemeral VMs and tests.
- SqliteMetadataStore (`packages/core/src/vfs/sqlite-metadata.ts`): SQLite-backed via `better-sqlite3`. Supports versioning. Constructor accepts `{ dbPath }` where `:memory:` creates an in-memory database.
- InMemoryBlockStore (`packages/core/src/vfs/memory-block-store.ts`): pure JS Map-based.
- HostBlockStore (`packages/core/src/vfs/host-block-store.ts`): persists blocks as files on the host filesystem. For local dev environments.
- S3BlockStore (in agent-os `packages/fs-s3/`): S3-compatible object storage. Server-side copy support.
- The kernel delegates `pwrite` to the VFS interface instead of doing read-modify-write internally.
- The kernel calls `vfs.fsync?.(path)` fire-and-forget in `releaseDescriptionInode` when the last FD for a file is closed.
- All kernel I/O goes through the `VirtualFileSystem` interface only. The old `InMemoryFileSystem`-specific fast paths (`readFileByInode`, `preadByInode`, `writeFileByInode`, `statByInode`) and `rawInMemoryFs` field were removed.
- `fdOpen` is synchronous. Open-time semantics (`O_CREAT`, `O_EXCL`, `O_TRUNC`) use `prepareOpenSync` on the VFS for sync-capable handling.
Three conformance test suites are exported from @secure-exec/core for external VFS implementations to validate against:
- VFS conformance (`packages/core/src/test/vfs-conformance.ts`): tests the full `VirtualFileSystem` contract. Register with `defineVfsConformanceTests({ name, createFs, cleanup, capabilities })`. Capability flags gate optional test groups: `symlinks`, `hardLinks`, `permissions`, `utimes`, `truncate`, `pread`, `pwrite`, `mkdir`, `removeDir`, `fsync`, `copy`, `readDirStat`.
- Block store conformance (`packages/core/src/test/block-store-conformance.ts`): tests the `FsBlockStore` contract. Register with `defineBlockStoreTests({ name, createStore, capabilities })`. Capability flag: `copy`.
- Metadata store conformance (`packages/core/src/test/metadata-store-conformance.ts`): tests the `FsMetadataStore` contract. Register with `defineMetadataStoreTests({ name, createStore, capabilities })`. Capability flag: `versioning`.
Test registration files go in packages/core/test/vfs/. Use small thresholds (e.g., 256 bytes inline, 1024 bytes chunk) for fast edge case tests.
- NEVER use regex-based source code transformation for JavaScript/TypeScript (e.g., converting ESM to CJS, rewriting imports, extracting exports)
- regex transformers break on multi-line syntax, code inside strings/comments/template literals, and edge cases like `import X, { a } from 'Y'`; these bugs are subtle and hard to catch
- instead, use proper tooling: `es-module-lexer`/`cjs-module-lexer` (the same WASM-based lexers Node.js uses), or run the transformation inside the V8 isolate where the JS engine handles parsing correctly
- if a source transformation is needed at the bridge/host level, prefer a battle-tested library over hand-rolled regex
- the V8 runtime already has dual-mode execution (`execute_script` for CJS, `execute_module` for ESM); lean on V8's native module system rather than pre-transforming source on the host side
- existing regex-based transforms (e.g., `convertEsmToCjs`, `transformDynamicImport`, `isESM`) are known technical debt and should be replaced; when `require()` compatibility needs `import.meta.url`, inject an internal file-URL helper instead of rewriting to the wrapper `__filename`
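
To make the failure modes above concrete, here is a deliberately naive regex transformer of the kind these rules forbid. `naiveEsmToCjs` is written for this illustration only and is not the repo's `convertEsmToCjs`.

```typescript
// A naive ESM-to-CJS rewrite: handles only `import name from "spec"`.
function naiveEsmToCjs(source: string): string {
  return source.replace(
    /import\s+(\w+)\s+from\s+['"]([^'"]+)['"];?/g,
    'const $1 = require("$2");',
  );
}

const inputSource = [
  'import fs from "node:fs";',
  // The import-looking text below lives inside a string literal.
  'const help = "usage: import foo from \'bar\' to load it";',
].join("\n");

// Line 1 is rewritten as intended, but the regex also rewrites the text inside
// the string literal on line 2, producing syntactically invalid output.
const transformed = naiveEsmToCjs(inputSource);

// The mixed default + named form is silently left untouched.
const mixedImport = naiveEsmToCjs("import X, { a } from 'Y';");
```

Both failures are silent: the transformer returns a string either way, and the breakage only shows up later as a parse error or a missing binding inside the isolate. A real lexer distinguishes import statements from string contents and reports the forms it finds.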
- `.agent/contracts/` contains behavioral contracts; these are the authoritative source of truth for runtime, bridge, permissions, stdlib, and governance requirements
- ALWAYS read relevant contracts before implementing changes in contracted areas (runtime, bridge, permissions, stdlib, test structure, documentation)
- when a change modifies contracted behavior, update the relevant contract in the same PR so contract changes are reviewed alongside code changes
- for secure-exec runtime behavior, target Node.js semantics as close to 1:1 as practical
- any intentional deviation from Node.js behavior must be explicitly documented in the relevant contract and reflected in compatibility/friction docs
- track development friction in `docs-internal/friction.md` (mark resolved items with fix notes)
- see `.agent/contracts/README.md` for the full contract index
- the interactive shell (brush-shell via WasmVM) and kernel process model must match POSIX behavior unless explicitly documented otherwise
- `node -e <code>` must produce stdout/stderr visible to the user, both through `kernel.exec()` and in the interactive shell PTY, identical to running `node -e` on a real Linux terminal
- `node -e <invalid>` must display the error (SyntaxError/ReferenceError) on stderr, not silently swallow it
- commands that only read stdin when stdin is a TTY (e.g. `tree`, `cat` with no args) must not hang when run from the shell; commands must detect whether stdin is a real data source vs an empty pipe/PTY
- blocking pipe writes must preserve partial progress and wait for new capacity via the kernel wait path; wake blocked writers from both read drains and read-end closes so pipe writes never hang after the consumer disappears
- Ctrl+C (SIGINT) must interrupt the foreground process group within 1 second, matching POSIX `isig` + `VINTR` behavior; this applies to all runtimes (WasmVM, Node, Python)
- PTY bulk writes in raw mode must still apply `icrnl` atomically before buffer-limit checks; oversized writes must fail with `EAGAIN` without partially buffering input
- signal delivery through the PTY line discipline → kernel process table → driver kill() chain must be end-to-end tested
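
One way to read the PTY bulk-write rule is as an all-or-nothing write: translate first, check capacity second, and only then touch the buffer. The sketch below assumes a simple fixed-capacity byte queue. `InputQueue` and `PtyError` are hypothetical names for illustration and do not reflect the kernel's actual types.

```typescript
const CR = 0x0d;
const NL = 0x0a;

// Hypothetical error carrying an errno-style code.
class PtyError extends Error {
  code: string;
  constructor(code: string) {
    super(code);
    this.code = code;
  }
}

// Hypothetical raw-mode input queue with a fixed capacity.
class InputQueue {
  private bytes: number[] = [];
  private capacity: number;
  private icrnl: boolean;

  constructor(capacity: number, icrnl: boolean) {
    this.capacity = capacity;
    this.icrnl = icrnl;
  }

  write(input: Uint8Array): void {
    // Transform the whole write up front so icrnl applies to every byte or none
    const translated = this.icrnl
      ? input.map((b) => (b === CR ? NL : b))
      : input;

    // Check the limit before buffering so an oversized write leaves no partial input
    if (this.bytes.length + translated.length > this.capacity) {
      throw new PtyError("EAGAIN");
    }

    for (let i = 0; i < translated.length; i++) {
      this.bytes.push(translated[i]);
    }
  }

  snapshot(): number[] {
    return this.bytes.slice();
  }
}
```

A caller that receives `EAGAIN` can retry the same write later, knowing none of it was consumed.
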
- when adding or fixing process/signal/PTY behavior, always verify against the equivalent behavior on a real Linux system
- compatibility fixtures live under `packages/secure-exec/tests/projects/` and MUST be black-box Node projects (`package.json` + source entrypoint)
- fixtures MUST stay sandbox-blind: no sandbox-only branches, no sandbox-specific entrypoints, and no runtime tailoring in fixture code
- secure-exec runtime MUST stay fixture-opaque: no behavior branches by fixture name/path/test marker
- the matrix runs each fixture in host Node and secure-exec and compares normalized `code`, `stdout`, and `stderr`
- no known-mismatch classification is allowed; parity mismatches stay failing until runtime/bridge behavior is fixed
- the Tested Packages section in `docs/nodejs-compatibility.mdx` lists all packages validated via the project-matrix test suite
- when adding a new project-matrix fixture, add the package to the Tested Packages table
- when removing a fixture, remove the package from the table
- the table links to GitHub Issues for requesting new packages to be tracked
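
The project-matrix parity check described above (host Node vs secure-exec, compared on normalized `code`, `stdout`, and `stderr`) could look roughly like the sketch below. The `RunResult` shape and the normalization rules (unifying line endings, trimming trailing whitespace) are assumptions made for illustration. The real matrix may normalize differently.

```typescript
// Hypothetical result shape for one fixture run.
interface RunResult {
  code: number;
  stdout: string;
  stderr: string;
}

// Assumed normalization: unify line endings and drop trailing whitespace.
function normalizeText(text: string): string {
  return text.replace(/\r\n/g, "\n").replace(/\s+$/, "");
}

function normalizeResult(result: RunResult): RunResult {
  return {
    code: result.code,
    stdout: normalizeText(result.stdout),
    stderr: normalizeText(result.stderr),
  };
}

// Get the list of fields that differ; an empty list means parity.
function parityMismatches(host: RunResult, sandbox: RunResult): string[] {
  const h = normalizeResult(host);
  const s = normalizeResult(sandbox);
  const mismatches: string[] = [];
  if (h.code !== s.code) mismatches.push("code");
  if (h.stdout !== s.stdout) mismatches.push("stdout");
  if (h.stderr !== s.stderr) mismatches.push("stderr");
  return mismatches;
}
```

There is deliberately no allowlist parameter here: under the no-known-mismatch rule, any non-empty result is a failing test until the runtime or bridge is fixed.
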
- `tests/test-suite/{node,python}.test.ts` are integration suite drivers; `tests/test-suite/{node,python}/` hold the shared suite definitions
- test suites test generic runtime functionality with any pluggable SystemDriver (exec, run, stdio, env, filesystem, network, timeouts, log buffering); prefer adding tests here because they run against all environments (node, browser, python)
- `tests/runtime-driver/` tests behavior specific to a single runtime driver (e.g. Node-only `memoryLimit`/`timingMitigation`, Python-only warm state or `secure_exec` hooks) that cannot be expressed through the shared suite context
- within `test-suite/{node,python}/`, files are named by domain (e.g. `runtime.ts`, `network.ts`)
Follow the style in packages/secure-exec/src/index.ts.
- use short phase comments above logical blocks
- explain intent/why, not obvious mechanics
- keep comments concise and consistent (`Set up`, `Transform`, `Wait for`, `Get`)
- comment tricky ordering/invariants; skip noise
- add inline comments and doc comments when behavior is non-obvious, especially where runtime/bridge/driver pieces depend on each other
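
As a rough illustration of the comment style above, the sketch below uses short phase comments that state intent instead of restating the code. `collectEnvLines` is a hypothetical function invented for this example and is not taken from `packages/secure-exec/src/index.ts`.

```typescript
function collectEnvLines(
  env: Record<string, string | undefined>,
  prefix: string,
): string[] {
  // Set up a sorted key list so output does not depend on insertion order
  const keys = Object.keys(env).sort();

  // Transform matching entries into KEY=value lines; unset values are skipped
  // rather than serialized as the string "undefined"
  const lines: string[] = [];
  for (let i = 0; i < keys.length; i++) {
    const key = keys[i];
    const value = env[key];
    if (key.indexOf(prefix) !== 0 || value === undefined) continue;
    lines.push(key + "=" + value);
  }

  // Get the final lines; callers rely on the sorted order established above
  return lines;
}
```

Each comment marks a phase and records the reason for a choice (stable ordering, skipping unset values) that a reader could not infer from the statement alone.
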
- `README.md` mirrors the landing page (`packages/website/src/pages/index.astro` and its components) 1:1 in content and structure
- when updating the landing page (hero copy, features, benchmarks, comparison, FAQ, or CTA), update `README.md` to match
- when updating `README.md`, update the landing page to match
- the landing page section order is: Hero → Code Example → Why Secure Exec (features) → Benchmarks → Secure Exec vs. Sandboxes → FAQ → CTA → Footer
- all public-facing docs (quickstart, guides, API reference, landing page, README) must focus on the Node.js runtime as the primary and default experience — do not lead with WasmVM, kernel internals, or multi-runtime concepts
- code examples in docs should use the `NodeRuntime` API (`runtime.run()`, `runtime.exec()`) as the default path; the kernel API (`createKernel`, `kernel.spawn()`) is for advanced multi-process use cases and should be presented as secondary
- WasmVM and Python docs are experimental docs and must stay grouped under the `Experimental` section in `docs/docs.json`
- docs pages that must stay current with API changes:
  - `docs/quickstart.mdx`: update when core setup flow changes
  - `docs/api-reference.mdx`: update when any public export signature changes
  - `docs/runtimes/node.mdx`: update when NodeRuntime options/behavior changes
  - `docs/runtimes/python.mdx`: update when PythonRuntime options/behavior changes
  - `docs/system-drivers/node.mdx`: update when createNodeDriver options change
  - `docs/system-drivers/browser.mdx`: update when createBrowserDriver options change
  - `docs/nodejs-compatibility.mdx`: update when bridge, polyfill, or stub implementations change; keep the Tested Packages section current when adding or removing project-matrix fixtures
  - `docs/cloudflare-workers-comparison.mdx`: update when secure-exec capabilities change; bump "Last updated" date
  - `docs/posix-compatibility.md`: update when kernel, WasmVM, Node bridge, or Python bridge behavior changes for any POSIX-relevant feature (signals, pipes, FDs, process model, TTY, VFS)
  - `docs/wasmvm/supported-commands.md`: update when adding, removing, or changing status of WasmVM commands; keep summary counts current
- `docs-internal/todo.md` is the active backlog; keep it up to date when completing tasks
- when adding new work, add it to todo.md
- when completing work, mark items done in todo.md
- create project skills in `.claude/skills/`
- expose Claude-managed skills to Codex via symlinks in `.codex/skills/`
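
A minimal sketch of the symlink step, assuming one relative directory symlink per skill (the text above does not say whether links are per skill or for the whole directory). `linkSkills` is a hypothetical helper. Only the `.claude/skills/` and `.codex/skills/` paths come from the guidelines.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Returns true when something (file, directory, or symlink) already sits at the path.
function pathExists(target: string): boolean {
  try {
    fs.lstatSync(target);
    return true;
  } catch (err) {
    return false;
  }
}

// Hypothetical helper: link every skill directory into .codex/skills/.
function linkSkills(repoRoot: string): string[] {
  // Set up source and destination directories
  const claudeDir = path.join(repoRoot, ".claude", "skills");
  const codexDir = path.join(repoRoot, ".codex", "skills");
  fs.mkdirSync(codexDir, { recursive: true });

  const linked: string[] = [];
  const entries = fs.readdirSync(claudeDir, { withFileTypes: true });
  for (let i = 0; i < entries.length; i++) {
    const entry = entries[i];
    if (!entry.isDirectory()) continue;

    const linkPath = path.join(codexDir, entry.name);
    if (pathExists(linkPath)) continue;

    // Use a relative target so the link survives moving or re-cloning the repo
    const target = path.relative(codexDir, path.join(claudeDir, entry.name));
    fs.symlinkSync(target, linkPath, "dir");
    linked.push(entry.name);
  }
  return linked;
}
```

Running the helper twice is safe: existing entries are skipped, so the second call returns an empty list.
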