Skip to content

feat(prover): add native Rust policy prover with Z3 solver#741

Open
zredlined wants to merge 11 commits intomainfrom
feat/699-policy-prover-rust/zredlined
Open

feat(prover): add native Rust policy prover with Z3 solver#741
zredlined wants to merge 11 commits intomainfrom
feat/699-policy-prover-rust/zredlined

Conversation

@zredlined
Copy link
Copy Markdown
Collaborator

@zredlined zredlined commented Apr 2, 2026

Summary

Native Rust implementation of the OpenShell Policy Prover (OPP) using Z3 SMT solving. Answers two questions about any sandbox policy:

  1. Can data leave this sandbox? — finds exfiltration paths through uninspected (L4-only) endpoints and wire protocol bypasses
  2. Can the agent write despite read-only intent? — finds write operations that bypass read-only policy through binary capabilities and credential scopes

openshell policy prove is now a native CLI command — no Python, no subprocess, no PYTHONPATH.

Supersedes #703 (Python prototype). Closes #699.

Related Issue

Closes #699

Changes

New crate: crates/openshell-prover/

  • model.rs — Z3 constraint model encoding policy rules, binary capabilities, and credential scopes as boolean satisfiability constraints
  • queries.rs — two verification queries (exfiltration, write bypass)
  • policy.rs — policy YAML parser with intent detection, L7 enforcement helpers, workdir modeling
  • registry.rs — binary capability registry (git, curl, node, claude, etc.) embedded at compile time via include_dir!
  • credentials.rs — credential and API capability loading
  • accepted_risks.rs — risk acceptance matching by query + host
  • report.rs — terminal and compact (CI) output
  • finding.rs — finding types and risk levels (HIGH, CRITICAL only)
  • lib.rs — public prove() API + 7 tests

Registry data (crates/openshell-prover/registry/):

  • 9 binary capability descriptors, 1 API descriptor (GitHub)
  • Embedded into the binary at compile time; --registry CLI flag overrides with a filesystem path for custom registries

CLI integration (crates/openshell-cli/src/main.rs):

  • openshell policy prove calls openshell_prover::prove() directly

Dependencies (Cargo.toml):

  • z3 = "0.19" — links against system libz3 via pkg-config
  • include_dir = "0.7" — compile-time registry embedding
  • glob = "0.3" — replaces hand-rolled glob matching

Build: system Z3 vs bundled

CI uses system libz3-dev (pre-installed in the CI image) so Z3 never compiles from source. sccache only wraps rustc — not the cmake C++ build z3-sys runs — so the bundled feature caused ~30 min CI builds on every cache miss. With the system library, Z3 link time is always zero.

Local development: install Z3 via brew install z3 (macOS) or apt install libz3-dev (Linux). For environments without system Z3, an opt-in feature compiles from source:

cargo build -p openshell-prover --features bundled-z3

Review feedback addressed:

  • Port types changed from u32 to u16 to match runtime validation (prevents silently accepting >65535)
  • include_workdir serde default changed from true to false to match runtime behavior (openshell-policy uses #[serde(default)])
  • Registry loading switched from fragile CARGO_MANIFEST_DIR/CWD filesystem paths to compile-time embedding
  • Hand-rolled glob matching replaced with glob crate
  • Removed unused deps: openshell-policy, openshell-core, thiserror, tracing

Testing

  • cargo test -p openshell-prover --features bundled-z3 — 7/7 tests pass
  • cargo check -p openshell-cli --features openshell-prover/bundled-z3 — CLI compiles
  • End-to-end: openshell policy prove --compact detects exfil + write bypass on test fixture

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • SPDX license headers on all new files
  • Z3 prerequisite documented in CONTRIBUTING.md

@zredlined zredlined self-assigned this Apr 2, 2026
@zredlined zredlined requested a review from a team as a code owner April 2, 2026 20:32
@zredlined zredlined force-pushed the feat/699-policy-prover-rust/zredlined branch from 89af989 to f8a78cf Compare April 4, 2026 04:40
@zredlined
Copy link
Copy Markdown
Collaborator Author

@NVIDIA/openshell-codeowners The Rust checks for PR #741 are failing because z3-sys needs libclang for bindgen, and our ghcr.io/nvidia/openshell/ci:latest image doesn’t include it. I can patch branch-checks.yml to install clang/libclang-dev in the Rust jobs for an immediate unblock, but I think we should also update deploy/docker/Dockerfile.ci and republish the CI image as the proper fix. Do folks prefer the workflow-level unblock first, or do we want to fix the CI image before rerunning checks?

@johntmyers
Copy link
Copy Markdown
Collaborator

@NVIDIA/openshell-codeowners The Rust checks for PR #741 are failing because z3-sys needs libclang for bindgen, and our ghcr.io/nvidia/openshell/ci:latest image doesn’t include it. I can patch branch-checks.yml to install clang/libclang-dev in the Rust jobs for an immediate unblock, but I think we should also update deploy/docker/Dockerfile.ci and republish the CI image as the proper fix. Do folks prefer the workflow-level unblock first, or do we want to fix the CI image before rerunning checks?

I'd make sure it's in the image.

@copy-pr-bot

This comment was marked as off-topic.

@mjamiv
Copy link
Copy Markdown

mjamiv commented Apr 8, 2026

We run 4 OpenClaw AI agent sandboxes on a single OpenShell host, each with its own network policy (~24 endpoints, post-Phase 1 security hardening). Right now we audit for exfiltration paths manually — cross-referencing L4-only endpoints, binary capability sets, and credential scopes across four policy files is tedious and error-prone.

A native openshell policy prove command is exactly what's missing from our workflow. The two query model (exfil path + write bypass) maps directly onto the two threat classes we care most about: data leaving a sandbox through an uninspected tunnel, and an agent acquiring write access beyond its intended scope. Being able to run this in CI on every policy change would close the gap cleanly.

We'd be happy to be early testers once it ships — our policy set is a reasonable real-world fixture (multiple sandboxes, shared host, per-agent binary allowlists, split REST/WebSocket policies). Happy to share sanitized policy YAML if it's useful for the registry or test corpus.

zredlined added 10 commits April 9, 2026 14:40
Add openshell-prover crate implementing formal policy verification
using Z3 SMT solving. Answers two questions about any sandbox policy:
"Can data leave?" and "Can the agent write despite read-only intent?"

Native Rust — no Python subprocess, no PYTHONPATH, no uv dependency.
Z3 bundled via z3-sys for self-contained builds.

Replaces the Python prototype from #703.

Closes #699

Signed-off-by: Alexander Watson <zredlined@gmail.com>
…ntent

Port two fixes from the Python branch:
- Exfil query skips endpoints where L7 is enforced and working
- Write bypass only fires on explicit read-only intent, not L4-only

Signed-off-by: Alexander Watson <zredlined@gmail.com>
z3-sys requires libclang at build time for bindgen to generate FFI
bindings. Without it, the Rust CI jobs fail on the prover crate.
Add libz3-dev to the CI image and drop the z3 `bundled` feature from
the workspace dependency. This eliminates the ~30 min z3 C++ build
that ran on every CI cache miss.

sccache only wraps rustc — it does not intercept the cmake C++ build
that z3-sys runs when `bundled` is enabled. The cargo target cache
helped on warm runs but evicts on toolchain bumps and fork PRs.
With the system library pre-installed in the CI image, z3 link time
is always zero.

A `bundled-z3` opt-in feature is added to the prover crate for local
development without system z3:

  cargo build -p openshell-prover --features bundled-z3

Regular local dev: brew install z3 (macOS) or apt install libz3-dev
(Linux), then cargo build just works.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
Made-with: Cursor
…ntime

Port fields changed from u32 to u16 across all prover types (policy,
model, finding, queries). Prevents the prover from silently accepting
port values >65535 that the runtime rejects, which would produce
misleading PASS results on invalid policies.

Change include_workdir serde default from true to false to match the
runtime (openshell-policy uses #[serde(default)] which gives false).
The previous mismatch caused the prover to model a /sandbox path that
does not exist at runtime, producing false analysis.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
Made-with: Cursor
… glob

Embed binary and API capability registry YAML files into the binary at
compile time using include_dir!. The previous approach used
env!("CARGO_MANIFEST_DIR") which bakes in the build machine's source
path — works in tests but breaks for installed binaries. The CWD
fallback was equally fragile.

The --registry CLI flag still works as a filesystem override for custom
registries. Credentials remain filesystem-loaded (user-supplied data).

Replace the hand-rolled glob matching (~60 lines) with the glob crate's
Pattern::matches(), which is already a transitive dependency.

Remove unused dependencies: openshell-policy, openshell-core, thiserror,
tracing. The prover parses policy YAML directly into its own types and
does not use the policy or core crate APIs.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
Made-with: Cursor
The prover crate now links against system libz3 instead of compiling
from source. Document the install steps for macOS, Ubuntu, and Fedora,
and note the bundled-z3 feature flag as a fallback.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
Made-with: Cursor
@zredlined zredlined force-pushed the feat/699-policy-prover-rust/zredlined branch from 42bb028 to 1b1a99e Compare April 9, 2026 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: Add formal policy verification (OpenShell Policy Prover) to Python SDK

3 participants