
OpenShell runtime + manifest(execution) contract #282

Open
andri-coral wants to merge 8 commits into feat/docker_hardening from feat/openshell_runtime

andri-coral (Collaborator) commented May 11, 2026

Why this matters

This PR adds kernel-level network mediation, filesystem ACLs, and custom syscall filtering by adopting OpenShell, plus a manifest contract that lets agent authors declare their egress needs.

Marketplace agents that opt into the openshell runtime get OPA-policed network egress (default-deny + declared external_hosts), Landlock filesystem ACLs, custom seccomp BPF, and an irreversible UID drop, all provided by one bind-mounted ELF binary.

Runtime

OpenShellRuntime wraps the supervisor as PID 1 of each agent container. The container needs the following capabilities only briefly, because the supervisor drops to UID 65532 after policy load: SYS_ADMIN, NET_ADMIN, SYS_PTRACE, SETUID, SETGID, DAC_READ_SEARCH. The runtime bind-mounts the supervisor binary and a per-session policy directory; the inner entrypoint forwards to the agent command.

The DockerLauncher object is replaced with a top-level launchDockerContainer(spec, ctx, onLogLine) in agent/runtime/DockerExecution.kt, shared between DockerRuntime and OpenShellRuntime. The onLogLine hook parses the supervisor's OCSF stdout and emits SessionEvent.EgressPolicyViolation on denial lines.
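To illustrate the onLogLine hook, here is a minimal sketch of turning a denial line into an event. It assumes a line shaped like the one quoted in the smoke evidence (`NET:OPEN [MED] DENIED ... <host>:<port> [policy:- engine:opa]`); the fields elided there are unknown, so the real supervisor format may differ. `EgressDenial` and `parseEgressDenial` are hypothetical names, not the PR's code.

```kotlin
// Hypothetical stand-in for SessionEvent.EgressPolicyViolation(agentName, protocol, host, port).
data class EgressDenial(val agentName: String, val protocol: String, val host: String, val port: Int)

// Assumed line shape: "... NET:OPEN [MED] DENIED ... <host>:<port> [policy:... engine:opa]".
// Group 1 = activity (e.g. NET:OPEN), group 2 = host, group 3 = port.
val denialPattern = Regex("""\b([A-Z]+:[A-Z]+)\b.*\bDENIED\b.*?([A-Za-z0-9.\-]+):(\d{1,5})\s+\[policy:""")

// Returns an event for a denial line, or null for any other line (ALLOWED, config, noise).
fun parseEgressDenial(agentName: String, line: String): EgressDenial? {
    val match = denialPattern.find(line) ?: return null
    val (protocol, host, portText) = match.destructured
    // Reject out-of-range ports rather than emitting a malformed event.
    val port = portText.toIntOrNull()?.takeIf { it in 1..65535 } ?: return null
    return EgressDenial(agentName, protocol, host, port)
}
```

Returning null for non-matching lines lets the hook be called on every stdout line without special-casing ALLOWED or configuration output.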

Manifest contract

[execution]
min_isolation = "container"
external_hosts = [ "api.github.com" ]

ExecutionPolicyResolver.validate(declared, policy, source, runtime, trust, openShellConfig) runs in GraphAgentRequest.toGraphAgent (preflight). Rejections are wrapped in AgentRequestException → HTTP 400 via StatusPages:

  • IsolationUnsupported — declared exceeds operator ceiling
  • IsolationIncompatibleWithRuntime — declared CONTAINER + runtime can't provide
  • HostDenied — declared external host hits operator deny / not in allow
  • SandboxUnavailable — runtime = OPENSHELL but supervisor null / not executable
  • RuntimeIncompatibleWithTrust — OPENSHELL + UID-pinned tier OR RO rootfs without /run tmpfs
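The first and third checks above can be sketched roughly as follows. This is a simplified illustration, not ExecutionPolicyResolver itself: the type names, the two-argument `validate`, and the isolation ladder are assumptions (only "container" appears in this PR), and the sketch assumes a host must be explicitly present in the allow list.

```kotlin
// Hypothetical isolation ladder, ordered weakest to strongest; only "container" is from the PR.
enum class Isolation { NONE, CONTAINER, VM }

sealed interface Rejection {
    data class IsolationUnsupported(val declared: Isolation, val ceiling: Isolation) : Rejection
    data class HostDenied(val host: String) : Rejection
}

// What the agent manifest declares under [execution].
data class Declared(val minIsolation: Isolation, val externalHosts: List<String>)

// What the operator configures, e.g. under [execution.marketplace].
data class OperatorPolicy(
    val maxSupportedIsolation: Isolation,
    val allowedHosts: Set<String>,
    val deniedHosts: Set<String>,
)

// Returns the first rejection found, or null when the declaration passes.
fun validate(declared: Declared, policy: OperatorPolicy): Rejection? {
    // Enum ordinals give the ordering: declared must not exceed the operator ceiling.
    if (declared.minIsolation > policy.maxSupportedIsolation) {
        return Rejection.IsolationUnsupported(declared.minIsolation, policy.maxSupportedIsolation)
    }
    // A host is rejected if it is on the deny list or missing from the allow list.
    val badHost = declared.externalHosts.firstOrNull { host ->
        host in policy.deniedHosts || host !in policy.allowedHosts
    }
    return badHost?.let { Rejection.HostDenied(it) }
}
```

Modelling rejections as a sealed hierarchy keeps the mapping to HTTP 400 in one exhaustive `when`, which is one plausible reason to structure the preflight this way.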

Observability

SessionEvent.EgressPolicyViolation(agentName, protocol, host, port) broadcasts over the existing WS event flow. coral-studio gains a case 'egress_policy_violation' → toast.warning(...).

Coral-managed allowed_ips: /32 of the resolved dockerConfig.address for non-loopback IP literals; RFC1918 + fc00::/7 fallback for hostnames or loopback (Docker Desktop on macOS resolves host.docker.internal to IPv6 ULA inside containers).
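The allowed_ips rule can be sketched as below. The function names are hypothetical, and the sketch makes two simplifying assumptions: it recognises only dotted-quad IPv4 literals (an IPv6 literal falls through to the fallback ranges), and it treats all of 127.0.0.0/8 as loopback.

```kotlin
// RFC 1918 private IPv4 ranges plus the IPv6 unique-local range fc00::/7.
val fallbackRanges = listOf("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "fc00::/7")

val ipv4Literal = Regex("""(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})""")

// Returns the four octets if `address` is a dotted-quad IPv4 literal, else null.
// Pure string parsing, so hostnames never trigger a DNS lookup here.
fun parseIpv4(address: String): List<Int>? {
    val octets = ipv4Literal.matchEntire(address)?.groupValues?.drop(1)?.map { it.toInt() }
        ?: return null
    return octets.takeIf { all -> all.all { it in 0..255 } }
}

// Non-loopback IPv4 literal: pin to that single address (/32).
// Hostname or loopback: fall back to the private ranges.
fun allowedIpsFor(address: String): List<String> {
    val octets = parseIpv4(address)
    val isLoopback = octets != null && octets[0] == 127
    return if (octets != null && !isLoopback) listOf("$address/32") else fallbackRanges
}
```

For example, `allowedIpsFor("10.42.0.1")` yields `["10.42.0.1/32"]`, while `allowedIpsFor("host.docker.internal")` yields the four fallback ranges.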

Supervisor's default --log-level = warn hides OCSF lines; smoke config sets OPENSHELL_LOG_LEVEL = "info" via [debug.additionalDockerEnvironment].

Scope and known gaps

  • Out of scope: credential brokering (openshell:resolve:env:*, Stage 3+ Tier 2); openshell_marketplace tier variant (UID-less profile that lets the supervisor own privilege-drop); studio Tier-3 panels for sandboxBackend + violation feed; persistent event sink; multi-host orchestration.
  • requireImageDigest still default-off — flips when marketplace signs digests.

Operator overrides

[openshell]
supervisor_path = "/usr/local/bin/openshell-sandbox"
expected_supervisor_version = "0.0.48"     # WARN-only on mismatch

[execution.marketplace]
max_supported_isolation = "container"
allowed_hosts = [ "api.github.com" ]
denied_hosts  = [ "metadata.google.internal", "169.254.169.254" ]

[debug.additionalDockerEnvironment]
OPENSHELL_LOG_LEVEL = "info"   # surface OCSF ALLOWED/DENIED lines

Agent images need iproute2, libcap2-bin, ca-certificates, nftables, a sandbox user, and a glibc ≥ 2.38 rootfs (debian:trixie / ubuntu:noble / distroless-glibc) — see resources/openshell/Dockerfile.example.

Smoke verification

192 / 192 unit tests pass. End-to-end smoke matrix on Apple Silicon + Docker Desktop (linux/arm64 supervisor built from source, NVIDIA 28e1ff7b):

| # | Scenario | Result |
| --- | --- | --- |
| T1 | Happy path: agent → MCP → LLM proxy → OpenAI 200 | ✓ 8 ALLOWED OCSF lines, session closed cleanly |
| T2 | Denied egress (api.evil.example.com) | ✓ OCSF NET:OPEN MED DENIED + EgressPolicyViolation event + agent 403 |
| T3 | Trust-tier gate (UID-pinned profile + OPENSHELL) | ✓ HTTP 400 RuntimeIncompatibleWithTrust |
| T4 | Supervisor path unset | ✓ HTTP 400 SandboxUnavailable |
| T5 | Supervisor path non-executable | ✓ HTTP 400 SandboxUnavailable |
| T7 | /32 allowed_ips for IP-literal docker.address | ✓ rendered ["10.42.0.1/32"] |
| T8 | RFC1918 + fc00::/7 fallback for hostname | ✓ rendered correctly |

Linux smoke on warwick

| Verified on warwick | Evidence |
| --- | --- |
| Landlock V2 active | OCSF CONFIG:APPLYING … abi:V2 compat:BestEffort; Landlock ruleset built [rules_applied:11 skipped:0] |
| /32 IPv4 allowed_ips under live traffic | rendered allowed_ips: ["172.17.0.1/32"]; every MCP + LLM-proxy call logs [policy:coral_api engine:opa] ALLOWED |
| linux/amd64 supervisor end-to-end | amd64 ELF runs T1+T2; OpenAI chat completions 200; T2 emits egress_policy_violation: NET:OPEN api.evil.example.com:443 |
| nftables bypass-detection layer | OCSF CONFIG:INSTALLED Bypass detection rules installed; agent's raw socket.connect(("93.184.216.34", 443)) → ConnectionRefusedError: [Errno 111] from kernel REJECT |
| supervisor anchored to v0.0.48 release | bundled rego re-vendored from NVIDIA/openshell@v0.0.48 (+234 lines vs prior); binary verified openshell-sandbox 0.0.48; agent base bumped to debian:trixie-slim for glibc 2.38+ |
| T2 OPA denial path verified on v0.0.48 rego | with CORAL_T2_PROBE=1: supervisor emits OCSF NET:OPEN [MED] DENIED ... api.evil.example.com:443 [policy:- engine:opa]; coral-server emits egress_policy_violation: NET:OPEN api.evil.example.com:443; agent sees URLError: Tunnel connection failed: 403 Forbidden |

andri-coral changed the base branch from master to feat/docker_hardening on May 11, 2026 15:29
andri-coral changed the title from "openshell runtime" to "OpenShell runtime + manifest [execution] contract" on May 12, 2026
andri-coral changed the title from "OpenShell runtime + manifest [execution] contract" to "OpenShell runtime + manifest(execution) contract" on May 12, 2026
andri-coral marked this pull request as ready for review on May 27, 2026 13:01
andri-coral requested review from CaelumF and seafraf on May 27, 2026 13:02