fix(sandbox): harden seccomp denylist, SSRF protection, and inference policy enforcement#819

Merged
johntmyers merged 4 commits into main from fix/tava-security-hardening-batch-1
Apr 13, 2026

Conversation

@johntmyers
Collaborator

Summary

Addresses the top 4 immediate-priority findings from the NemoClaw-OpenShell TAVA security architecture review (Table 8.7 FSRs). All changes are in openshell-sandbox: surgical fixes across three files (seccomp.rs, proxy.rs, mechanistic_mapper.rs), with full test coverage.

Closes OS-56, closes OS-63, closes OS-58, closes OS-59, closes OS-61, closes OS-68

Changes

Seccomp hardening (seccomp.rs)

  • Remove seccomp skip in Allow mode: The apply() function previously returned early for NetworkMode::Allow, skipping all syscall filtering. Now unconditional blocks and conditional arg-based blocks always apply; only socket domain blocks remain conditional on network mode.
  • Block cross-process manipulation syscalls: Add process_vm_writev, pidfd_open, pidfd_getfd, pidfd_send_signal to the unconditional denylist, symmetric with existing ptrace and process_vm_readv blocks.
  • Block namespace/mount bypass syscalls: Add clone/clone3 with CLONE_NEWUSER flag (masked arg rules), new mount API syscalls (fsopen, fsconfig, fsmount, fspick, move_mount, open_tree), and namespace manipulation (setns, umount2, pivot_root).
  • Block kernel exploit primitives: Add userfaultfd and perf_event_open, consistent with Docker's default seccomp profile.
  • Refactor build_filter for testability: Split build_filter into build_filter_rules plus a separate compilation step so tests can inspect the rules map directly.
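
To illustrate the split between rule construction and compilation, here is a minimal sketch of how an inspectable rules map with a masked-argument rule might look. It does not reproduce the real seccomp.rs: the `Rule` type, the `is_denied` helper, and this body of `build_filter_rules` are hypothetical, only a subset of the syscalls is listed, only `clone` is modelled for the conditional case, and the flags argument is assumed to be argument 0 as on x86_64.

```rust
use std::collections::BTreeMap;

/// CLONE_NEWUSER as defined in <linux/sched.h>.
const CLONE_NEWUSER: u64 = 0x1000_0000;

/// Hypothetical rule type: how one denylist entry matches a syscall.
#[derive(Debug, Clone, PartialEq)]
enum Rule {
    /// Deny the syscall regardless of its arguments.
    Always,
    /// Deny only when `(args[arg_index] & mask) == value`.
    MaskedEq { arg_index: usize, mask: u64, value: u64 },
}

impl Rule {
    fn denies(&self, args: &[u64; 6]) -> bool {
        match *self {
            Rule::Always => true,
            Rule::MaskedEq { arg_index, mask, value } => (args[arg_index] & mask) == value,
        }
    }
}

/// Builds the rules map only; compiling it to BPF would be a separate step,
/// which is what lets tests inspect the map directly.
fn build_filter_rules() -> BTreeMap<&'static str, Rule> {
    let mut rules = BTreeMap::new();
    // A subset of the unconditional blocks named in this PR.
    for name in [
        "process_vm_writev", "pidfd_open", "pidfd_getfd", "pidfd_send_signal",
        "setns", "umount2", "pivot_root", "userfaultfd", "perf_event_open",
    ] {
        rules.insert(name, Rule::Always);
    }
    // Conditional block: deny clone only when CLONE_NEWUSER is set in the flags.
    rules.insert(
        "clone",
        Rule::MaskedEq { arg_index: 0, mask: CLONE_NEWUSER, value: CLONE_NEWUSER },
    );
    rules
}

/// Hypothetical helper: would this syscall invocation be denied?
fn is_denied(rules: &BTreeMap<&'static str, Rule>, name: &str, args: &[u64; 6]) -> bool {
    rules.get(name).map_or(false, |rule| rule.denies(args))
}
```

With this shape, a test can assert on the map itself (for example, that `pidfd_open` is present) without loading a filter into the kernel, and a masked rule leaves ordinary `clone` calls without CLONE_NEWUSER unaffected.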

Inference policy enforcement (proxy.rs)

  • Fix keep-alive policy bypass: Change else if !routed_any to an unconditional else, so a non-inference request on a keep-alive connection that has already routed inference traffic is denied and the connection closed, rather than silently ignored.
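
The control-flow change can be sketched as a simplified per-connection loop. This is a model under stated assumptions, not the code in proxy.rs: the `Request` and `Outcome` types and the `handle_connection` function are hypothetical, and request classification is reduced to an enum.

```rust
/// Hypothetical classification of a request on an inference connection.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Request {
    Inference,
    Other,
}

/// Hypothetical per-request result.
#[derive(Debug, PartialEq)]
enum Outcome {
    Routed,
    DeniedAndClosed,
}

/// Processes the requests arriving on one keep-alive connection, in order.
fn handle_connection(requests: &[Request]) -> Vec<Outcome> {
    let mut outcomes = Vec::new();
    for request in requests {
        if *request == Request::Inference {
            outcomes.push(Outcome::Routed);
        } else {
            // Before the fix this branch was `else if !routed_any`, so once an
            // inference request had been routed, a later non-inference request
            // fell through with no denial. The unconditional `else` denies it
            // and stops processing the connection.
            outcomes.push(Outcome::DeniedAndClosed);
            break;
        }
    }
    outcomes
}
```

In this model, a connection carrying an inference request followed by any other request yields `Routed` and then `DeniedAndClosed`, and nothing after the denial is processed.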

SSRF protection (proxy.rs, mechanistic_mapper.rs)

  • Add CGNAT and special-use IP ranges: Both copies of is_internal_ip now block CGNAT 100.64.0.0/10 (RFC 6598), IETF protocol assignments 192.0.0.0/24 (RFC 6890), benchmarking 198.18.0.0/15 (RFC 2544), TEST-NET-2 198.51.100.0/24 (RFC 5737), and TEST-NET-3 203.0.113.0/24 (RFC 5737). Extract a shared is_internal_v4 helper to reduce duplication within each file.
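
As a rough illustration of the new range checks, the sketch below covers only the five ranges added here and unwraps IPv4-mapped IPv6 addresses before checking them. The function names `is_new_special_use_v4` and `is_blocked_by_new_ranges` are hypothetical; the real is_internal_v4 also carries the pre-existing checks (loopback, private, link-local, and so on), which are omitted.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Hypothetical helper covering only the ranges added in this PR.
fn is_new_special_use_v4(ip: Ipv4Addr) -> bool {
    let [a, b, c, _] = ip.octets();
    (a == 100 && (b & 0xC0) == 64)           // CGNAT 100.64.0.0/10 (RFC 6598)
        || (a == 192 && b == 0 && c == 0)    // IETF protocol assignments 192.0.0.0/24
        || (a == 198 && (b & 0xFE) == 18)    // Benchmarking 198.18.0.0/15 (RFC 2544)
        || (a == 198 && b == 51 && c == 100) // TEST-NET-2 198.51.100.0/24
        || (a == 203 && b == 0 && c == 113)  // TEST-NET-3 203.0.113.0/24
}

/// Hypothetical wrapper: applies the IPv4 check to plain and IPv4-mapped addresses.
fn is_blocked_by_new_ranges(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_new_special_use_v4(v4),
        // `::ffff:a.b.c.d` must not be usable to sidestep the IPv4 checks.
        IpAddr::V6(v6) => v6.to_ipv4_mapped().map_or(false, is_new_special_use_v4),
    }
}
```

The /10 and /15 prefixes do not fall on octet boundaries, which is why the second octet is masked rather than compared directly: 100.64.0.0 through 100.127.255.255 match, while 100.63.255.255 and 100.128.0.0 do not.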

Testing

  • All 416 existing sandbox tests pass
  • New tests added:
    • unconditional_blocks_present_in_filter — verifies all 21 unconditional syscall blocks
    • conditional_blocks_have_rules — verifies clone, clone3, unshare, execveat, seccomp conditional rules
    • test_rejects_ipv4_cgnat — CGNAT boundary tests (both accept and reject)
    • test_rejects_ipv4_special_use_ranges — all new special-use ranges
    • test_rejects_ipv6_mapped_cgnat — IPv4-mapped IPv6 CGNAT addresses
    • test_is_internal_ip_cgnat / test_is_internal_ip_special_use — mechanistic mapper equivalents

@johntmyers johntmyers requested a review from a team as a code owner April 13, 2026 16:46
@johntmyers johntmyers added area:sandbox Sandbox runtime and isolation work topic:security Security issues labels Apr 13, 2026
@johntmyers johntmyers force-pushed the fix/tava-security-hardening-batch-1 branch 2 times, most recently from 5d2f13d to 3fcf9ff Compare April 13, 2026 17:12
… policy enforcement

- Remove seccomp skip in NetworkMode::Allow so baseline syscall
  restrictions apply regardless of network mode
- Block cross-process manipulation syscalls (process_vm_writev,
  pidfd_open, pidfd_getfd, pidfd_send_signal) symmetric with existing
  ptrace and process_vm_readv blocks
- Block clone/clone3 with CLONE_NEWUSER flag, new mount API syscalls
  (fsopen, fsconfig, fsmount, fspick, move_mount, open_tree), and
  namespace manipulation (setns, umount2, pivot_root)
- Block userfaultfd and perf_event_open consistent with Docker default
  seccomp profile
- Deny and close keep-alive inference connections after a non-inference
  request instead of silently continuing the loop
- Add CGNAT (100.64.0.0/10), benchmarking (198.18.0.0/15), and other
  special-use IP ranges to SSRF protection in both proxy and
  mechanistic mapper
@johntmyers johntmyers force-pushed the fix/tava-security-hardening-batch-1 branch from 3fcf9ff to 84834ab Compare April 13, 2026 17:48
@johntmyers johntmyers marked this pull request as draft April 13, 2026 17:56
@johntmyers johntmyers self-assigned this Apr 13, 2026
@johntmyers johntmyers added the test:e2e Requires end-to-end coverage label Apr 13, 2026
@johntmyers johntmyers marked this pull request as ready for review April 13, 2026 18:07
@pimlock pimlock mentioned this pull request Apr 13, 2026
3 tasks
@johntmyers johntmyers merged commit 1cabd25 into main Apr 13, 2026
15 of 16 checks passed
@johntmyers johntmyers deleted the fix/tava-security-hardening-batch-1 branch April 13, 2026 19:16
