fix(sandbox): egress-guard init uses sandbox image (has iptables), not distroless router (v0.1.12)#448

Merged
pallakatos merged 1 commit into main from fix/egress-guard-distroless-image
Jun 24, 2026

@pallakatos

Collaborator

Critical: every sandbox is stuck Init:CrashLoopBackOff on current-release AKS

Confirmed on a live cluster:

egress-guard  StartError exit=128
failed to create containerd task: ... exec: "sh": executable file not found in $PATH

The egress-guard init container runs sh -c "iptables ..." to install the per-pod egress lockdown, but it was using the inference-router image, which became AL3 distroless (no sh, no iptables) in #383. That is the same distroless migration that broke the controller probes (fixed in v0.1.10). The init container can't start, so no sandbox ever runs.

Fix

Use ctx.sandbox_image for the egress-guard init container. The sandbox base image installs iptables + util-linux and ships a shell (sandbox-images/openclaw/Dockerfile.base:171). It is also already pulled on the node, because it is the agent container's image, so the fix adds no new image and no extra pull. It's the same image whose entrypoint does the equivalent iptables setup in Docker mode, so K8s now matches Docker.
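For readers who don't have the controller source open, the shape of the change can be sketched roughly as follows. This is a simplified illustration, not the controller's actual code: only `ctx.sandbox_image` and the `egress-guard` container name come from this PR, while `BuildContext`, `InitContainer`, `router_image`, and `egress_guard_init` are hypothetical names, and the iptables script is passed in as an opaque string because the real ruleset is not shown here.

```rust
/// Hypothetical stand-in for the controller's per-sandbox build context.
/// Only `sandbox_image` is named in the PR; `router_image` is an assumed
/// field representing the distroless inference-router image.
#[derive(Debug, Clone)]
pub struct BuildContext {
    pub sandbox_image: String,
    pub router_image: String,
}

/// Minimal model of an init container, limited to the fields relevant here.
#[derive(Debug, Clone, PartialEq)]
pub struct InitContainer {
    pub name: String,
    pub image: String,
    pub command: Vec<String>,
}

/// Build the egress-guard init container (hypothetical helper).
///
/// The command needs both `sh` and `iptables` on the image's PATH, so the
/// image must be the sandbox image rather than the distroless router image.
pub fn egress_guard_init(ctx: &BuildContext, iptables_script: &str) -> InitContainer {
    InitContainer {
        name: "egress-guard".to_string(),
        // Before the fix this would have been the router image, which has no shell.
        image: ctx.sandbox_image.clone(),
        command: vec![
            "sh".to_string(),
            "-c".to_string(),
            iptables_script.to_string(),
        ],
    }
}
```

A unit test along these lines could assert that the init container's image equals `ctx.sandbox_image` and differs from the router image, which would catch a regression to the distroless image before it reaches a cluster.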

Safety

Privilege envelope unchanged (runAsUser:0, NET_ADMIN+NET_RAW, seccomp Unconfined, all else dropped) and the iptables ruleset is byte-identical. The guard was previously completely non-functional (fail-closed: nothing ran); this restores it.
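The "privilege envelope unchanged" claim is the kind of invariant that can be pinned down in a test. The sketch below models the envelope described above as a plain struct. `GuardSecurity` and `egress_guard_security` are hypothetical names, and representing "all else dropped" as dropping the `ALL` capability set is an assumption about how the controller expresses it.

```rust
/// Hypothetical model of the egress-guard security settings listed in this PR.
#[derive(Debug, Clone, PartialEq)]
pub struct GuardSecurity {
    pub run_as_user: i64,
    pub add_capabilities: Vec<&'static str>,
    pub drop_capabilities: Vec<&'static str>,
    pub seccomp_profile: &'static str,
}

/// Return the envelope as stated in the PR description:
/// runAsUser:0, NET_ADMIN + NET_RAW, seccomp Unconfined, all else dropped.
pub fn egress_guard_security() -> GuardSecurity {
    GuardSecurity {
        run_as_user: 0,
        add_capabilities: vec!["NET_ADMIN", "NET_RAW"],
        // Assumption: "all else dropped" is expressed as dropping ALL.
        drop_capabilities: vec!["ALL"],
        seccomp_profile: "Unconfined",
    }
}

/// True when the settings grant exactly the two networking capabilities
/// and nothing more. A check like this can guard against envelope drift
/// when the image or container spec changes.
pub fn is_expected_envelope(sec: &GuardSecurity) -> bool {
    sec.run_as_user == 0
        && sec.add_capabilities == ["NET_ADMIN", "NET_RAW"]
        && sec.drop_capabilities == ["ALL"]
        && sec.seccomp_profile == "Unconfined"
}
```

Keeping the envelope in one function and asserting on it means an image swap like this one cannot silently widen privileges.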

Verification

  • Root cause confirmed live (StartError, egress-guard on kars-inference-router:latest; sandbox base installs iptables).
  • cargo build --release + full controller test suite pass.
  • Will re-verify live after the v0.1.12 controller image rolls out.

Security audit: docs/internal/security-audits/2026-06-24-egress-guard-distroless-image.md (2 sign-offs).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…les), not the distroless router

Every sandbox pod was stuck Init:CrashLoopBackOff on current-release AKS:

  egress-guard StartError exit=128
  exec: "sh": executable file not found in $PATH

The egress-guard init runs 'sh -c "iptables ..."' but was using the
inference-router image, which became AL3 distroless (no sh, no iptables) in
#383 — the same distroless migration that broke the controller probes. So the
init crashlooped and no agent ever started.

Fix: use ctx.sandbox_image for the egress-guard init. The sandbox base image
installs iptables + util-linux and has a shell (Dockerfile.base:171), and is
already pulled on the node (it's the agent container). Same image whose
entrypoint does the equivalent iptables setup in Docker mode.

Privilege envelope + iptables ruleset unchanged. Verified root cause on live
cluster; cargo build --release + controller tests pass.

Security audit: docs/internal/security-audits/2026-06-24-egress-guard-distroless-image.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@pallakatos pallakatos merged commit 90a8aaa into main Jun 24, 2026
32 checks passed
@pallakatos pallakatos deleted the fix/egress-guard-distroless-image branch June 24, 2026 15:24