fix(distroless): router self-probe sweep + prompt-shields default + kind staleness + e2e gate (v0.1.13) #449
Merged
…ind staleness + e2e gate

The AL3 distroless move (#383) removed sh/curl/iptables from the controller/inference-router/a2a/conformance images. Everything that exec'd those tools into the distroless inference-router broke on the real distroless path (AKS + kind --release): operator AGT/metrics panels, kars egress/policy/model/add/handoff.

Fixes:
- inference-router: new 'kars-inference-router probe [GET|POST] <path> [body]' subcommand. It hits its own localhost:8443, reads the admin token internally (Authorization: Bearer), and is present in the distroless router AND the sandbox image.
- CLI: replace every kubectl-exec curl/sh/wget/cat into -c inference-router with the probe binary (operator x8, egress, policy, model, add, handoff). Docker-mode execs (tool-rich sandbox container) unchanged.
- prompt-shields default OFF + --require-prompt-shields opt-in (bare Foundry emits no prompt_filter_results -> prior default blocked every response).
- dev/local-k8s: loadImageIntoKind verifies the node image by ID + re-imports on mismatch, so 'kars dev --release' now loads the real distroless images instead of a stale :dev (the bug that masked all of this locally).
- e2e: test_sandbox_pod_starts asserts the sandbox POD passes the egress-guard init AND the router self-probe works. This is the regression gate that was missing.

docker mode clean (single tool-rich container, router in-process).

Verified: cargo build --release + probe smoke-test; CLI tsc + lint(0) + 821 tests; bash -n e2e.

Security audit: docs/internal/security-audits/2026-06-24-distroless-router-probe-sweep.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
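As an illustration of the CLI-side change, a call that previously exec'd curl into the router container could instead build a `kubectl exec` argument list that runs the probe binary. This is a minimal sketch, not the code from this PR: `buildProbeExecArgs`, `ProbeTarget`, the namespace and pod names, and the `/healthz` path are all hypothetical. Only the container name and the `kars-inference-router probe [GET|POST] <path> [body]` shape come from the commit message.

```typescript
// Hypothetical helper: names and parameters are illustrative, not taken from the PR.
type ProbeMethod = "GET" | "POST";

interface ProbeTarget {
  namespace: string;
  pod: string;
}

// Builds the argv for `kubectl exec` that runs the router's own probe
// subcommand inside the inference-router container. No curl/sh/wget/cat is
// needed in the image, and the admin token never leaves the container.
function buildProbeExecArgs(
  target: ProbeTarget,
  method: ProbeMethod,
  path: string,
  body?: unknown,
): string[] {
  if (!path.startsWith("/")) {
    throw new Error(`probe path must start with "/": ${path}`);
  }
  const args = [
    "exec", "-n", target.namespace, target.pod,
    "-c", "inference-router",
    "--",
    "kars-inference-router", "probe", method, path,
  ];
  if (body !== undefined) {
    // The optional body is passed as a single JSON argument.
    args.push(JSON.stringify(body));
  }
  return args;
}

// Usage with made-up namespace, pod, and path values.
const probeArgs = buildProbeExecArgs({ namespace: "demo", pod: "sandbox-0" }, "GET", "/healthz");
console.log(["kubectl", ...probeArgs].join(" "));
```

Passing the arguments as an array rather than a shell string matters here: there is no shell in the distroless image to interpret quoting, so each element has to arrive as its own argv entry.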
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
…or distroless gate

The new test_sandbox_pod_starts gate caught a real gap: the e2e never loaded a sandbox image, so the egress-guard init (now on ctx.sandbox_image) hit ErrImagePull.

Two fixes:
- controller: the egress-guard init container now sets imagePullPolicy: pull_policy (same as the agent container; both run the sandbox image). Neutral on AKS (Always for :latest), correct on kind (IfNotPresent), so a loaded sandbox image is authoritative instead of force-pulling ACR.
- e2e: tests/e2e/Dockerfile.sandbox-stub (azurelinux+iptables, same base/backend as the production sandbox) is loaded as kars-sandbox-e2e:dev; SANDBOX_IMAGE points at it so the gate runs the egress-guard's real iptables. Gate diagnostics now distinguish ErrImagePull (harness) from a tool break (the regression class).

Security audit addendum appended (no control weakened).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
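The AKS versus kind behaviour above matches the tag-based rule Kubernetes itself applies when `imagePullPolicy` is omitted: `:latest` or no tag resolves to `Always`, any other tag or a digest resolves to `IfNotPresent`. How the controller actually computes `pull_policy` is not shown in this PR, so the sketch below simply assumes it follows that rule. `defaultPullPolicy`, `sandboxContainers`, the `agent` container name, and the ACR hostname are hypothetical.

```typescript
type PullPolicy = "Always" | "IfNotPresent";

interface ContainerSpec {
  name: string;
  image: string;
  imagePullPolicy: PullPolicy;
}

// Assumption: mirrors the Kubernetes defaulting rule for an omitted
// imagePullPolicy; the controller's real derivation may differ.
function defaultPullPolicy(image: string): PullPolicy {
  if (image.includes("@")) {
    return "IfNotPresent"; // pinned by digest
  }
  // Look for the tag only in the last path segment so that a registry
  // port such as "localhost:5000/img" is not mistaken for a tag.
  const lastSegment = image.slice(image.lastIndexOf("/") + 1);
  const colon = lastSegment.indexOf(":");
  const tag = colon === -1 ? "latest" : lastSegment.slice(colon + 1);
  return tag === "latest" ? "Always" : "IfNotPresent";
}

// Both containers run the sandbox image, so both get the same policy.
// Before the fix described above, only the agent container did.
function sandboxContainers(sandboxImage: string): { init: ContainerSpec; agent: ContainerSpec } {
  const pullPolicy = defaultPullPolicy(sandboxImage);
  return {
    init: { name: "egress-guard", image: sandboxImage, imagePullPolicy: pullPolicy },
    agent: { name: "agent", image: sandboxImage, imagePullPolicy: pullPolicy },
  };
}

// kind e2e: a locally loaded image is used as-is.
console.log(sandboxContainers("kars-sandbox-e2e:dev").init.imagePullPolicy); // IfNotPresent
// AKS-style reference (hypothetical registry): still pulled every time.
console.log(sandboxContainers("example.azurecr.io/kars-sandbox:latest").init.imagePullPolicy); // Always
```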
Summary

Completes the distroless-move cleanup (#383) across AKS, local-k8s, and docker. The AL3 distroless images dropped sh/curl/iptables; every `kubectl exec` of those tools into the distroless inference-router broke on the real distroless path: operator AGT/metrics panels and `kars egress|policy|model|add|handoff`. (The egress-guard init and controller probes were the earlier-fixed instances.)

The unifying fix

New `kars-inference-router probe [GET|POST] <path> [json-body]` subcommand: it hits the router's own `127.0.0.1:8443`, reads the admin token internally (`Authorization: Bearer`), and is present in both the distroless router image and the sandbox image. Every CLI/operator `curl`/`sh`/`wget`/`cat`-into-distroless call now uses it. No tools added to any hardened image. Docker-mode execs (tool-rich sandbox container) unchanged.

Also

- Prompt-shields default OFF + `--require-prompt-shields` opt-in (bare Foundry emits no `prompt_filter_results` → prior default blocked every response).
- `loadImageIntoKind` verifies the node image by ID and re-imports on mismatch, so `kars dev --release` now actually loads the distroless images instead of a stale `:dev` (the bug that masked all of this locally).
- `test_sandbox_pod_starts` asserts the sandbox pod passes the egress-guard init AND the router self-probe works. The e2e previously only checked ns/NetworkPolicy/SA, never the pod, which is exactly why this class shipped uncaught.

Verification

`cargo build --release` + probe smoke-tested; CLI `tsc` + lint (0 errors) + 821 tests (incl. new `refs.test.ts`); `bash -n` e2e clean. End-to-end distroless validation runs via the new e2e gate (Linux CI) and `kars dev --release` on a fresh kind.

Security audit: `docs/internal/security-audits/2026-06-24-distroless-router-probe-sweep.md` (2 sign-offs). No runtime security control weakened.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
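The kind staleness fix listed under "Also" comes down to comparing image IDs rather than trusting that a tag already exists on the node. A minimal sketch of that check follows, with the container-engine and node lookups injected as functions so the logic can be exercised without a cluster. `KindImageOps`, `ensureNodeImage`, `fakeOps`, and the image name are hypothetical; the real `loadImageIntoKind` may be structured differently.

```typescript
// Hypothetical seams for the three operations the check needs.
interface KindImageOps {
  localImageId(image: string): string;            // ID of the freshly built local image
  nodeImageId(image: string): string | undefined; // ID on the kind node, if present
  importIntoNode(image: string): void;            // (re-)import the local image into the node
}

type LoadResult = "already-current" | "reimported";

// Verify by ID, re-import on mismatch, then verify again so a silent
// import failure cannot leave a stale image in place.
function ensureNodeImage(image: string, ops: KindImageOps): LoadResult {
  const wanted = ops.localImageId(image);
  if (ops.nodeImageId(image) === wanted) {
    return "already-current";
  }
  ops.importIntoNode(image);
  const actual = ops.nodeImageId(image);
  if (actual !== wanted) {
    throw new Error(`node has ${actual ?? "no image"} for ${image}, wanted ${wanted}`);
  }
  return "reimported";
}

// In-memory stand-in for a kind node, for demonstration only.
function fakeOps(localId: string, initialNodeId?: string): KindImageOps {
  let nodeId = initialNodeId;
  return {
    localImageId: () => localId,
    nodeImageId: () => nodeId,
    importIntoNode: () => { nodeId = localId; },
  };
}

// Same tag, different ID: the stale-:dev case that masked the regression.
console.log(ensureNodeImage("kars-inference-router:dev", fakeOps("sha256:new", "sha256:old"))); // reimported
console.log(ensureNodeImage("kars-inference-router:dev", fakeOps("sha256:new", "sha256:new"))); // already-current
```

A tag-only presence check would have reported the first case as loaded, which is how a stale `:dev` image could stand in for the distroless build.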