Fix external detection job setup for codex and pi workflows#40954
Conversation
…etection feature Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…detection feature Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
✅ Test Quality Sentinel completed test quality analysis. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #40954 does not have the 'implementation' label and has 68 new lines in business logic directories (≤100 threshold). |
|
✅ PR Code Quality Reviewer completed the code quality review. |
There was a problem hiding this comment.
Pull request overview
Fixes a regression in gh-aw workflow compilation where the threat detection job could skip pre-pulling required AWF container images when using engine: codex with the external detector path enabled (features: gh-aw-detection: true), leading to runtime failures when AWF starts with --pull never.
Changes:
- Adjust detection job step generation to always emit the AWF image download step for the external detector path, while keeping the inline Codex path de-duplicated.
- Add regression tests covering Codex external detector behavior (and validating download-step de-duplication expectations).
- Recompile generated workflow lock files so detection jobs include the
Download container imagesstep before running detection.
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/threat_detection.go | Updates detection job step generation to include AWF image pre-pull for Codex when using the external detector path. |
| pkg/workflow/threat_detection_test.go | Adds regression coverage for Codex + external detector image download behavior. |
| .github/workflows/smoke-codex.lock.yml | Recompiled lockfile includes Download container images step in the detection job. |
| .github/workflows/smoke-claude-on-copilot.lock.yml | Recompiled lockfile updates detection job gating and generation output. |
| .github/workflows/smoke-call-workflow.lock.yml | Recompiled lockfile includes Download container images step in the detection job. |
| .github/workflows/daily-observability-report.lock.yml | Recompiled lockfile includes Download container images step in the detection job. |
| .github/workflows/daily-fact.lock.yml | Recompiled lockfile includes Download container images step in the detection job. |
| .github/workflows/daily-cache-strategy-analyzer.lock.yml | Recompiled lockfile includes Download container images step in the detection job. |
| .github/workflows/changeset.lock.yml | Recompiled lockfile includes Download container images step in the detection job. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 9/9 changed files
- Comments generated: 2
| if !strings.Contains(joined, "Download container images") { | ||
| t.Errorf("expected 'Download container images' step in codex external detector detection job steps\ngot:\n%s", joined) | ||
| } | ||
| if !strings.Contains(joined, "download_docker_images.sh") { | ||
| t.Errorf("expected 'download_docker_images.sh' in detection job steps\ngot:\n%s", joined) | ||
| } |
| t.Run("codex without gh-aw-detection emits exactly one container download (inline path via MCP setup)", func(t *testing.T) { | ||
| data := &WorkflowData{ | ||
| AI: "codex", | ||
| SafeOutputs: &SafeOutputsConfig{ | ||
| ThreatDetection: &ThreatDetectionConfig{}, | ||
| }, | ||
| Features: map[string]any{}, | ||
| SandboxConfig: &SandboxConfig{ | ||
| Agent: &AgentSandboxConfig{ | ||
| Type: SandboxTypeAWF, | ||
| }, | ||
| }, | ||
| } | ||
|
|
||
| steps := compiler.buildDetectionJobSteps(data) | ||
| joined := strings.Join(steps, "") | ||
|
|
||
| // For the inline codex path, MCP setup generation (inside buildDetectionEngineExecutionStep) | ||
| // emits the "Download container images" step exactly once. buildPullAWFContainersStep must | ||
| // NOT also emit it, or the step would appear twice and trip duplicate-step validation. | ||
| downloadCount := strings.Count(joined, "Download container images") | ||
| if downloadCount != 1 { | ||
| t.Errorf("expected exactly one 'Download container images' step for inline codex path, got %d\n%s", downloadCount, joined) | ||
| } | ||
| }) |
🧪 Test Quality Sentinel Report✅ Test Quality Score: 90/100 — Excellent
📊 Metrics & Test Classification (2 tests analyzed)
Go: 1 ( Notes:
Verdict
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /diagnose and /tdd — approving with two minor suggestions.
📋 Key Themes & Highlights
Key Themes
- Env-var test hermicity: the new inline-codex subtest doesn't guard against
GH_AW_FEATURES=gh-aw-detectionin the environment, which could cause spurious failures (see inline comment) - Minor duplication:
usingExternalDetectoris introduced at line 289 butisFeatureEnabledis called again raw at line 311 — reusing the variable would make the symmetry clearer
Positive Highlights
- ✅ Minimal, surgical fix — exactly two logic lines changed in production code
- ✅ Root cause properly diagnosed and documented: MCP setup is only called for the inline path, not the external detector path
- ✅ Both regression paths are covered: external detector gets the step, inline path emits it exactly once (no duplicate)
- ✅ Code comments in
buildDetectionJobStepsnow clearly describe why each path behaves differently - ✅ Excellent PR description with before/after snippet and full blast-radius accounting
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 77.5 AIC · ⌖ 6.73 AIC · ⊞ 6.5K
Comments that could not be inline-anchored
pkg/workflow/threat_detection_test.go:1878
[/tdd] Missing environment guard: isFeatureEnabled checks the GH_AW_FEATURES env var in addition to frontmatter, so a runner with GH_AW_FEATURES=gh-aw-detection set will make usingExternalDetector true here, causing downloadCount > 1 and a spurious failure. The established pattern in this codebase is to guard with t.Setenv (see compiler_validators_test.go:140).
<details>
<summary>💡 Suggested fix</summary>
Add at the top of this sub-test:
t.Setenv("GH_AW_FEATURES", ""…
</details>
<details><summary>pkg/workflow/threat_detection.go:311</summary>
**[/diagnose]** `isFeatureEnabled(constants.GHAWDetectionFeatureFlag, data)` is called twice in this function — once at line 289 (stored in `usingExternalDetector`) and again here. Reusing the variable would be both more efficient and more expressive.
<details>
<summary>💡 Suggested change</summary>
```go
if usingExternalDetector {
// External detector path (features: gh-aw-detection: true)This also makes it easier to see at a glance that both branches of the step-selection logic ar…
There was a problem hiding this comment.
REQUEST_CHANGES — fix is logically correct, tests need hardening
The root cause analysis is accurate and the one-line logic change is the right fix. The lock file blast (6 workflows) is also correct — all affected workflows use engine: id: codex + gh-aw-detection: true and were silently broken before.
Blocking issues
Three medium issues prevent merge:
-
usingExternalDetectornot reused at line 311 —isFeatureEnabledis called twice for the same flag in the same function. The variable introduced at line 289 should be reused. -
External detector test: substring-only check, no count assertion — the test verifies the step exists but not that it appears exactly once. The project's own precedent (line 874) uses
strings.Countfor this step name. A duplicate emission would pass the current assertion undetected, which is exactly the class of bug the inline-path test was written to prevent. -
Engine override path not covered —
getThreatDetectionEngineIDcan return"codex"even whendata.AI != "codex"(viaSafeOutputs.ThreatDetection.EngineConfig.ID). No test exercises this path.
Non-blocking observation
smoke-claude-on-copilot.lock.yml detection job if condition changed (not mentioned in the PR description) — the compiled lock file dropped the (needs.agent.outputs.output_types != '' || needs.agent.outputs.has_patch == 'true') guard, leaving only always() && needs.agent.result != 'skipped'. This is correct per the compiler's own design (see lines 1388-1394 in threat_detection.go): the guard was stale and the detection_guard step handles the no-output case internally. But the PR description should mention this incidental behavioral change to avoid reviewer confusion.
🔎 Code quality review by PR Code Quality Reviewer · 98 AIC · ⌖ 7.43 AIC · ⊞ 5.1K
Comments that could not be inline-anchored
pkg/workflow/threat_detection.go:311
usingExternalDetector is computed at line 289 but not reused here — isFeatureEnabled is called a second time for the same flag in the same function pass, creating a silent inconsistency risk.
<details>
<summary>💡 Suggested fix</summary>
// Before (line 311)
if isFeatureEnabled(constants.GHAWDetectionFeatureFlag, data) {
// After — reuse the variable already computed above
if usingExternalDetector {The variable was introduced specifically for this purpose; not reusing it …
pkg/workflow/threat_detection_test.go:1866
External detector test only checks substring presence, not step count — for a fix specifically about preventing duplicate/missing steps, this assertion is too weak.
<details>
<summary>💡 Suggested fix</summary>
The project's own precedent (see line 874 in this file) uses strings.Count for this exact step name. The external detector subtest should do the same:
downloadCount := strings.Count(joined, "Download container images")
if downloadCount != 1 {
t.Errorf("expected exactly…
</details>
<details><summary>pkg/workflow/threat_detection_test.go:1838</summary>
**Tests only exercise the `data.AI = "codex"` path; the engine override path is untested** — `getThreatDetectionEngineID` prefers `SafeOutputs.ThreatDetection.EngineConfig.ID` over `data.AI`, so a workflow with main AI != "codex" but threat-detection engine override set to "codex" takes the same code path and is not covered.
<details>
<summary>💡 Why this matters</summary>
The fix touches the exact condition that calls `getThreatDetectionEngineID`. Any workflow that uses a non-codex main engi…
</details>|
✅ All tools validated successfully! Agent Container Smoke Test confirms agent container is ready. |
|
🚀 Smoke Antigravity MISSION COMPLETE! Antigravity has spoken. ✨ |
|
🚀 Smoke Pi MISSION COMPLETE! Pi delivered. 🥧 Caution agentic threat detected DetailsThe threat detection engine failed to produce results. Review the workflow run logs for details. |
|
🎬 THE END — Smoke Claude MISSION: ACCOMPLISHED! The hero saves the day! ✨ |
|
✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟 Caution agentic threat detected DetailsThe threat detection engine failed to produce results. Review the workflow run logs for details. |
|
🚀 Smoke Pi MISSION COMPLETE! Pi delivered. 🥧 |
|
✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟 Caution agentic threat detected DetailsThe threat detection engine failed to produce results. Review the workflow run logs for details. |
|
🎬 THE END — Smoke Claude MISSION: ACCOMPLISHED! The hero saves the day! ✨ |
|
🚀 Smoke Antigravity MISSION COMPLETE! Antigravity has spoken. ✨ |
|
🚀 Smoke Gemini MISSION COMPLETE! Gemini has spoken. ✨ |
|
✅ All tools validated successfully! Agent Container Smoke Test confirms agent container is ready. |
|
📰 BREAKING: Smoke Copilot is now investigating this pull request. Sources say the story is developing... |
|
📰 BREAKING: Smoke Copilot - AOAI (apikey) is now investigating this pull request. Sources say the story is developing... |
|
📰 BREAKING: Smoke Copilot - AOAI (Entra) is now investigating this pull request. Sources say the story is developing... |
Agent Container Tool Check
Result: 12/12 tools available ✅ Overall status: PASS
|
Warning Firewall blocked 1 domainThe following domain was blocked by the firewall during workflow execution:
network:
allowed:
- defaults
- "localhost"See Network Configuration for more information.
|
|
Caution agentic threat detected DetailsThe threat detection engine failed to produce results. Review the workflow run logs for details. Comment MemoryNote This comment is managed by comment memory.It stores persistent context for this thread in the code block at the top of this comment. Warning Firewall blocked 6 domainsThe following domains were blocked by the firewall during workflow execution:
network:
allowed:
- defaults
- "accounts.google.com"
- "android.clients.google.com"
- "clients2.google.com"
- "contentautofill.googleapis.com"
- "safebrowsingohttpgateway.googleapis.com"
- "www.google.com"See Network Configuration for more information.
|
🤖 Smoke Test: Claude — Run 28033618374Core #1-12: ✅ all passed Overall: PARTIAL (all executed tests passed, 2 skipped) 🎉 Warning Firewall blocked 6 domainsThe following domains were blocked by the firewall during workflow execution:
network:
allowed:
- defaults
- "accounts.google.com"
- "android.clients.google.com"
- "clients2.google.com"
- "contentautofill.googleapis.com"
- "safebrowsingohttpgateway.googleapis.com"
- "www.google.com"See Network Configuration for more information.
|
There was a problem hiding this comment.
💥 Automated smoke test review - all systems nominal!
Warning
Firewall blocked 6 domains
The following domains were blocked by the firewall during workflow execution:
accounts.google.comandroid.clients.google.comclients2.google.comcontentautofill.googleapis.comsafebrowsingohttpgateway.googleapis.comwww.google.com
To allow these domains, add them to the
network.allowedlist in your workflow frontmatter:
network:
allowed:
- defaults
- "accounts.google.com"
- "android.clients.google.com"
- "clients2.google.com"
- "contentautofill.googleapis.com"
- "safebrowsingohttpgateway.googleapis.com"
- "www.google.com"See Network Configuration for more information.
💥 [THE END] — Illustrated by Smoke Claude · 70.9 AIC · ⌖ 32.5 AIC · ⊞ 1K
| @@ -0,0 +1,5 @@ | |||
| --- | |||
| "gh-aw": patch | |||
There was a problem hiding this comment.
Nice, scoping this as a patch change is appropriate for a bug fix. 👍
| "gh-aw": patch | ||
| --- | ||
|
|
||
| Fixed codex threat-detection workflows so the AWF container download step is still emitted when `gh-aw-detection` is enabled. |
There was a problem hiding this comment.
Clear changeset summary describing the AWF container download fix. Consider linking the related issue for traceability.
There was a problem hiding this comment.
Smoke review done. Tiny notes inline.
Warning
Firewall blocked 6 domains
The following domains were blocked by the firewall during workflow execution:
accounts.google.comandroid.clients.google.comclients2.google.comcontentautofill.googleapis.comsafebrowsingohttpgateway.googleapis.comwww.google.com
To allow these domains, add them to the
network.allowedlist in your workflow frontmatter:
network:
allowed:
- defaults
- "accounts.google.com"
- "android.clients.google.com"
- "clients2.google.com"
- "contentautofill.googleapis.com"
- "safebrowsingohttpgateway.googleapis.com"
- "www.google.com"See Network Configuration for more information.
📰 BREAKING: Report filed by Smoke Copilot · 352.3 AIC · ⌖ 13.6 AIC · ⊞ 19.2K
When
features: gh-aw-detection: trueandengine: codex, the detection job skipped pulling AWF container images, causingdocker compose up -d --pull neverto fail with "No such image" forsquidandapi-proxy.This PR also fixes the
engine: piexternal detector path to use the Copilot agentic engine for the detection job, sincethreat-detectdoes not support invoking Pi directly.Additionally, the external Codex detector path now prepares the minimal MCP/Codex config files that
threat-detect --engine codexexpects, so Codex no longer exits withCODEX_HOME points to "/tmp/gh-aw/mcp-config", but that path does not exist.Root cause
buildDetectionJobStepsunconditionally skippedbuildPullAWFContainersStepfor codex, assuming MCP setup would emit the download step. That assumption only holds for the inline path —buildDetectionEngineExecutionStep(which callsgenerateMCPSetup) is never invoked on the external detector path (gh-aw-detection: true).Separately, the external detector path reused the main workflow engine selection as-is, so Pi workflows compiled detection jobs that invoked
threat-detect --engine pi, even though the external detector only supports the built-in gh-aw agentic engines.For Codex, the external detector path also skipped the inline Codex MCP/config bootstrap entirely, so the detection job invoked
threat-detect --engine codexwithout the writableCODEX_HOMEdirectory and config files that Codex expects.Changes
pkg/workflow/threat_detection.go— guard the codex skip on external detector being disabled:pkg/workflow/threat_detection.go— add external-detector engine resolution so Pi workflows fall back to Copilot for the detection job, and reuse threat-detection engine config only when it still matches the external detector engine.pkg/workflow/threat_detection.go— add a Codex-only external-detector setup step that creates an empty MCP servers config plus writableconfig.tomlfiles forCODEX_HOME, sothreat-detect --engine codexhas the minimal config bootstrap it expects.pkg/workflow/threat_detection_test.go— regression test covering both sub-cases: codex +gh-aw-detection: true(download step present) and codex inline path (exactly one download step from MCP setup, no duplicate).pkg/workflow/threat_detection_isolation_test.go— add regression coverage for:engine: pi+gh-aw-detection: true, asserting the compiled detection job installs Copilot, invokesthreat-detect --engine copilot, and does not install or invoke Pi.engine: codex+gh-aw-detection: true, asserting the compiled detection job prepares the empty MCP servers config and writable Codex config files before external detection runs.Recompiled affected Pi workflow lock files — Pi workflows that use
gh-aw-detection: true(for examplesmoke-pi.lock.yml) now compile detection jobs that install the Copilot CLI and runthreat-detect --engine copilot, while the codex lock files continue to include theDownload container imagesstep before AWF starts.✨ PR Review Safe Output Test - Run 28030295088
Warning
Firewall blocked 6 domains
The following domains were blocked by the firewall during workflow execution:
accounts.google.comandroid.clients.google.comclients2.google.comcontentautofill.googleapis.comsafebrowsingohttpgateway.googleapis.comwww.google.comSee Network Configuration for more information.
✨ PR Review Safe Output Test - Run 28033618374
Warning
Firewall blocked 6 domains
The following domains were blocked by the firewall during workflow execution:
accounts.google.comandroid.clients.google.comclients2.google.comcontentautofill.googleapis.comsafebrowsingohttpgateway.googleapis.comwww.google.comSee Network Configuration for more information.