Describe the feature or problem you'd like to solve
The CLI exits 0 in CI even when the agent had zero tools available at startup or terminated before producing any output. In Actions / CI consumers (Agentic Workflows in our case) the exit code is the contract; a green exit on a no-tools run looks like success and downstream if: success() jobs run as if the agent worked.
Proposed solution
Make copilot exit non-zero in this failure mode so CI consumers can trust the exit code. Two shapes work for us, pick whichever fits the CLI's design:
CI=true-conditional behavior: when CI=true, exit non-zero if any configured MCP server fails to start. Matches the convention npm, yarn, jest already follow for stricter machine-readable defaults; preserves the interactive UX.
- Explicit flag (e.g.
--strict-mcp-start): clearer opt-in, easier to roll out without changing existing behavior.
Either way, the benefit is that any CI consumer (not just AW) can rely on the exit code as the primary signal instead of parsing logs after the fact.
Example prompts or workflows
- MCP allowlist enforcement fails closed in a sandboxed runner. v1.0.22 enabled enforcement that calls
https://api.github.com/copilot/mcp_registry directly at startup. In Agentic Workflows the agent container has no auth token (all api.github.com access flows through an api-proxy sidecar) and api.github.com isn't on the egress allowlist, so the request is blocked, the CLI fails closed on every MCP server, and the agent runs with an empty tool surface. v1.0.22 still exited 0, the Actions job came back green, and we didn't detect the regression — a customer did (gh-aw#25550, gh-aw#25680).
- Broader shape of the same ask: any configured MCP server failing to start (config error, network blip, server-side outage, partial allowlist mismatch). We haven't been bitten by a partial-MCP-failure scenario yet, but losing one MCP server can degrade the agent's tool surface in subtle ways, and we'd rather have a non-zero exit than silently run with fewer tools.
Additional context
- Customer reports from the v1.0.22 incident: gh-aw#25550, gh-aw#25680
- Customer-facing run showing the silent-success pattern:
devantler-tech/ksail run 24270819213
- Kusto fingerprint we used after the fact: skipped
detection jobs in AW (whose if: checks the agent job's result) jumped from a baseline of 8 to 40 per day to 2,569 on 04-10. Distinct repos jumped from 10 to 20 per day to ~438. That signal was only visible in our telemetry, not in the exit code.
Describe the feature or problem you'd like to solve
The CLI exits 0 in CI even when the agent had zero tools available at startup or terminated before producing any output. In Actions / CI consumers (Agentic Workflows in our case) the exit code is the contract; a green exit on a no-tools run looks like success and downstream
if: success()jobs run as if the agent worked.Proposed solution
Make
copilotexit non-zero in this failure mode so CI consumers can trust the exit code. Two shapes work for us, pick whichever fits the CLI's design:CI=true-conditional behavior: whenCI=true, exit non-zero if any configured MCP server fails to start. Matches the conventionnpm,yarn,jestalready follow for stricter machine-readable defaults; preserves the interactive UX.--strict-mcp-start): clearer opt-in, easier to roll out without changing existing behavior.Either way, the benefit is that any CI consumer (not just AW) can rely on the exit code as the primary signal instead of parsing logs after the fact.
Example prompts or workflows
https://api.github.com/copilot/mcp_registrydirectly at startup. In Agentic Workflows the agent container has no auth token (allapi.github.comaccess flows through an api-proxy sidecar) andapi.github.comisn't on the egress allowlist, so the request is blocked, the CLI fails closed on every MCP server, and the agent runs with an empty tool surface. v1.0.22 still exited 0, the Actions job came back green, and we didn't detect the regression — a customer did (gh-aw#25550, gh-aw#25680).Additional context
devantler-tech/ksailrun 24270819213detectionjobs in AW (whoseif:checks the agent job's result) jumped from a baseline of 8 to 40 per day to 2,569 on 04-10. Distinct repos jumped from 10 to 20 per day to ~438. That signal was only visible in our telemetry, not in the exit code.