[audit-workflows] Daily Agentic Workflow Audit — 2026-06-02 (afternoon window, 89.4% success, no 429) #36500

2026-06-02T17:39:52Z

github-actions[bot]
Bot Jun 2, 2026

Daily Agentic Workflow Audit — 2026-06-02 (afternoon window)

This is a second, afternoon re-audit of 2026-06-02 covering completed runs from 14:31Z–17:23Z, complementing this morning's full-day audit (#36398, 23:42Z 06-01 → 04:15Z 06-02). Of the 24h log set, 47 runs completed with full summaries in this window (plus 2 in-progress, including this audit).

Headline: a healthier slice. Success recovered to 89.4% (42/47), back in the stable ~89–91% band, and the dominant token-budget 429 mode was absent this window: the heaviest run (UK AI Operational Resilience, 23.79M effective tokens) finished under the 25M cap, with about 1.2M (roughly 5%) of headroom. The 5 failures are spread across 5 distinct classes, with 3 of them infra/experimental/non-agent rather than a single recurring defect.

Summary

Metric	Value	Note
Completed runs	47 (+2 in-progress)	afternoon window only
Success rate	89.4% (42/47)	▲ from morning's 84.2%
Failures	5 (5 distinct classes)	no single dominant mode
Tokens / Effective	15.6M / 144.8M	partial window
Cost (claude-measured)	$6.27	copilot/codex EstimatedCost=0
Turns	345	—
Firewall blocked	21.2% (586/2768)	smoke/PR-reviewer hotspots
Missing tools / data / MCP failures	0 / 0 / 0	clean
Engines	copilot 40, claude 6, codex 1	—
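The two headline percentages in the table can be rechecked from the raw counts. A minimal sketch, assuming one-decimal rounding as used in the table; the helper name `pct` is illustrative and not part of the audit tooling:

```python
def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, rounded to one decimal."""
    return round(100 * numerator / denominator, 1)

# Counts taken from the Summary table above.
success_rate = pct(42, 47)             # successful runs / completed runs
firewall_block_rate = pct(586, 2768)   # blocked requests / total requests

print(f"success: {success_rate}%  firewall blocked: {firewall_block_rate}%")
```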

Critical Issues

🟢 token-budget-429 did NOT recur (was the morning's top class, #35661). Neither signature ("Maximum effective tokens exceeded" or isMaxEffectiveTokensExceededError=true) appears anywhere in this window. Heaviest: UK AI Operational Resilience at 23.79M, under the 25M cap (SUCCESS). This is the second window-level reprieve, but the morning window had two over-cap failures, so heavy daily-aggregation workflows still hit the cap intermittently. The fix (fail fast on the signature and chunk heavy workflows) stays warranted.
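A fail-fast check on the signature could look roughly like the following. This is a sketch under assumptions: the two signature strings are the ones quoted in this audit, but the exact log format, the tolerance for whitespace around `=`, and the function names are hypothetical.

```python
import re

# Signatures quoted in the audit. How they appear in real logs is an assumption.
TOKEN_BUDGET_SIGNATURES = (
    re.compile(r"Maximum effective tokens exceeded"),
    re.compile(r"isMaxEffectiveTokensExceededError\s*=\s*true"),
)

def is_token_budget_error(log_line: str) -> bool:
    """True if the line matches either token-budget signature."""
    return any(p.search(log_line) for p in TOKEN_BUDGET_SIGNATURES)

def first_token_budget_hit(log_lines):
    """Return the index of the first matching line, or None if there is none."""
    for index, line in enumerate(log_lines):
        if is_token_budget_error(line):
            return index
    return None

# Hypothetical log excerpt, for illustration only.
sample = [
    "turn 12: tool call ok",
    "error: Maximum effective tokens exceeded",
    "turn 13: retrying",
]
print(first_token_budget_hit(sample))
```

A caller could stop the run at the first hit instead of letting the agent keep retrying against an exhausted budget.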

🟠 safe-output partial-failure-intolerance persists — new "count-exceeded" variant. CLI Consistency Checker 26826617790 (copilot) emitted 3× create_issue against a cap of 1 → 2 validation errors (Too many items of type 'create_issue'. Maximum allowed: 1) → agent step exit 1 → run failure, even though the safe_outputs job succeeded and created the 1 allowed issue. This joins the existing variants (target=*, target=triggering, no-number, transient ERR_API). The whole family shares one root: a safe-output validation problem red-fails the run despite useful output landing.
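One way to tolerate the count-exceeded case is to keep the first N items of each capped type and report the overflow as a warning. The sketch below assumes a hypothetical item shape (a dict with a "type" key) and a hypothetical `apply_caps` helper; it is not the actual safe-output validator.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class CapResult:
    kept: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

def apply_caps(items, caps):
    """Keep the first N items of each capped type and warn about the rest.

    items: list of dicts with a "type" key (hypothetical shape).
    caps:  mapping of type name to maximum allowed count; other types are uncapped.
    """
    seen = defaultdict(int)
    dropped = defaultdict(int)
    result = CapResult()
    for item in items:
        kind = item["type"]
        seen[kind] += 1
        limit = caps.get(kind)
        if limit is not None and seen[kind] > limit:
            dropped[kind] += 1  # over the cap: drop, do not fail
            continue
        result.kept.append(item)
    for kind, count in dropped.items():
        result.warnings.append(
            f"Dropped {count} extra item(s) of type '{kind}' "
            f"(maximum allowed: {caps[kind]})"
        )
    return result

# Mirrors the reported case: 3 create_issue items against a cap of 1.
emitted = [{"type": "create_issue", "title": f"issue {n}"} for n in range(3)]
outcome = apply_caps(emitted, {"create_issue": 1})
print(len(outcome.kept), outcome.warnings)
```

With this behavior the run in question would have kept its one allowed issue and surfaced a single warning rather than exiting 1.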

Other Failures (infra / experimental / non-agent)

3 infra/experimental failures + 1 super-linter job

🆕 runner-infra · node missing (exit 127) — Daily Issues Report Generator 26831132205 (copilot, schedule). The copilot_harness guard fired: "node runtime missing on this runner — check runtimes.node in workflow YAML; exit 127". A pre-agent infra failure (runner lacked a usable node binary), not agent logic. Single occurrence — monitor for recurrence pointing at runtimes.node / runner image.
🆕 agent-timeout · no output — Avenger 26831111185 (claude/opus-4-8, schedule) ran 48.4 min (15:47:48→16:36:13Z) actively calling the opus API and dispatching Bash + TaskOutput, then failed with Turns=0, ErrorCount=0, empty output. Looks like the agent-step time limit hit while still working (possibly a stuck/blocking TaskOutput poll). Single occurrence — if it recurs, cap turns/runtime.
experimental copilot-sdk · session.idle timeout — PR Code Quality Reviewer 26827796547 (copilot) on experimental branch copilot/ab-advisor-experiment-campaign-max-turns hit Timeout after 600000ms waiting for session.idle (10 min hard, no output, not retried). A timeout variant of the experimental copilot-sdk family from this morning's auth-failure class — still dev-iteration, not a production regression. Recommend gating these experimental workflows from scheduled/PR triggers until the SDK session lifecycle is stable.
super-linter job (non-agent) — Super Linter Report 26829252826 (copilot, schedule): the super_linter job failed and the agent step was skipped. Non-agentic lint-job failure; low audit relevance.
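For the node-missing failure, a setup-time check could surface the problem before the agent step starts. A minimal sketch, assuming the runtime is expected on PATH under the name `node`; the function name and message wording are hypothetical, and where such a check would live in the workflow is not specified by the audit.

```python
import shutil
import subprocess

def check_node(executable: str = "node"):
    """Return (ok, message) describing whether the runtime is usable."""
    path = shutil.which(executable)
    if path is None:
        return False, (
            f"{executable} not found on PATH; "
            "check runtimes.node in the workflow YAML"
        )
    try:
        # `node --version` prints the version string and exits 0 when healthy.
        completed = subprocess.run(
            [path, "--version"],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (OSError, subprocess.SubprocessError) as exc:
        return False, f"{executable} found at {path} but failed to run: {exc}"
    return True, f"{executable} {completed.stdout.strip()} at {path}"

ok, message = check_node()
print(ok, message)
```

Failing here with an actionable message would replace the mid-run exit 127 with a setup-stage error.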

Trends (30-day available history)

Workflow health has held a stable 85–96% success band across the 30-day history, with two visible dips: 05-23's outlier 41.6% (a large batch-failure day) and a softer 81% trough on 05-28. The recent days (91.1% on 05-30, 89.8% on 05-31, a 96.2% peak on 06-01, 84.2% for the 06-02 full day) show normal day-to-day variance rather than a regression; this afternoon's 89.4% slice sits right in the healthy band.

Token usage swings between ~15M and ~69M tokens/day, with the 7-day moving average holding around 40–45M. The 05-31 spike to 68.8M (a heavy refactor/aggregation day) is the band's ceiling; daily totals have eased since, and the afternoon slice (15.6M partial) is consistent with a lighter back-half of the day.

Firewall

Blocked 21.2% (586/2768) — by-design smoke/probe and PR-reviewer telemetry. Hotspots: CLI Consistency Checker 71/303 (23%), PR Code Quality Reviewer 64/254 (25%), UK AI Operational Resilience 43/183 (23%). No firewall block caused any of the 5 run failures.
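The hotspot percentages can be reproduced and ranked from the blocked/total counts quoted above. A small sketch; the tuple layout and variable names are illustrative only:

```python
# (workflow, blocked requests, total requests), as quoted in the Firewall section.
hotspots = [
    ("CLI Consistency Checker", 71, 303),
    ("PR Code Quality Reviewer", 64, 254),
    ("UK AI Operational Resilience", 43, 183),
]

def block_rate(blocked: int, total: int) -> float:
    """Blocked share of requests as a percentage."""
    return 100 * blocked / total

# Sort from highest to lowest block rate.
ranked = sorted(hotspots, key=lambda row: block_rate(row[1], row[2]), reverse=True)

for name, blocked, total in ranked:
    print(f"{name}: {blocked}/{total} = {block_rate(blocked, total):.0f}%")
```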

Recommendations

(High) Extend safe-output tolerance to the count-exceeded case: when a capped type is over-emitted, keep the first N allowed items and emit a warning instead of failing the agent step. Pairs with the existing partial-failure-tolerance fix.
(High, tracked in #35661, "[aw-failures] Token-budget exhaustion (25M effective-tokens cap) recurring across 6+ scheduled workflows", 2026-05-29 02:00–07:32 UTC) Keep the token-budget-429 fail-fast + heavy-workflow chunking work open: the cap is intermittent, not resolved.
(Medium) Gate the experimental copilot-sdk / ab-advisor max-turns workflows from scheduled/PR triggers until the SDK session lifecycle (auth + idle) is stable, so dev-iteration noise stops polluting failure metrics.
(Medium) Add a pre-flight node check in activation so a missing node runtime fails at setup with an actionable message rather than a mid-agent exit 127.
(Low) Watch Avenger (opus) for repeat ~48-min no-output timeouts; cap turns/runtime if it recurs.

References:

§26831651673 — DeepReport Intelligence (day's cost top, $3.83)
§26826617790 — CLI Consistency Checker (safe-output count-exceeded)
§26831132205 — Daily Issues Report Generator (node-missing exit 127)

Generated by 🔍 Agentic Workflow Audit Agent · opus48 2.7M · ◷

expires on Jun 3, 2026, 5:39 PM UTC

2026-06-02T19:48:40Z

github-actions[bot]
Bot Jun 2, 2026
Author

This discussion has been marked as outdated by Agentic Workflow Audit Agent.

A newer discussion is available at Discussion #36517.
