Summary
The weekly compliance audit (scripts/compliance-audit.sh) files findings based on single GitHub API reads at scan time. When a read returns a transient error (HTTP 404/403 on a resource that actually exists and is compliant), the audit records a false-positive finding and opens a dev-lead issue for it. The dev-lead agent then spends a full run investigating, only to confirm there was nothing to fix.
This surfaced concretely while re-triggering the stale-issue backlog (see #431): 2 of 5 re-triggered findings were false positives caused by transient read failures.
Evidence
Both findings were re-triggered on 2026-06-10 and picked up by dev-lead, which verified the underlying state was already compliant:
| Issue | Finding | What the agent found |
| --- | --- | --- |
| broodly#99 | unpinned-actions-agent-shield.yml | agent-shield.yml was already SHA-pinned on main (376a4fcb… # v2, merged 2026-06-08). The audit detected an earlier @v1 state. Agent opened no PR (correct no-op). |
| .github-private#61 | codeowners-org-leads-not-first | Agent's own note: "the compliance audit received HTTP 404 when trying to read .github/CODEOWNERS via the GitHub API at scan time. The file already existed and was compliant." Resulted in defensive PR #552. |
Impact
- Wasted dev-lead runs (agent time + tokens) investigating non-issues.
- Noise in the fleet's open-issue list and the re-trigger sweep backlog.
- Erodes trust in compliance findings — false positives make real findings easier to ignore.
Root Cause
The audit makes single-attempt API reads with no retry or confirmation. A transient 404/403/5xx (eventual consistency, rate limiting, brief unavailability) is treated as an authoritative "resource missing / non-compliant" result.
Recommended Actions
- Retry transient reads in scripts/compliance-audit.sh: wrap resource reads (CODEOWNERS, workflow files, settings) in a small bounded retry with backoff; only treat a resource as missing after N consistent failures.
- Distinguish "unreadable" from "non-compliant": a read that errors should not map to a compliance failure. Emit a separate audit-error/inconclusive outcome (logged, not filed as a dev-lead issue) so transient errors never become findings.
- Re-confirm before filing: for any negative finding, do one confirming re-read before creating/updating the issue.
- (Optional) Auto-close on non-reproduction: the audit already has a "resolved/removed" pass; ensure a finding that no longer reproduces closes its issue promptly so stale false positives self-clean.
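
The first two actions could be sketched roughly as below. This is a minimal illustration and not the audit's actual code: the function names `fetch_status` and `read_with_retry`, the `AUDIT_MAX_ATTEMPTS` and `AUDIT_RETRY_DELAY` variables, and the use of `curl` against the repository contents endpoint are all assumptions made for the example. It reports a resource as missing only when every attempt returns 404, and reports any mix involving 403, 5xx, or a network failure as inconclusive.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and environment variables are hypothetical,
# not taken from scripts/compliance-audit.sh.

# fetch_status OWNER/REPO PATH
# Prints the HTTP status code of a contents read (curl prints 000 on
# connection failure). Assumes GITHUB_TOKEN is set in the environment.
fetch_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer ${GITHUB_TOKEN:-}" \
    -H "Accept: application/vnd.github+json" \
    "https://api.github.com/repos/$1/contents/$2"
}

# read_with_retry CMD [ARGS...]
# CMD must print an HTTP status code on stdout.
# Prints exactly one of: present | missing | inconclusive
read_with_retry() {
  local max_attempts="${AUDIT_MAX_ATTEMPTS:-3}"
  local delay="${AUDIT_RETRY_DELAY:-2}"
  local attempt=1 status missing_count=0

  while [ "$attempt" -le "$max_attempts" ]; do
    # A failing command (for example, no network) counts as unreadable.
    status="$("$@" 2>/dev/null)" || status="000"

    case "$status" in
      200) echo "present"; return 0 ;;            # any success is authoritative
      404) missing_count=$((missing_count + 1)) ;; # possibly missing, keep checking
      *)   ;;                                      # 403, 5xx, 000: unreadable
    esac

    # Back off before the next attempt, doubling the delay each time.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
      delay=$((delay * 2))
    fi
    attempt=$((attempt + 1))
  done

  # Only N consistent 404s count as "missing"; anything else is an audit error.
  if [ "$missing_count" -eq "$max_attempts" ]; then
    echo "missing"
  else
    echo "inconclusive"
  fi
}

# Example call (hypothetical repository name):
#   outcome="$(read_with_retry fetch_status my-org/my-repo .github/CODEOWNERS)"
```

Under this sketch, the caller would file or update a dev-lead issue only for a `missing` outcome (or a `present` resource whose content fails the check), and would log `inconclusive` as an audit error without filing anything.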
Context
Discovered during the #431 re-trigger work (PR #432). Part of the Compliance program initiative (GH Project #1 Initiatives).