Skip to content

Evergreen: skip agent job when no PR needs attention (stop creating false failure issues) #160

@mrjf

Description

@mrjf

Problem

Evergreen runs every 5 minutes. When no open PR is sick, the pre-step correctly writes {"selected": null, "reason": "no_prs_need_attention"} to /tmp/gh-aw/evergreen.json, but the agent job still runs. The agent correctly reports nothing to do:

Found 1 open PR(s)
No PRs need attention. Nothing to do.
...
Agent: No PRs need attention — `selected` is `null`, meaning all PRs are currently healthy.

...but it responds with prose instead of calling the noop safe-output tool. gh-aw's default policy treats "agent succeeded with zero safe outputs" as a failure and files a GitHub issue (see #156 from run 24692765726). The alarm is false — Evergreen did its job.

At every-5-min cadence, whenever the repo has no sick PRs, each run files a new failure issue. Noisy and wrong.

Fix — short-circuit the agent when there's nothing to do

Have the pre-step emit a step output, and guard the agent job (and any dependent jobs) on it. When selected is null, the agent doesn't run at all — no prose, no empty-outputs failure, no issue filed. Bonus: saves a Copilot invocation every 5 min when idle.

Change to .github/workflows/evergreen.md

In the "Find a PR that needs attention" step, after writing evergreen.json, expose selected as a step output:

# at the end of the Python block, after writing evergreen.json
with open(os.environ["GITHUB_OUTPUT"], "a") as f:
    f.write(f"selected={'' if selected is None else selected['pr_number']}\n")

And give the step an id:

steps:
  - name: Find a PR that needs attention
    id: find-pr
    env:
      ...

Then guard any downstream steps/jobs in the workflow on that output. In the source .md this means adding an if: to whatever step invokes the agent, e.g.:

  - name: Run agent
    if: steps.find-pr.outputs.selected != ''
    ...

(gh-aw will propagate this into evergreen.lock.yml on recompile. Alternatively, if gh-aw doesn't support per-step if in the frontmatter-authored section, express it via the job-level if after recompile, or add a gate step that calls exit 0 early when there's nothing to do.)

Acceptance

  • When no PR needs attention: agent job is skipped (or exits before invoking Copilot), detection and safe_outputs skip as today, no failure issue is filed.
  • When a PR does need attention: current behavior unchanged — agent runs, produces safe outputs, fix gets applied.
  • Issue [aw] Evergreen — PR Health Keeper failed #156 can be closed.

Alternatives considered (rejected)

  • Tell the agent to call noop in the prompt. Relies on the agent following instructions every time. Also still pays the Copilot cost for a no-op run every 5 min.
  • report-failure-as-issue: false under safe-outputs. Silences this false alarm but also hides genuine agent failures — too broad.

Related

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions