[prompt-clustering] Copilot Agent Prompt Clustering — 30-Day Analysis (1,000 PRs, 8 task themes) #36103
Closed
This discussion was automatically closed because it expired on 2026-06-01T10:45:30.289Z.
Summary
Clustering analysis of 1,000 Copilot agent pull requests in `github/gh-aw` over the last 30 days (2026-05-12 → 2026-05-31). Prompts were extracted from PR titles + bodies, vectorized with TF-IDF (1–2 grams), and grouped with K-means; `k=8` was selected by silhouette score.
Key Findings
Cluster Analysis
Task complexity by cluster (chart)
Cluster theme details & representative PRs
Runtime fixes & generated code — 220 PRs · 75.9% success
General bug/runtime fixes touching generated code and model/runtime config. Largest cluster, broadest scope, mid-pack success. Examples: #31917, #35286, #35773.
Safe-outputs / error handling — 195 PRs · 88.7% success
Safe-output paths, error handling, behavior/coverage. Tight diffs (avg 18 files), second-largest cluster with high success. Examples: #33350, #32273, #32655.
Shared package refactors — 183 PRs · 77.6% success
Refactors into shared packages/helpers; very large additions (avg +9,227 lines, inflated by generated-code PRs). Examples: #32117, #35778, #36006.
Prompt & token optimization — 162 PRs · 86.4% success
Agent prompt tuning, token/turn reduction, guidance edits. Smallest diffs, third-highest success (behind PR Sous Chef at 88.9% and safe-outputs at 88.7%). Examples: #34874, #35817, #35650.
Firewall / network egress rules — 94 PRs · 74.5% success
Network/egress allow-list, triggering-command and MCP network schema work. Examples: #33240, #33386, #33683.
Smoke tests & engine config — 57 PRs · 59.6% success
Smoke tests, Claude/engine config, domain allow-lists. Lowest success, highest churn. Examples: #33273, #33852, #35802.
PR Sous Chef workflow — 54 PRs · 88.9% success
Iterations on the PR Sous Chef workflow. Most commits/PR (8.2) but high merge rate. Examples: #36088 and related Sous Chef PRs.
CI / failing Actions fixes — 34 PRs · 76.5% success
Fixing failing GitHub Actions jobs. Smallest, lowest-iteration cluster (avg 3.0 commits).
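The per-cluster figures above can be combined into one PR-weighted success rate and a ranking by success. This is a minimal sketch, assuming "success" is measured the same way in every cluster; the variable names are illustrative and are not part of the original analysis.

```python
# (theme, PR count, success rate in %) copied from the cluster details above.
clusters = [
    ("Runtime fixes & generated code", 220, 75.9),
    ("Safe-outputs / error handling", 195, 88.7),
    ("Shared package refactors", 183, 77.6),
    ("Prompt & token optimization", 162, 86.4),
    ("Firewall / network egress rules", 94, 74.5),
    ("Smoke tests & engine config", 57, 59.6),
    ("PR Sous Chef workflow", 54, 88.9),
    ("CI / failing Actions fixes", 34, 76.5),
]

# Total PRs covered by the eight listed clusters.
total_prs = sum(count for _, count, _ in clusters)

# Weight each cluster's success rate by its share of PRs.
weighted_rate = sum(count * rate for _, count, rate in clusters) / total_prs

# Rank clusters from highest to lowest success rate.
ranked = sorted(clusters, key=lambda c: c[2], reverse=True)

print(f"PRs across listed clusters: {total_prs}")
print(f"PR-weighted success rate: {weighted_rate:.1f}%")
for theme, count, rate in ranked:
    print(f"{rate:5.1f}%  {count:4d} PRs  {theme}")
```

With the counts as listed, the eight clusters sum to 999 PRs and the weighted rate comes out at roughly 80.1%.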
Representative PRs per cluster (data table)
`label_command` routing via `agentic_comm…`
`gh aw init` to create the Agentic Workflows custom a…
`aw.yml` package resolution to `gh aw ad…`
`setup-gh-aw` install idempotent when `gh-aw` is alread…
`network` schema deprecation semantics with…
`on.pull_request_reviewer: slash_command` synthetic trig…
`create-check-run` safe output type for multi-agent PR a…
`checkout.clean-git-credentials` to support submodule-sa…
Recommendations
Methodology & limitations
`copilot-prs.json` (1,000 PRs, refreshed 2026-05-31) enriched with per-PR full data (comments/reviews/commits/files); all 1,000 had full metrics available.
(… `aw_info.json`) were not joined: matching 1,000 PRs to individual workflow runs by timestamp is unreliable, so this was intentionally omitted rather than approximated.
References: §26710025727
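The pipeline named in the Summary (TF-IDF over 1–2 grams, K-means, `k` chosen by silhouette score) can be illustrated with a small dependency-free sketch. The discussion does not say which tooling the analysis used, so everything here is an assumption: the helper names (`ngrams`, `tfidf`, `kmeans`, `silhouette`, `choose_k`), the toy PR titles, the smoothed IDF formula, and the deterministic farthest-point initialisation (chosen for reproducibility, not because the original used it). A real run would more likely rely on a library such as scikit-learn.

```python
import math
from collections import Counter

def ngrams(text, n_max=2):
    """Word n-grams for n = 1..n_max (1-2 grams by default)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(words) - n + 1)]

def tfidf(docs):
    """L2-normalised TF-IDF vectors as sparse dicts (smoothed IDF, an assumption)."""
    counts = [Counter(ngrams(d)) for d in docs]
    df = Counter(term for c in counts for term in c)  # document frequency per term
    n = len(docs)
    vectors = []
    for c in counts:
        vec = {t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, tf in c.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def dist(a, b):
    """Euclidean distance between two sparse vectors."""
    return math.sqrt(sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in set(a) | set(b)))

def kmeans(vectors, k, iters=20):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(max(vectors, key=lambda v: min(dist(v, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(v, centers[j])) for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties out
                terms = set().union(*members)
                centers[j] = {t: sum(m.get(t, 0.0) for m in members) / len(members)
                              for t in terms}
    return labels

def silhouette(vectors, labels):
    """Mean silhouette coefficient over all points."""
    scores = []
    for i, v in enumerate(vectors):
        per_cluster = {}
        for j, w in enumerate(vectors):
            if j != i:
                per_cluster.setdefault(labels[j], []).append(dist(v, w))
        own = per_cluster.pop(labels[i], None)
        if not own or not per_cluster:
            scores.append(0.0)  # singleton cluster, or no other cluster to compare
            continue
        a = sum(own) / len(own)                                 # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in per_cluster.values())  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def choose_k(vectors, candidates):
    """Return (best_k, best_score) over the candidate cluster counts."""
    scored = [(silhouette(vectors, kmeans(vectors, k)), k) for k in candidates]
    best_score, best_k = max(scored)
    return best_k, best_score

# Hypothetical PR titles, used only to exercise the sketch.
toy_titles = [
    "fix firewall egress rule",
    "update firewall egress allow list",
    "fix firewall allow list rule",
    "reduce agent prompt tokens",
    "tune agent prompt guidance",
    "reduce prompt tokens and turns",
]
toy_vectors = tfidf(toy_titles)
best_k, best_score = choose_k(toy_vectors, range(2, 5))
print(f"selected k={best_k} (silhouette={best_score:.3f})")
```

On the real data the candidate range would be wider, and the silhouette comparison is what would single out `k=8`.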