[prompt-clustering] Copilot Agent Prompt Clustering Analysis - 2025-11-28 #5028
Closed
Replies: 1 comment
⚓ Avast! This discussion be marked as outdated by Copilot Agent Prompt Clustering Analysis.
🔬 Copilot Agent Prompt Clustering Analysis
Daily NLP-based clustering analysis of copilot agent task prompts using K-means clustering and TF-IDF vectorization.
Summary
Analysis Period: 2025-10-22 to 2025-11-28 (37 days)
Total Tasks Analyzed: 1,320
Clusters Identified: 6
Overall Success Rate: 75.0%
Clustering Method: K-means with TF-IDF vectorization (200 features, n-gram range 1-3)
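The report names the method but not the code behind it. As a rough illustration only, here is a stdlib-only sketch of TF-IDF over word n-grams followed by K-means (Lloyd's algorithm). The function names, the IDF variant, the tokenizer, and the farthest-point initialisation are all assumptions made for this sketch. The last two prompts are hypothetical, and a real pipeline would more likely use a library such as scikit-learn.

```python
import math
import re
from collections import Counter


def ngrams(text, n_max=3):
    """Return all word n-grams of length 1..n_max (tokens of 2+ characters)."""
    tokens = re.findall(r"\w\w+", text.lower())
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, max_features=200):
    """Build L2-normalised TF-IDF vectors over the most frequent n-grams."""
    counts = [Counter(ngrams(d)) for d in docs]
    totals = Counter()
    for c in counts:
        totals.update(c)
    vocab = [term for term, _ in totals.most_common(max_features)]
    doc_freq = {t: sum(1 for c in counts if t in c) for t in vocab}
    vectors = []
    for c in counts:
        # One common IDF variant: log(N / df) + 1 (an assumption of this sketch).
        vec = [c[t] * (math.log(len(docs) / doc_freq[t]) + 1.0) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=50):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centers[j]))
                      for v in vectors]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


prompts = [
    "fix test suite",                # example quoted in the report
    "Fix npx command parsing",       # example quoted in the report
    "add safe output job",           # hypothetical prompt
    "add mcp server safe output",    # hypothetical prompt
]
vocab, vectors = tfidf_vectors(prompts, max_features=200)
labels = kmeans(vectors, k=2)
print(labels)
```

The report's settings would correspond to `max_features=200` and `k=6` over all 1,320 prompts. The toy run uses `k=2` only because four prompts cannot support six clusters.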
General Insights
Key Patterns Observed
Full Analysis Report
Detailed Cluster Analysis
Cluster 1: Updates (36.7% of tasks)
Success Rate: 73.1%
Complexity: 19.1 files changed, 753 lines added, 3.7 commits avg
Keywords: update, firewall, make, cli, pr, agent
Tasks focused on updating existing functionality, improving configurations, and maintaining the codebase. These are maintenance and improvement tasks that keep the system up-to-date.
Example: Update the frontmatter "imports" documentation under /docs with all the supported URL and path formats (#2097)
Cluster 0: Documentation (19.4% of tasks)
Success Rate: 77.7%
Complexity: 9.7 files changed, 388 lines added, 3.1 commits avg
Keywords: githubnext gh aw, githubnext gh, githubnext, gh aw, aw, gh
Documentation tasks including issue management, smoke detector responses, and code refactoring documentation.
Examples: #2209, #2283, #2284
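The six keywords listed for this cluster are exactly the unigrams, bigrams, and trigram of a single three-token phrase. That is what an n-gram range of 1-3 would produce if many prompts contain a string such as `githubnext/gh-aw`. The sketch below shows the effect. The input string and the tokenizer (split on punctuation, keep tokens of two or more characters) are assumptions, since the report does not describe tokenization.

```python
import re


def word_ngrams(text, n_min=1, n_max=3):
    """Return word n-grams of length n_min..n_max from lower-cased text."""
    tokens = re.findall(r"\w\w+", text.lower())
    return [" ".join(tokens[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(tokens) - n + 1)]


# Hypothetical input: a repository-style name appearing in a prompt.
terms = word_ngrams("githubnext/gh-aw")
print(terms)
```

If this is indeed the source of the keywords, the cluster is defined more by prompts that mention the repository name than by a documentation theme, which is worth keeping in mind when reading its label.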
Cluster 4: Features - Safe Outputs & MCP (15.3% of tasks)
Success Rate: 70.8%
Complexity: 26.5 files changed, 1,432 lines added, 4.3 commits avg
Keywords: add, safe, mcp, output, job, safe output, server, mcp server
Feature addition tasks that introduce new capabilities, particularly around safe outputs, MCP servers, and job configurations.
Examples: #2127, #2137, #2167
Cluster 3: Features - Agentic Workflows (14.1% of tasks)
Success Rate: 76.9%
Complexity: 8.6 files changed, 1,445 lines added, 3.5 commits avg
Keywords: agentic, workflow, agentic workflow, daily, workflows, update
Workflow-related tasks involving GitHub Actions, agentic workflows, and automation setup.
Example: Review the scheduled agentic workflows and spread them the entire day. Schedule the smoke workflows every 6h (#2100)
Cluster 2: Documentation - Code & Issues (11.5% of tasks)
Success Rate: 78.3%
Complexity: 16.5 files changed, 712 lines added, 3.2 commits avg
Keywords: comments, issue, section, code, issue_title, cli
Issue-driven documentation and refactoring tasks, often addressing technical debt and code organization.
Examples: #2171, #2235, #2249
Cluster 5: Bug Fixes (3.0% of tasks)
Success Rate: 80.0% ⭐ (highest)
Complexity: 13.5 files changed, 195 lines added, 3.3 commits avg
Keywords: fix, tests, docs, error, issues, test
Small, well-defined bug fixes with the highest success rate, suggesting bugs are straightforward to resolve.
Examples: fix test suite (#2153), Fix npx command parsing (#2299)
Success Rate by Cluster
Key Findings
1. Updates Dominate the Workload
Updates represent 36.7% of all tasks with keywords like 'update, firewall, make, cli, pr'. These maintenance tasks keep the system functional and up-to-date.
2. Documentation Gets Significant Attention
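Documentation tasks (Clusters 0 and 2, 30.9% of total) maintain above-average success rates (78.0% across the two clusters, against 75.0% overall). The share of work devoted to documentation suggests a sustained investment in code clarity and maintainability.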
3. Feature Development is Substantial
Two feature clusters account for 29.4% of tasks and add the most code (avg 1,439 lines added). The Safe Outputs & MCP cluster also has the most files changed (26.5) and the most commits (4.3 avg) of any cluster.
4. Bug Fixes are Small but Highly Successful
Only 3.0% of tasks are bug fixes (roughly 40 of 1,320), but they achieve an 80.0% success rate. This suggests the bugs are well-defined and straightforward to resolve, though the sample is small.
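To gauge how much weight the 80.0% figure can bear, one option is a confidence interval on the bug-fix success rate. The sketch below uses the Wilson score interval. The task and success counts are back-calculated from the rounded percentages in the report, so they are approximations rather than reported values.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - margin, center + margin


total_tasks = 1320
# Approximate counts: the report gives only rounded percentages.
bug_fix_tasks = round(total_tasks * 0.030)
bug_fix_successes = round(bug_fix_tasks * 0.80)

low, high = wilson_interval(bug_fix_successes, bug_fix_tasks)
print(f"{bug_fix_tasks} tasks, ~95% interval: {low:.1%} to {high:.1%}")
```

Under these assumed counts the interval runs from roughly 65% to 90%, which overlaps every other cluster's success rate, so the "highest" ranking should be read with caution.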
5. Consistent Performance Across Task Types
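Success rates range from 70.8% to 80.0%, a spread of 9.2 percentage points, indicating the agent performs fairly consistently across task types. The lowest rate belongs to the cluster with the most files changed and commits (Features - Safe Outputs & MCP), so task complexity appears to have some effect.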
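The aggregate figures in these findings can be cross-checked against the per-cluster numbers. The sketch below recomputes combined shares and task-weighted success rates from the values listed in the cluster analysis. Weighting by task share is an assumption about how the report aggregated, and the inputs are already rounded, so small differences are expected.

```python
# (share of tasks %, success rate %, avg lines added), copied from the report.
clusters = {
    "Updates": (36.7, 73.1, 753),
    "Documentation": (19.4, 77.7, 388),
    "Features - Safe Outputs & MCP": (15.3, 70.8, 1432),
    "Features - Agentic Workflows": (14.1, 76.9, 1445),
    "Documentation - Code & Issues": (11.5, 78.3, 712),
    "Bug Fixes": (3.0, 80.0, 195),
}


def combined(names):
    """Return (total share, share-weighted success rate) for the named clusters."""
    share = sum(clusters[n][0] for n in names)
    rate = sum(clusters[n][0] * clusters[n][1] for n in names) / share
    return share, rate


all_share, overall_rate = combined(clusters)
doc_share, doc_rate = combined(["Documentation", "Documentation - Code & Issues"])
feat_names = ["Features - Safe Outputs & MCP", "Features - Agentic Workflows"]
feat_share, feat_rate = combined(feat_names)
feat_lines = sum(clusters[n][2] for n in feat_names) / len(feat_names)

rates = [rate for _, rate, _ in clusters.values()]
spread = max(rates) - min(rates)

print(f"All clusters:  {all_share:.1f}% of tasks, {overall_rate:.1f}% success")
print(f"Documentation: {doc_share:.1f}% of tasks, {doc_rate:.1f}% success")
print(f"Features:      {feat_share:.1f}% of tasks, {feat_rate:.1f}% success, "
      f"{feat_lines:.1f} lines added (unweighted mean)")
print(f"Spread:        {spread:.1f} percentage points")
```

The shares sum to 100% and the weighted overall rate rounds to the reported 75.0%. The weighted documentation rate comes out slightly below 78.0%, which suggests the reported figure is the unweighted mean of 77.7% and 78.3%.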
Recommendations
For Task Assignment
For Prompt Engineering
For Process Improvement
Sample PRs by Cluster
Complete PR Dataset (sample)
Analysis Method: K-means clustering with TF-IDF vectorization (200 features, n-gram range 1-3)
Data Source: 1,320 copilot-created PRs from 2025-10-22 to 2025-11-28
Visualizations: Charts generated and saved to artifacts