[prompt-clustering] Copilot Agent Prompt Clustering Analysis - 2025-11-28 #5028

2025-11-28T19:28:15Z

github-actions[bot]
Bot Nov 28, 2025

🔬 Copilot Agent Prompt Clustering Analysis

Daily NLP-based clustering analysis of Copilot agent task prompts using K-means clustering and TF-IDF vectorization.

Summary

Analysis Period: 2025-10-22 to 2025-11-28 (37 days)
Total Tasks Analyzed: 1,320
Clusters Identified: 6
Overall Success Rate: 75.0%
Clustering Method: K-means with TF-IDF vectorization (200 features, n-gram range 1-3)

General Insights

Most Common Task Type: Updates (484 tasks, 36.7%)
Highest Success Rate: Bug Fixes (80.0% success)
Most Complex Tasks: Features (avg 1,432 to 1,445 lines added per PR across the two feature clusters)
Documentation Focus: 2 clusters dedicated to documentation (408 tasks, 30.9%)

Key Patterns Observed

Update-Heavy Workload: The largest cluster represents over one-third of all tasks, indicating frequent maintenance work
Documentation is Critical: Two documentation clusters account for 31% of tasks, showing strong emphasis on code clarity
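Bug Fixes are Rare: Only 3% of tasks fall into the bug-fix cluster. This may reflect proactive rather than reactive development, although fix-titled PRs also appear in other clusters (e.g. #2127, #2137, #2167 in Cluster 4)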
Consistent Performance: All clusters maintain 70%+ merge rates regardless of task type

Full Analysis Report

Detailed Cluster Analysis

Cluster 1: Updates (36.7% of tasks)

Success Rate: 73.1%
Complexity: 19.1 files changed, 753 lines added, 3.7 commits avg
Keywords: update, firewall, make, cli, pr, agent

Tasks focused on updating existing functionality, improving configurations, and maintaining the codebase. These are maintenance and improvement tasks that keep the system up-to-date.

Example: Update the frontmatter "imports" documentation under /docs with all the supported URL and path formats (#2097)

Cluster 0: Documentation (19.4% of tasks)

Success Rate: 77.7%
Complexity: 9.7 files changed, 388 lines added, 3.1 commits avg
Keywords: githubnext gh aw, githubnext gh, githubnext, gh aw, aw, gh

Documentation tasks including issue management, smoke detector responses, and code refactoring documentation.

Examples: #2209, #2283, #2284

Cluster 4: Features - Safe Outputs & MCP (15.3% of tasks)

Success Rate: 70.8%
Complexity: 26.5 files changed, 1,432 lines added, 4.3 commits avg
Keywords: add, safe, mcp, output, job, safe output, server, mcp server

Feature addition tasks that introduce new capabilities, particularly around safe outputs, MCP servers, and job configurations.

Examples: #2127, #2137, #2167

Cluster 3: Features - Agentic Workflows (14.1% of tasks)

Success Rate: 76.9%
Complexity: 8.6 files changed, 1,445 lines added, 3.5 commits avg
Keywords: agentic, workflow, agentic workflow, daily, workflows, update

Workflow-related tasks involving GitHub Actions, agentic workflows, and automation setup.

Example: Review the scheduled agentic workflows and spread them the entire day. Schedule the smoke workflows every 6h (#2100)

Cluster 2: Documentation - Code & Issues (11.5% of tasks)

Success Rate: 78.3%
Complexity: 16.5 files changed, 712 lines added, 3.2 commits avg
Keywords: comments, issue, section, code, issue_title, cli

Issue-driven documentation and refactoring tasks, often addressing technical debt and code organization.

Examples: #2171, #2235, #2249

Cluster 5: Bug Fixes (3.0% of tasks)

Success Rate: 80.0% ⭐ (highest)
Complexity: 13.5 files changed, 195 lines added, 3.3 commits avg
Keywords: fix, tests, docs, error, issues, test

Small, well-defined bug fixes with the highest success rate, suggesting bugs are straightforward to resolve.

Examples: fix test suite (#2153), Fix npx command parsing (#2299)

Success Rate by Cluster

Cluster	Theme	Tasks	Success Rate	Avg Commits	Avg Lines Added
5	Bug Fixes	40	80.0%	3.3	195
2	Documentation - Code & Issues	152	78.3%	3.2	712
0	Documentation	256	77.7%	3.1	388
3	Features - Agentic Workflows	186	76.9%	3.5	1,445
1	Updates	484	73.1%	3.7	753
4	Features - Safe Outputs & MCP	202	70.8%	4.3	1,432
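The summary figures can be cross-checked against this table. The sketch below copies the task counts and success rates from the rows above and recomputes each cluster's share of tasks and the task-weighted overall success rate. The variable names are illustrative and not taken from the analysis workflow.

```python
# (cluster id, theme, tasks, success rate in %) copied from the table above.
CLUSTERS = [
    (5, "Bug Fixes", 40, 80.0),
    (2, "Documentation", 152, 78.3),
    (0, "Documentation", 256, 77.7),
    (3, "Features", 186, 76.9),
    (1, "Updates", 484, 73.1),
    (4, "Features", 202, 70.8),
]

total_tasks = sum(tasks for _, _, tasks, _ in CLUSTERS)

# Share of all tasks per cluster id, in percent.
shares = {cid: 100 * tasks / total_tasks for cid, _, tasks, _ in CLUSTERS}

# Overall success rate: each cluster's rate weighted by its task count.
overall = sum(tasks * rate for _, _, tasks, rate in CLUSTERS) / total_tasks

print(f"{total_tasks} tasks, overall success rate {overall:.1f}%")
for cid, theme, _, _ in CLUSTERS:
    print(f"cluster {cid} ({theme}): {shares[cid]:.1f}% of tasks")
```

The weighted rate rounds to the 75.0% reported in the Summary, and the per-cluster shares match the percentages in the cluster headings.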

Key Findings

1. Updates Dominate the Workload

Updates represent 36.7% of all tasks with keywords like 'update, firewall, make, cli, pr'. These maintenance tasks keep the system functional and up-to-date.

2. Documentation Gets Significant Attention

Documentation tasks (30.9% of total) maintain above-average success rates (77.7% and 78.3% for the two clusters, against 75.0% overall), showing strong commitment to code clarity and maintainability.

3. Feature Development is Substantial

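Two feature clusters account for 29.4% of tasks and add the most lines per PR (avg 1,439). Cluster 4 also averages the most commits (4.3), while Cluster 3 (3.5) is in line with the other clusters. Feature PRs also draw more reviews (1.8-1.9 avg).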

4. Bug Fixes are Small but Highly Successful

Only 3.0% of tasks (40) are bug fixes, but they achieve an 80.0% success rate, suggesting these bugs are well-defined and straightforward to resolve. With so few tasks, this rate is less certain than those of the larger clusters.

5. Consistent Performance Across Task Types

Success rates vary only from 70.8% to 80.0%, indicating the agent performs consistently well regardless of task complexity or type.
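The theme-level figures in findings 2 and 3 come from combining clusters. As a rough sketch, the code below groups the per-cluster rows by theme and computes task-weighted aggregates. The grouping and weighting are assumptions about how such figures could be derived, not the report's actual code. Weighted by task count, the documentation success rate comes out near 77.9% and the feature average near 1,438 lines, slightly below the simple two-cluster means.

```python
from collections import defaultdict

# (theme, tasks, success rate in %, avg lines added) per cluster, from the report.
ROWS = [
    ("Bug Fixes", 40, 80.0, 195),
    ("Documentation", 152, 78.3, 712),
    ("Documentation", 256, 77.7, 388),
    ("Features", 186, 76.9, 1445),
    ("Updates", 484, 73.1, 753),
    ("Features", 202, 70.8, 1432),
]

total_tasks = sum(tasks for _, tasks, _, _ in ROWS)

# Accumulate task-weighted sums per theme.
acc = defaultdict(lambda: {"tasks": 0, "rate_sum": 0.0, "lines_sum": 0.0})
for theme, tasks, rate, lines in ROWS:
    acc[theme]["tasks"] += tasks
    acc[theme]["rate_sum"] += tasks * rate
    acc[theme]["lines_sum"] += tasks * lines

# Convert the sums into shares and weighted averages.
summary = {
    theme: {
        "tasks": a["tasks"],
        "share_pct": 100 * a["tasks"] / total_tasks,
        "success_pct": a["rate_sum"] / a["tasks"],
        "avg_lines": a["lines_sum"] / a["tasks"],
    }
    for theme, a in acc.items()
}

for theme, s in sorted(summary.items(), key=lambda kv: -kv[1]["tasks"]):
    print(f"{theme}: {s['tasks']} tasks ({s['share_pct']:.1f}%), "
          f"{s['success_pct']:.1f}% success, {s['avg_lines']:.0f} lines avg")
```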

Recommendations

For Task Assignment

Documentation Tasks Are Ideal for Agents: 31% of tasks with 77-78% success rates make documentation a strong use case
Update Tasks Should Be Batched: Given high volume (37%), consider grouping related updates to reduce overhead
Feature Tasks Need More Review: Higher review counts (1.8-1.9 avg) suggest features benefit from additional human oversight

For Prompt Engineering

Be Specific for Update Tasks: Updates vary widely (firewall, CLI, PRs) - specific keywords improve outcomes
Separate Bug Fixes from Features: The bug-fix cluster merges at 80.0% versus 70.8-76.9% for the feature clusters, so keeping fix prompts narrowly scoped may improve success rates
Include Context for Complex Tasks: Feature tasks with more commits/reviews need upfront context

For Process Improvement

Monitor Feature Task Complexity: Features add ~1,400 lines on average - consider breaking into smaller subtasks
Standardize Update Prompts: The largest cluster could benefit from prompt templates for common patterns
Investigate Lower Success Rates: Cluster 4 (Features) has 70.8% success vs 80% for bug fixes - understand why
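Before investigating the Cluster 4 gap, it may help to check whether it is larger than sampling noise, since the bug-fix cluster has only 40 tasks. The sketch below applies a standard two-proportion z-test, which is not part of the original analysis. The merged counts are reconstructed from the rounded percentages, so they are approximations.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for the difference of two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumption: merged counts recovered from the reported rates and task counts.
bug_tasks, feature_tasks = 40, 202
bug_merged = round(0.800 * bug_tasks)          # Cluster 5, 80.0%
feature_merged = round(0.708 * feature_tasks)  # Cluster 4, 70.8%

z, p_value = two_proportion_z(bug_merged, bug_tasks, feature_merged, feature_tasks)
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

On these reconstructed counts the result is roughly z ≈ 1.2 with p ≈ 0.23, so the 9-point gap alone is weak evidence of a real difference. Exact merge counts from the dataset would give a more reliable answer.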

Sample PRs by Cluster

PR Dataset (sample)

PR #	Title	Cluster	Theme	Outcome	Files	Lines	Commits
#2097	Add minimal path format syntax reference	1	Updates	✅	1	20	4
#2099	Add directory creation for copilot paths	1	Updates	✅	25	335	3
#2102	Add workflow status badges documentation	1	Updates	✅	6	195	4
#2209	Comment on issue #2157 recurrence	0	Documentation	✅	3	62	3
#2283	Extract functions from compiler.go	0	Documentation	✅	6	621	4
#2284	Consolidate validation functions	0	Documentation	✅	6	81	4
#2127	Fix Smoke OpenCode workflow failure	4	Features	✅	2	11	4
#2137	Fix discussion comment threading	4	Features	✅	13	442	3
#2167	Fix OpenCode MCP server integration	4	Features	✅	3	215	3
#2100	Spread scheduled workflows across day	3	Features	✅	30	43	2
#2103	Add smoke-outpost workflow	3	Features	✅	2	4,250	3
#2109	Add semantic function refactoring	3	Features	✅	3	4,351	4
#2171	Refactor duplicate MCP code patterns	2	Documentation	✅	10	570	3
#2235	Raise error on max-turns limit	2	Documentation	✅	54	484	4
#2310	Pin all GitHub Actions to SHAs	2	Documentation	✅	107	2,184	8
#2153	Fix test suite	5	Bug Fixes	✅	1	11	2
#2299	Fix npx command parsing	5	Bug Fixes	✅	5	62	6
#2320	Fix test expectation for MCP server	5	Bug Fixes	✅	2	6	3

Analysis Method: K-means clustering with TF-IDF vectorization (200 features, n-gram range 1-3)
Data Source: 1,320 copilot-created PRs from 2025-10-22 to 2025-11-28
Visualizations: Charts generated and saved to artifacts

AI generated by Copilot Agent Prompt Clustering Analysis

2025-12-02T19:33:50Z

github-actions[bot]
Bot Dec 2, 2025
Author

⚓ Avast! This discussion be marked as outdated by Copilot Agent Prompt Clustering Analysis.
🗺️ A newer treasure map awaits ye at Discussion #5325.
Fair winds, matey! 🏴‍☠️
