📊 Agentic Workflow Lock File Statistics - 2025-12-23 #7404
This discussion was automatically closed because it was created by an agentic workflow more than 3 days ago.
This comprehensive analysis examines 125 lock files (.lock.yml) in the .github/workflows/ directory of the githubnext/gh-aw repository. These lock files represent agentic workflows that leverage AI agents (Claude, Copilot, Codex) to automate repository tasks.
Key Findings:
workflow_dispatch and pull_request
Full Report
File Size Distribution
The lock files are substantial, reflecting the comprehensive agent instructions and tool configurations embedded in each workflow.
Statistics:
Size Insights
Nearly all lock files exceed 100 KB, indicating that agentic workflows contain:
Trigger Analysis
Most Popular Triggers
workflow_dispatch
pull_request
schedule
issues
issue_comment
Key Observations
workflow_dispatch, enabling on-demand execution
Common Trigger Combinations
The most common pattern combines:
schedule (cron-based)
issues
issue_comment
pull_request
This combination enables workflows that:
Schedule Patterns
0 14 * * 1-5 (2 PM Mon-Fri)
0 0,6,12,18 * * * (Every 6 hours)
0 9 * * 1
0 6 * * 0
0/10 * * * *
0 * * * *
Most Common Schedules:
0 14 * * 1-5 (2 PM weekdays) - 5 workflows
0 13 * * 1-5 (1 PM weekdays) - 4 workflows
0 11 * * 1-5 (11 AM weekdays) - 4 workflows
The repository favors weekday business hours (9 AM - 4 PM UTC) for scheduled tasks, suggesting these workflows perform maintenance, reporting, and monitoring during active development periods.
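The business-hours observation can be checked mechanically. The following is a minimal sketch, assuming standard five-field cron expressions evaluated in UTC and handling only the simple forms listed above (single values, comma lists, ranges, `*`, and `start/step`). The helper names `expand_field` and `is_business_hours` are hypothetical and are not taken from the analysis scripts.

```python
def expand_field(field: str, lo: int, hi: int) -> set[int]:
    """Expand one cron field into the set of integer values it matches."""
    values: set[int] = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
            # Assumption: "N/step" means "from N up to the maximum, every step".
            if part != "*" and "-" not in part:
                part = f"{part}-{hi}"
        if part == "*":
            start, end = lo, hi
        elif "-" in part:
            start_text, end_text = part.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return values


def is_business_hours(cron: str, first_hour: int = 9, last_hour: int = 16) -> bool:
    """True if every firing falls on Mon-Fri within first_hour..last_hour (UTC)."""
    _minute, hour, _dom, _month, dow = cron.split()
    hours = expand_field(hour, 0, 23)
    # Cron accepts both 0 and 7 for Sunday, so fold 7 onto 0.
    days = {d % 7 for d in expand_field(dow, 0, 7)}
    in_hours = hours <= set(range(first_hour, last_hour + 1))
    on_weekdays = days <= {1, 2, 3, 4, 5}
    return in_hours and on_weekdays
```

Under these assumptions, `is_business_hours("0 14 * * 1-5")` is true, while `is_business_hours("0 0,6,12,18 * * *")` and `is_business_hours("0 6 * * 0")` are false.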
Safe Outputs Analysis
Agentic workflows use "safe outputs" - GitHub API tools that allow agents to create discussions, issues, comments, and PRs without direct repository write access.
Safe Output Pattern
All workflows in this repository follow a consistent safe output pattern:
Available Safe Output Types
Based on the codebase structure, workflows have access to:
create-discussion - Create GitHub discussions
create-issue - Create GitHub issues
add-comment - Add comments to issues/PRs
create-pull-request - Create PRs
create-pull-request-review-comment - Add PR review comments
update-issue - Update existing issues
noop - Log completion with no action
missing_tool - Report missing capabilities
Discussion Categories
When creating discussions, workflows commonly target:
The "audits" category is the most popular, reflecting the repository's focus on automated security scanning, compliance checks, and workflow analysis.
Structural Characteristics
Job Complexity
Job Structure
The analysis shows 0 average jobs per workflow in the parsed data, suggesting:
^ [a-z_-]*:$ pattern)
Typical Lock File Profile
Based on statistical analysis, a typical .lock.yml file has:
workflow_dispatch + pull_request + optional schedule
Permission Patterns
Workflows request specific GitHub API permissions following the principle of least privilege.
Permission Distribution
The nearly even split between read and write (50.5% vs 49.4%) indicates:
Workflows with Most Permissions
Top 5 workflows by permission count:
These workflows likely handle complex operations requiring multiple API interactions.
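A read/write split like the one above can be tallied with a simple text scan. This is a minimal sketch, assuming permissions appear as `scope: read` or `scope: write` lines, as in a GitHub Actions `permissions:` block. It is regex-based, so it would also count any unrelated YAML key whose value is literally `read` or `write`. The function names are hypothetical and do not come from the analysis scripts.

```python
import re
from collections import Counter

# Matches lines such as "      contents: read" or "  issues: write".
# Shorthand forms like "permissions: read-all" are deliberately not matched.
PERMISSION_LINE = re.compile(
    r"^[ \t]*([a-z-]+):[ \t]*(read|write)[ \t]*$", re.MULTILINE
)


def count_permission_levels(workflow_text: str) -> Counter:
    """Tally how many 'read' and 'write' grants appear in one workflow file."""
    return Counter(level for _scope, level in PERMISSION_LINE.findall(workflow_text))


def write_share(counts: Counter) -> float:
    """Fraction of grants that are 'write' (0.0 when there are no grants)."""
    total = counts["read"] + counts["write"]
    return counts["write"] / total if total else 0.0
```

Summing the per-file counters across all lock files would give repository-wide totals comparable to the percentages reported here.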
Tool & MCP Server Patterns
Most Used MCP Servers
GitHub MCP Server - API Usage
The GitHub MCP server provides 53 unique API functions. Most popular:
search_* functions
list_* functions
pull_request_read
Key Insight: Each function appears ~62 times, suggesting standardized agent configurations across workflows with a common set of available GitHub API tools.
Playwright MCP Server
Provides 10+ browser automation functions:
browser_wait_for - Wait for elements
browser_type - Type text
browser_take_screenshot - Capture screenshots
browser_tabs - Tab management
browser_snapshot - DOM snapshots
browser_select_option - Form interactions
browser_resize - Viewport control
browser_press_key - Keyboard input
browser_network_requests - Network monitoring
97.6% of workflows have browser automation capability, enabling agents to:
Timeout & Execution Patterns
Timeout Configuration
Average Timeout: 16 minutes
Key Findings:
Concurrency Management
Engine & Model Information
Model References
Workflows reference AI models dynamically:
${{ steps.generate_aw_info.outputs.model }} - 123 workflows (runtime selection)
GH_AW_MODEL_AGENT_COPILOT - Copilot model selection
GH_AW_MODEL_AGENT_CLAUDE - Claude model selection
GH_AW_MODEL_AGENT_CODEX - Codex model selection
Default Models Observed:
gpt-5 - 35 references
gpt-4o-mini - 2 references
Multi-Engine Architecture
The repository supports multiple AI engines:
This multi-engine approach provides:
Interesting Findings
1. Standardization at Scale
All 125 workflows follow remarkably consistent patterns:
This suggests:
2. Comprehensive GitHub API Coverage
With 53 unique GitHub API functions available, agents can:
This extensive API access enables sophisticated automation scenarios.
3. Browser Automation is Standard
97.6% of workflows include Playwright, indicating:
4. Business Hours Bias
Scheduled workflows strongly prefer weekday business hours (9 AM - 4 PM UTC):
5. Size Consistency Despite Complexity Variation
Despite a 76-step average with outliers at 113 steps, all files stay in the 80-690 KB range:
6. Concurrency is Critical
98.4% of workflows use concurrency groups, preventing:
Example Workflows by Category
Issue-Triggered Workflows (10.4%)
ai-moderator.lock.yml - Content moderation
archie.lock.yml - Archive management
campaign-generator.lock.yml - Campaign creation
cloclo.lock.yml - Code analysis
craft.lock.yml - Artifact generation
Scheduled Audit/Report Workflows
agent-performance-analyzer.lock.yml - Performance metrics
artifacts-summary.lock.yml - Artifact reports
audit-workflows.lock.yml - Workflow compliance
blog-auditor.lock.yml - Blog content checking
breaking-change-checker.lock.yml - API compatibility
High-Complexity Workflows (100+ steps)
daily-file-diet.lock.yml - 113 steps - File cleanup automation
poem-bot.lock.yml - 107 steps - Creative content generation
deep-report.lock.yml - 105 steps - Comprehensive analysis
daily-firewall-report.lock.yml - 101 steps - Security scanning
intelligence.lock.yml - 100 steps - Data aggregation
Recommendations
1. Optimize Large Workflows
The top 5 workflows with 100+ steps may benefit from:
Benefit: Faster execution, easier debugging, better maintainability.
2. Standardize Timeout Values
With 83% of workflows using 10-20 minute timeouts:
Benefit: Predictable resource usage, faster failure detection.
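To see how far current values are from a standard, the configured timeouts can be collected and summarized. This is a minimal sketch, assuming timeouts are set through the GitHub Actions `timeout-minutes` key with integer literals; values given as expressions are skipped. Because a single file may set the key at both job and step level, the result counts timeout entries rather than workflows. The function names are hypothetical.

```python
import re

# Matches "timeout-minutes: 15"; expression values such as ${{ ... }} do not match.
TIMEOUT_LINE = re.compile(
    r"^[ \t]*timeout-minutes:[ \t]*(\d+)[ \t]*$", re.MULTILINE
)


def extract_timeouts(workflow_text: str) -> list[int]:
    """Return every literal timeout-minutes value found in the text."""
    return [int(value) for value in TIMEOUT_LINE.findall(workflow_text)]


def share_in_band(timeouts: list[int], low: int = 10, high: int = 20) -> float:
    """Fraction of timeouts inside the inclusive band low..high minutes."""
    if not timeouts:
        return 0.0
    return sum(low <= t <= high for t in timeouts) / len(timeouts)
```

Running `share_in_band` over the values from all lock files gives a figure that can be compared against the 10-20 minute band mentioned above.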
3. Document Schedule Rationale
85 workflows run on schedules with varying frequencies:
Benefit: Prevent resource spikes, improve scheduling efficiency.
4. Monitor Permission Creep
With 734 write permissions granted:
Benefit: Reduced security risk, better compliance.
5. Leverage Template Patterns
The high consistency across workflows suggests:
Benefit: Faster workflow development, fewer errors, easier updates.
6. Establish Size Budgets
With an average of 390 KB per lock file:
Benefit: Faster loading, better version control diffs, easier reviews.
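A size budget could be enforced with a small check like the one below. This is a minimal sketch, assuming lock files match `*.lock.yml` directly inside one directory. The 500 KB default is a placeholder chosen for illustration, not a threshold proposed by this report, and `files_over_budget` is a hypothetical name.

```python
from pathlib import Path


def files_over_budget(directory: str, budget_kb: float = 500.0) -> list[tuple[str, float]]:
    """Return (file name, size in KB) for each *.lock.yml file above budget_kb."""
    oversized = []
    for path in sorted(Path(directory).glob("*.lock.yml")):
        size_kb = path.stat().st_size / 1024
        if size_kb > budget_kb:
            oversized.append((path.name, round(size_kb, 1)))
    return oversized
```

Calling `files_over_budget(".github/workflows")` from the repository root would list the files that exceed the chosen budget, which could then feed a CI check or a report.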
7. Create Workflow Health Dashboard
Based on the collected statistics:
Benefit: Proactive issue detection, data-driven optimization.
Historical Trends
This is the baseline analysis for the githubnext/gh-aw repository. Future runs will track:
Data saved to: /tmp/gh-aw/cache-memory/history/2025-12-23.json
Methodology
Data Collection
.lock.yml files in .github/workflows/
/tmp/gh-aw/cache-memory/data/
Analysis Scripts Created
analyze_lockfiles.sh - Main data extraction
generate_stats.sh - Statistical calculations
detailed_analysis.sh - Pattern analysis
find_examples.sh - Example discovery
github_api_analysis.sh - API usage tracking
Validation
Reproducibility
All scripts saved to /tmp/gh-aw/cache-memory/scripts/ for future analysis runs.
Analysis performed: 2025-12-23
Repository: githubnext/gh-aw
Lock files analyzed: 125
Scripts stored: /tmp/gh-aw/cache-memory/scripts/
Data cached: /tmp/gh-aw/cache-memory/data/