Executive Summary
Semantic function clustering analysis of the github/gh-aw repository identified 765 non-test Go source files across 23 packages containing 3,761 exported and unexported functions. The codebase is largely well-organized with clear feature-based file naming (e.g., strict_mode_*.go, runtime_*.go, safe_outputs_*.go), intentional WASM build-tag pairs (*_wasm.go), and appropriate use of generics in validation helpers. Four actionable refactoring opportunities were found.
Analysis Metadata
- Total Go Files Analyzed: 765 (non-test,
pkg/ only)
- Total Functions Cataloged: 3,761
- Packages: 23 (
workflow, cli, parser, stringutil, sliceutil, typeutil, agentdrain, and 16 more)
- Function Clusters Identified: 12 named clusters (
runtime_*, strict_mode_*, safe_outputs_*, audit_*, call_workflow_*, codemod_*, checkout_*, update_*, parse_*, validate_*, render_*, sanitize_*)
- Outliers Found: 0 (functions are in appropriately named files)
- Duplicates Detected: 4 groups (detailed below)
- Detection Method: Naming pattern analysis + AST-level code comparison
- Analysis Date: 2026-05-08
Identified Issues
1. Repeated YAML-Reading Boilerplate Across 4 Functions
Priority: High
Files: call_workflow_permissions.go, call_workflow_secrets.go, call_workflow_validation.go, dispatch_workflow_validation.go
Four separate functions contain nearly identical 12-line boilerplate for safely reading and parsing a GitHub Actions workflow YAML file:
extractPermissionsFromYAMLFile — pkg/workflow/call_workflow_permissions.go:87
extractSecretsFromWorkflowFile — pkg/workflow/call_workflow_secrets.go:47
extractWorkflowCallInputs — pkg/workflow/call_workflow_validation.go:183
extractWorkflowDispatchInputs — pkg/workflow/dispatch_workflow_validation.go:143
Duplicated pattern in each (identical across all 4 functions):
cleanPath := filepath.Clean(workflowPath)
if !filepath.IsAbs(cleanPath) {
return nil, fmt.Errorf("workflow path must be absolute: %s", workflowPath)
}
content, err := os.ReadFile(cleanPath) // #nosec G304 -- Path is sanitized above
if err != nil {
return nil, fmt.Errorf("failed to read workflow file %s: %w", workflowPath, err)
}
var workflow map[string]any
if err := yaml.Unmarshal(content, &workflow); err != nil {
return nil, fmt.Errorf("failed to parse workflow file %s: %w", workflowPath, err)
}
Recommendation: Extract a shared helper in pkg/workflow/yaml.go:
// readWorkflowYAML reads and parses an absolute path YAML workflow file.
func readWorkflowYAML(path string) (map[string]any, error) {
cleanPath := filepath.Clean(path)
if !filepath.IsAbs(cleanPath) {
return nil, fmt.Errorf("workflow path must be absolute: %s", path)
}
content, err := os.ReadFile(cleanPath) // #nosec G304 -- Path is validated above
if err != nil {
return nil, fmt.Errorf("failed to read workflow file %s: %w", path, err)
}
var workflow map[string]any
if err := yaml.Unmarshal(content, &workflow); err != nil {
return nil, fmt.Errorf("failed to parse workflow file %s: %w", path, err)
}
return workflow, nil
}
Each caller then becomes a one-liner delegate.
Estimated effort: 1–2 hours
Impact: Eliminates ~48 lines of duplicated security-sensitive boilerplate; single source of truth for path validation
2. mergeUnique and excludeFromSlice Belong in sliceutil
Priority: Medium
File: pkg/workflow/runtime_definitions.go:248–286
Two general-purpose string slice utilities are defined inside the workflow package, making them inaccessible to other packages:
mergeUnique(base []string, extra ...string) []string — deduplicated merge, used in imports.go, runtime_definitions.go, compiler_safe_outputs_config.go, safe_outputs_config_generation.go
excludeFromSlice(base []string, exclude ...string) []string — filters a slice by exclusion set, used in safe_outputs_config_generation.go, compiler_safe_outputs_config.go
These parallel sliceutil.Filter and sliceutil.Deduplicate but are inaccessible outside workflow.
Recommendation: Move both to pkg/sliceutil/sliceutil.go with generic type parameters:
func MergeUnique[T comparable](base []T, extra ...T) []T { ... }
func Exclude[T comparable](base []T, exclude ...T) []T { ... }
Estimated effort: 2–3 hours (including updating 6 call sites)
Impact: Centralizes general-purpose slice utilities; enables reuse across cli and parser packages
3. audit_diff_render.go — 16 Paired Format-Dispatch Functions
Priority: Low
File: pkg/cli/audit_diff_render.go (~815 lines)
The file contains 8 paired render*MarkdownSection/render*PrettySection function pairs for each diff data type. Adding a new diff section requires two functions, and the dispatch logic is duplicated at both the top-level and per-section level.
Pairs found: FirewallDiff, MCPToolsDiff, RunMetricsDiff, TokenUsageDiff, ToolCallsDiff, BashCommandsDiff, GitHubRateLimitDiff, and the top-level AuditDiff.
Recommendation: Introduce a DiffSectionRenderer interface with RenderMarkdown() / RenderPretty() methods, or use a format enum parameter. This is a longer-term structural improvement.
Estimated effort: 4–6 hours
Impact: Reduces file size by ~30%; makes adding new diff sections simpler
4. SanitizeName Base Function Could Move to stringutil
Priority: Low
File: pkg/workflow/strings.go:140
SanitizeName(name string, opts *SanitizeOptions) string is a general character-replacement sanitizer driven by a config struct, with no workflow-specific dependencies. It belongs alongside SanitizeIdentifierName, SanitizeParameterName, etc. in pkg/stringutil/sanitize.go.
Workflow-specific wrappers (SanitizeWorkflowName, SanitizeArtifactIdentifier) would remain in pkg/workflow/strings.go.
Estimated effort: 1–2 hours
Impact: Unifies the sanitization API surface; promotes reuse
What Was Examined and Found Correctly Organized
View correctly-organized patterns (not refactoring candidates)
*_wasm.go build-tag pairs: All //go:build js || wasm stubs are intentional Go cross-compilation practice.
update_entity_helpers.go + specific files: Generic base + typed thin wrappers — well-structured delegation.
SanitizeWorkflowName vs NormalizeWorkflowName: Distinct purposes (path sanitization vs. extension stripping), well-documented in strings.go:15–42.
validateNoDuplicateCacheIDs: Already delegates to the generic validateNoDuplicateIDs[T] — optimally structured.
strict_mode_*.go cluster: Six files covering distinct validation domains. Well-partitioned.
runtime_*.go cluster: Seven files with clear single-responsibility splits.
call_workflow_permissions.go vs call_workflow_secrets.go: Parallel structure is intentional (same file-priority resolution, different data extraction).
codemod_*.go files: 35+ files, one per workflow migration — correct design.
Implementation Checklist
References:
Generated by Semantic Function Refactoring · ● 565.2K · ◷
Executive Summary
Semantic function clustering analysis of the
github/gh-awrepository identified 765 non-test Go source files across 23 packages containing 3,761 exported and unexported functions. The codebase is largely well-organized with clear feature-based file naming (e.g.,strict_mode_*.go,runtime_*.go,safe_outputs_*.go), intentional WASM build-tag pairs (*_wasm.go), and appropriate use of generics in validation helpers. Four actionable refactoring opportunities were found.Analysis Metadata
pkg/only)workflow,cli,parser,stringutil,sliceutil,typeutil,agentdrain, and 16 more)runtime_*,strict_mode_*,safe_outputs_*,audit_*,call_workflow_*,codemod_*,checkout_*,update_*,parse_*,validate_*,render_*,sanitize_*)Identified Issues
1. Repeated YAML-Reading Boilerplate Across 4 Functions
Priority: High
Files:
call_workflow_permissions.go,call_workflow_secrets.go,call_workflow_validation.go,dispatch_workflow_validation.goFour separate functions contain nearly identical 12-line boilerplate for safely reading and parsing a GitHub Actions workflow YAML file:
extractPermissionsFromYAMLFile—pkg/workflow/call_workflow_permissions.go:87extractSecretsFromWorkflowFile—pkg/workflow/call_workflow_secrets.go:47extractWorkflowCallInputs—pkg/workflow/call_workflow_validation.go:183extractWorkflowDispatchInputs—pkg/workflow/dispatch_workflow_validation.go:143Duplicated pattern in each (identical across all 4 functions):
Recommendation: Extract a shared helper in
pkg/workflow/yaml.go:Each caller then becomes a one-liner delegate.
Estimated effort: 1–2 hours
Impact: Eliminates ~48 lines of duplicated security-sensitive boilerplate; single source of truth for path validation
2.
mergeUniqueandexcludeFromSliceBelong insliceutilPriority: Medium
File:
pkg/workflow/runtime_definitions.go:248–286Two general-purpose string slice utilities are defined inside the
workflowpackage, making them inaccessible to other packages:mergeUnique(base []string, extra ...string) []string— deduplicated merge, used inimports.go,runtime_definitions.go,compiler_safe_outputs_config.go,safe_outputs_config_generation.goexcludeFromSlice(base []string, exclude ...string) []string— filters a slice by exclusion set, used insafe_outputs_config_generation.go,compiler_safe_outputs_config.goThese parallel
sliceutil.Filterandsliceutil.Deduplicatebut are inaccessible outsideworkflow.Recommendation: Move both to
pkg/sliceutil/sliceutil.gowith generic type parameters:Estimated effort: 2–3 hours (including updating 6 call sites)
Impact: Centralizes general-purpose slice utilities; enables reuse across
cliandparserpackages3.
audit_diff_render.go— 16 Paired Format-Dispatch FunctionsPriority: Low
File:
pkg/cli/audit_diff_render.go(~815 lines)The file contains 8 paired
render*MarkdownSection/render*PrettySectionfunction pairs for each diff data type. Adding a new diff section requires two functions, and the dispatch logic is duplicated at both the top-level and per-section level.Pairs found:
FirewallDiff,MCPToolsDiff,RunMetricsDiff,TokenUsageDiff,ToolCallsDiff,BashCommandsDiff,GitHubRateLimitDiff, and the top-levelAuditDiff.Recommendation: Introduce a
DiffSectionRendererinterface withRenderMarkdown()/RenderPretty()methods, or use a format enum parameter. This is a longer-term structural improvement.Estimated effort: 4–6 hours
Impact: Reduces file size by ~30%; makes adding new diff sections simpler
4.
SanitizeNameBase Function Could Move tostringutilPriority: Low
File:
pkg/workflow/strings.go:140SanitizeName(name string, opts *SanitizeOptions) stringis a general character-replacement sanitizer driven by a config struct, with no workflow-specific dependencies. It belongs alongsideSanitizeIdentifierName,SanitizeParameterName, etc. inpkg/stringutil/sanitize.go.Workflow-specific wrappers (
SanitizeWorkflowName,SanitizeArtifactIdentifier) would remain inpkg/workflow/strings.go.Estimated effort: 1–2 hours
Impact: Unifies the sanitization API surface; promotes reuse
What Was Examined and Found Correctly Organized
View correctly-organized patterns (not refactoring candidates)
*_wasm.gobuild-tag pairs: All//go:build js || wasmstubs are intentional Go cross-compilation practice.update_entity_helpers.go+ specific files: Generic base + typed thin wrappers — well-structured delegation.SanitizeWorkflowNamevsNormalizeWorkflowName: Distinct purposes (path sanitization vs. extension stripping), well-documented instrings.go:15–42.validateNoDuplicateCacheIDs: Already delegates to the genericvalidateNoDuplicateIDs[T]— optimally structured.strict_mode_*.gocluster: Six files covering distinct validation domains. Well-partitioned.runtime_*.gocluster: Seven files with clear single-responsibility splits.call_workflow_permissions.govscall_workflow_secrets.go: Parallel structure is intentional (same file-priority resolution, different data extraction).codemod_*.gofiles: 35+ files, one per workflow migration — correct design.Implementation Checklist
readWorkflowYAMLhelper intopkg/workflow/yaml.go; update 4 callersmergeUnique+excludeFromSlicetopkg/sliceutil/sliceutil.goas generics; update 6 call sitesaudit_diff_render.goSanitizeName+SanitizeOptionstopkg/stringutil/sanitize.goReferences: