Skip to content

[refactor] Semantic function clustering analysis — 4 refactoring opportunities #30911

@github-actions

Description

@github-actions

Executive Summary

Semantic function clustering analysis of the github/gh-aw repository identified 765 non-test Go source files across 23 packages containing 3,761 exported and unexported functions. The codebase is largely well-organized with clear feature-based file naming (e.g., strict_mode_*.go, runtime_*.go, safe_outputs_*.go), intentional WASM build-tag pairs (*_wasm.go), and appropriate use of generics in validation helpers. Four actionable refactoring opportunities were found.


Analysis Metadata

  • Total Go Files Analyzed: 765 (non-test, pkg/ only)
  • Total Functions Cataloged: 3,761
  • Packages: 23 (workflow, cli, parser, stringutil, sliceutil, typeutil, agentdrain, and 16 more)
  • Function Clusters Identified: 12 named clusters (runtime_*, strict_mode_*, safe_outputs_*, audit_*, call_workflow_*, codemod_*, checkout_*, update_*, parse_*, validate_*, render_*, sanitize_*)
  • Outliers Found: 0 (functions are in appropriately named files)
  • Duplicates Detected: 4 groups (detailed below)
  • Detection Method: Naming pattern analysis + AST-level code comparison
  • Analysis Date: 2026-05-08

Identified Issues

1. Repeated YAML-Reading Boilerplate Across 4 Functions

Priority: High
Files: call_workflow_permissions.go, call_workflow_secrets.go, call_workflow_validation.go, dispatch_workflow_validation.go

Four separate functions contain nearly identical 12-line boilerplate for safely reading and parsing a GitHub Actions workflow YAML file:

  • extractPermissionsFromYAMLFilepkg/workflow/call_workflow_permissions.go:87
  • extractSecretsFromWorkflowFilepkg/workflow/call_workflow_secrets.go:47
  • extractWorkflowCallInputspkg/workflow/call_workflow_validation.go:183
  • extractWorkflowDispatchInputspkg/workflow/dispatch_workflow_validation.go:143

Duplicated pattern in each (identical across all 4 functions):

cleanPath := filepath.Clean(workflowPath)
if !filepath.IsAbs(cleanPath) {
    return nil, fmt.Errorf("workflow path must be absolute: %s", workflowPath)
}
content, err := os.ReadFile(cleanPath) // #nosec G304 -- Path is sanitized above
if err != nil {
    return nil, fmt.Errorf("failed to read workflow file %s: %w", workflowPath, err)
}
var workflow map[string]any
if err := yaml.Unmarshal(content, &workflow); err != nil {
    return nil, fmt.Errorf("failed to parse workflow file %s: %w", workflowPath, err)
}

Recommendation: Extract a shared helper in pkg/workflow/yaml.go:

// readWorkflowYAML reads and parses an absolute path YAML workflow file.
func readWorkflowYAML(path string) (map[string]any, error) {
    cleanPath := filepath.Clean(path)
    if !filepath.IsAbs(cleanPath) {
        return nil, fmt.Errorf("workflow path must be absolute: %s", path)
    }
    content, err := os.ReadFile(cleanPath) // #nosec G304 -- Path is validated above
    if err != nil {
        return nil, fmt.Errorf("failed to read workflow file %s: %w", path, err)
    }
    var workflow map[string]any
    if err := yaml.Unmarshal(content, &workflow); err != nil {
        return nil, fmt.Errorf("failed to parse workflow file %s: %w", path, err)
    }
    return workflow, nil
}

Each caller then becomes a one-liner delegate.

Estimated effort: 1–2 hours
Impact: Eliminates ~48 lines of duplicated security-sensitive boilerplate; single source of truth for path validation


2. mergeUnique and excludeFromSlice Belong in sliceutil

Priority: Medium
File: pkg/workflow/runtime_definitions.go:248–286

Two general-purpose string slice utilities are defined inside the workflow package, making them inaccessible to other packages:

  • mergeUnique(base []string, extra ...string) []string — deduplicated merge, used in imports.go, runtime_definitions.go, compiler_safe_outputs_config.go, safe_outputs_config_generation.go
  • excludeFromSlice(base []string, exclude ...string) []string — filters a slice by exclusion set, used in safe_outputs_config_generation.go, compiler_safe_outputs_config.go

These parallel sliceutil.Filter and sliceutil.Deduplicate but are inaccessible outside workflow.

Recommendation: Move both to pkg/sliceutil/sliceutil.go with generic type parameters:

func MergeUnique[T comparable](base []T, extra ...T) []T { ... }
func Exclude[T comparable](base []T, exclude ...T) []T { ... }

Estimated effort: 2–3 hours (including updating 6 call sites)
Impact: Centralizes general-purpose slice utilities; enables reuse across cli and parser packages


3. audit_diff_render.go — 16 Paired Format-Dispatch Functions

Priority: Low
File: pkg/cli/audit_diff_render.go (~815 lines)

The file contains 8 paired render*MarkdownSection/render*PrettySection function pairs for each diff data type. Adding a new diff section requires two functions, and the dispatch logic is duplicated at both the top-level and per-section level.

Pairs found: FirewallDiff, MCPToolsDiff, RunMetricsDiff, TokenUsageDiff, ToolCallsDiff, BashCommandsDiff, GitHubRateLimitDiff, and the top-level AuditDiff.

Recommendation: Introduce a DiffSectionRenderer interface with RenderMarkdown() / RenderPretty() methods, or use a format enum parameter. This is a longer-term structural improvement.

Estimated effort: 4–6 hours
Impact: Reduces file size by ~30%; makes adding new diff sections simpler


4. SanitizeName Base Function Could Move to stringutil

Priority: Low
File: pkg/workflow/strings.go:140

SanitizeName(name string, opts *SanitizeOptions) string is a general character-replacement sanitizer driven by a config struct, with no workflow-specific dependencies. It belongs alongside SanitizeIdentifierName, SanitizeParameterName, etc. in pkg/stringutil/sanitize.go.

Workflow-specific wrappers (SanitizeWorkflowName, SanitizeArtifactIdentifier) would remain in pkg/workflow/strings.go.

Estimated effort: 1–2 hours
Impact: Unifies the sanitization API surface; promotes reuse


What Was Examined and Found Correctly Organized

View correctly-organized patterns (not refactoring candidates)
  • *_wasm.go build-tag pairs: All //go:build js || wasm stubs are intentional Go cross-compilation practice.
  • update_entity_helpers.go + specific files: Generic base + typed thin wrappers — well-structured delegation.
  • SanitizeWorkflowName vs NormalizeWorkflowName: Distinct purposes (path sanitization vs. extension stripping), well-documented in strings.go:15–42.
  • validateNoDuplicateCacheIDs: Already delegates to the generic validateNoDuplicateIDs[T] — optimally structured.
  • strict_mode_*.go cluster: Six files covering distinct validation domains. Well-partitioned.
  • runtime_*.go cluster: Seven files with clear single-responsibility splits.
  • call_workflow_permissions.go vs call_workflow_secrets.go: Parallel structure is intentional (same file-priority resolution, different data extraction).
  • codemod_*.go files: 35+ files, one per workflow migration — correct design.

Implementation Checklist

  • Issue 1 (High): Extract readWorkflowYAML helper into pkg/workflow/yaml.go; update 4 callers
  • Issue 2 (Medium): Move mergeUnique + excludeFromSlice to pkg/sliceutil/sliceutil.go as generics; update 6 call sites
  • Issue 3 (Low): Evaluate renderer interface approach for audit_diff_render.go
  • Issue 4 (Low): Move SanitizeName + SanitizeOptions to pkg/stringutil/sanitize.go

References:

Generated by Semantic Function Refactoring · ● 565.2K ·

  • expires on May 10, 2026, 12:11 AM UTC

Metadata

Metadata

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions