[refactor] Semantic function clustering: Antigravity/Gemini engine clone + scattered indentation helpers #36552

@github-actions

🔧 Semantic Function Clustering Analysis

Analysis of repository: github/gh-aw, non-test Go sources under pkg/ (879 source files).

This run clustered functions by file/naming/purpose and looked for outliers (functions in the wrong file), duplicate implementations, and scattered helpers. The headline is good news: the codebase is exceptionally well-organized. It follows a strict feature-per-file convention (compiler_*, safe_outputs_*, mcp_*, expression_*, frontmatter_*, *_validation.go, create_*/update_*/close_*), the create/update/close entity parsers are already deduplicated through generics + registry patterns, and shared utilities live in dedicated packages (stringutil, sliceutil, fileutil, ...) that the rest of the code actually uses (slice-contains reimplementations were found only in test files).

The findings below are therefore relatively narrow. The one high-impact item is a near-verbatim duplicated engine.

Summary

| Metric | Value |
| --- | --- |
| Source files scanned (pkg/**, excl. _test.go) | 879 |
| Dominant packages | pkg/workflow (395), pkg/cli (310) |
| Confirmed duplicate clusters | 1 high-impact + 2 minor |
| Outliers / redundant indirection | 2 (low) |
| Overall organization | ✅ Strong (feature-per-file, generics-based dedup already applied) |

Critical Finding

1. The Antigravity engine is a near-verbatim clone of the Gemini engine (High impact)

All four Antigravity engine files are near-verbatim copies of their Gemini counterparts, differing only in identifiers (Gemini→Antigravity) and a couple of string literals. Line counts match exactly or differ by only a few lines:

| File pair | Lines (antigravity / gemini) | Difference |
| --- | --- | --- |
| *_logs.go | 107 / 107 | Type/receiver names + script-id literal only |
| *_mcp.go | 17 / 17 | Type/receiver/log names only |
| *_tools.go | 184 / 185 | Type names + settings-dir literals only |
| *_engine.go | 347 / 353 | Type names + a few literals |

Evidence: ParseLogMetrics is identical logic (antigravity_logs.go vs gemini_logs.go):

// gemini_logs.go:21 and antigravity_logs.go:21: identical except the type name
func (e *GeminiEngine) ParseLogMetrics(logContent string, verbose bool) LogMetrics {
    metrics := LogMetrics{Turns: 0, TokenUsage: 0, ToolCalls: []ToolCallInfo{}}
    toolCallCounts := make(map[string]int)
    lines := strings.SplitSeq(logContent, "\n")
    for line := range lines {
        // ... identical body: parse JSON line, sum input/output tokens from
        //     stats["models"], aggregate stats["tools"] into toolCallCounts ...
    }
    // ... identical tail ...
}

RenderMCPConfig is also identical β€” both delegate to renderDefaultJSONMCPConfig(...) with the same arguments. GeminiResponse / AntigravityResponse are identical structs, and computeAntigravityToolsCore even maps to the same Gemini-CLI built-in tool names (run_shell_command, replace, write_file, glob, grep_search, list_directory, read_file, read_many_files).

Recommendation: Both are "single-JSON-response CLI" engines, so factor the shared behavior into one place while preserving per-engine extension points (the engines may legitimately diverge later):

  • Extract the log parser into a shared helper, e.g. parseSingleJSONResponseLogMetrics(logContent string, verbose bool, log *logger.Logger) LogMetrics, called by both ParseLogMetrics methods.
  • Share the MCP renderer (already trivial: both just call renderDefaultJSONMCPConfig).
  • Parameterize the tools-core mapping / settings-step generator by (settingsDir, convertScript) instead of duplicating ~185 lines.
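
The first bullet can be sketched as follows. This is a minimal illustration, not the repository's code: the LogMetrics and ToolCallInfo shapes are simplified stand-ins, the JSON field names (input_tokens, output_tokens, integer counts under tools) and the one-turn-per-response rule are assumptions, and the logger parameter from the proposed signature is omitted to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// Simplified stand-ins for the real pkg/workflow types (assumed shapes).
type ToolCallInfo struct {
	Name      string
	CallCount int
}

type LogMetrics struct {
	Turns      int
	TokenUsage int
	ToolCalls  []ToolCallInfo
}

// cliResponse models one JSON log line. All field names are hypothetical.
type cliResponse struct {
	Stats struct {
		Models map[string]struct {
			InputTokens  int `json:"input_tokens"`
			OutputTokens int `json:"output_tokens"`
		} `json:"models"`
		Tools map[string]int `json:"tools"`
	} `json:"stats"`
}

// parseSingleJSONResponseLogMetrics holds the logic both engines would share.
func parseSingleJSONResponseLogMetrics(logContent string, verbose bool) LogMetrics {
	metrics := LogMetrics{ToolCalls: []ToolCallInfo{}}
	toolCallCounts := make(map[string]int)
	for _, line := range strings.Split(logContent, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "{") {
			continue // skip non-JSON log noise
		}
		var resp cliResponse
		if err := json.Unmarshal([]byte(line), &resp); err != nil {
			continue // skip malformed lines
		}
		metrics.Turns++ // assumption: one JSON response counts as one turn
		for _, m := range resp.Stats.Models {
			metrics.TokenUsage += m.InputTokens + m.OutputTokens
		}
		for name, n := range resp.Stats.Tools {
			toolCallCounts[name] += n
		}
	}
	for name, n := range toolCallCounts {
		metrics.ToolCalls = append(metrics.ToolCalls, ToolCallInfo{Name: name, CallCount: n})
	}
	// Sort by name so the result does not depend on map iteration order.
	sort.Slice(metrics.ToolCalls, func(i, j int) bool {
		return metrics.ToolCalls[i].Name < metrics.ToolCalls[j].Name
	})
	return metrics
}

type GeminiEngine struct{}
type AntigravityEngine struct{}

// Both methods stay as per-engine extension points but delegate today.
func (e *GeminiEngine) ParseLogMetrics(logContent string, verbose bool) LogMetrics {
	return parseSingleJSONResponseLogMetrics(logContent, verbose)
}

func (e *AntigravityEngine) ParseLogMetrics(logContent string, verbose bool) LogMetrics {
	return parseSingleJSONResponseLogMetrics(logContent, verbose)
}

func main() {
	log := `{"stats":{"models":{"m":{"input_tokens":3,"output_tokens":4}},"tools":{"read_file":2}}}`
	fmt.Printf("%+v\n", (&GeminiEngine{}).ParseLogMetrics(log, false))
	fmt.Printf("%+v\n", (&AntigravityEngine{}).ParseLogMetrics(log, false))
}
```

Keeping the two methods as thin wrappers means either engine can later replace its body without touching the other.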

Caveat / honest nuance: this duplication may be intentional scaffolding so the two CLIs can diverge independently. If divergence is expected soon, a lighter touch (sharing only ParseLogMetrics + the tool-name table) is safer than a full base type. Worth a maintainer decision rather than a blind merge.
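
The lighter-touch option might look roughly like this: one shared tool-name table plus a small parameter struct carrying the per-engine literals. The tool names are the ones listed above; everything else (cliEngineParams, settingsStepLines, the placeholder directory and script values, and the generated command lines) is hypothetical and only illustrates the parameterization.

```go
package main

import "fmt"

// sharedBuiltinTools is the built-in tool list both engines reportedly map to.
var sharedBuiltinTools = []string{
	"run_shell_command", "replace", "write_file", "glob",
	"grep_search", "list_directory", "read_file", "read_many_files",
}

// cliEngineParams carries the literals that differ per engine (hypothetical type).
type cliEngineParams struct {
	SettingsDir   string
	ConvertScript string
}

// Placeholder values; the real literals live in the engine files.
var (
	geminiParams      = cliEngineParams{SettingsDir: "gemini-settings-dir", ConvertScript: "gemini-convert-script"}
	antigravityParams = cliEngineParams{SettingsDir: "antigravity-settings-dir", ConvertScript: "antigravity-convert-script"}
)

// settingsStepLines stands in for the duplicated settings-step generator.
// The command lines are illustrative, not the commands the engines emit.
func settingsStepLines(p cliEngineParams) []string {
	return []string{
		fmt.Sprintf("mkdir -p %q", p.SettingsDir),
		fmt.Sprintf("run-convert-script %q", p.ConvertScript),
	}
}

func main() {
	fmt.Println(settingsStepLines(geminiParams))
	fmt.Println(settingsStepLines(antigravityParams))
	fmt.Println(len(sharedBuiltinTools), "shared built-in tools")
}
```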

Minor Findings

2. Scattered indentation helpers across cli / parser / workflow (low–medium)

Leading-whitespace extraction/application is reimplemented in several packages instead of living in stringutil:

| Location | Function | Purpose |
| --- | --- | --- |
| pkg/cli/yaml_frontmatter_utils.go:48 | getIndentation(line string) string | leading whitespace (as string) |
| pkg/parser/frontmatter_hash.go:408 | indentationOf(line string) int | leading whitespace (as length) |
| pkg/cli/codemod_network_firewall.go:256 | indentLines(lines []string, indent string) []string | prefix each line |
| pkg/workflow/frontmatter_extraction_yaml.go:18 | (c *Compiler) indentYAMLLines(...) | prefix YAML lines |
| pkg/cli/codemod_run_install_scripts.go:167 | detectFrontmatterIndent(lines []string) string | detect indent unit |

getIndentation and indentationOf are the same computation expressed two ways:

// pkg/cli/yaml_frontmatter_utils.go
func getIndentation(line string) string {
    return line[:len(line)-len(strings.TrimLeft(line, " \t"))]
}
// pkg/parser/frontmatter_hash.go
func indentationOf(line string) int {
    return len(line) - len(strings.TrimLeft(line, " \t"))
}

Recommendation: add stringutil.LeadingWhitespace(line string) string (and have callers take len(...) when they need the int form) plus a shared stringutil.IndentLines(lines, indent); route the five call sites through them. Low risk, improves discoverability.
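
A minimal sketch of the two proposed helpers, shown here in package main so it runs standalone. LeadingWhitespace reuses the existing getIndentation expression. IndentLines is assumed to prefix every line, including empty ones, which may differ from what the existing indentLines / indentYAMLLines do.

```go
package main

import (
	"fmt"
	"strings"
)

// LeadingWhitespace returns the run of spaces and tabs at the start of line.
func LeadingWhitespace(line string) string {
	return line[:len(line)-len(strings.TrimLeft(line, " \t"))]
}

// IndentLines returns a new slice with indent prepended to each line.
// Assumption: empty lines are prefixed too.
func IndentLines(lines []string, indent string) []string {
	out := make([]string, len(lines))
	for i, l := range lines {
		out[i] = indent + l
	}
	return out
}

func main() {
	line := "  \tkey: value"
	fmt.Printf("%q\n", LeadingWhitespace(line)) // prints "  \t"
	fmt.Println(len(LeadingWhitespace(line)))   // prints 3 (the indentationOf form)
	fmt.Printf("%q\n", IndentLines([]string{"a", "b"}, "  "))
}
```

Callers that need the integer form take len(...) of the result, so only one implementation of the whitespace scan remains.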

3. uniqueSorted reimplements sliceutil.Deduplicate + sort, and is misplaced (low)

pkg/workflow/central_slash_command_workflow.go:487:

func uniqueSorted(values []string) []string {
    seen := make(map[string]bool, len(values))
    for _, v := range values { seen[v] = true }
    result := make([]string, 0, len(seen))
    for v := range seen { result = append(result, v) }
    sort.Strings(result)
    return result
}

This is a generic slice utility living inside a domain file. pkg/sliceutil already exposes Deduplicate[T comparable], so the body reduces to:

out := sliceutil.Deduplicate(values)
sort.Strings(out)
return out

Recommendation: either inline via sliceutil.Deduplicate at the call site, or add sliceutil.SortedUnique(values) and move it out of the slash-command file (outlier → utility package).
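
If the second option is chosen, a generic SortedUnique could be sketched like this. It uses only the standard library (slices.Sort and slices.Compact, Go 1.21+) instead of sliceutil.Deduplicate, because sliceutil's implementation is not shown here. The name and signature are the hypothetical ones from the recommendation.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// SortedUnique returns the distinct values in ascending order.
// It copies the input first, so the caller's slice is left untouched.
func SortedUnique[T cmp.Ordered](values []T) []T {
	out := make([]T, len(values))
	copy(out, values)
	slices.Sort(out)
	// Compact drops consecutive duplicates, which after sorting means all duplicates.
	return slices.Compact(out)
}

func main() {
	in := []string{"b", "a", "b", "c", "a"}
	fmt.Println(SortedUnique(in)) // prints [a b c]
	fmt.Println(in)               // prints [b a b c a]
}
```

Like the current uniqueSorted, this returns a non-nil empty slice for empty input.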

4. Redundant wrapper: escapeSingleQuotedYAMLString (low)

pkg/workflow/central_slash_command_workflow.go:500:

func escapeSingleQuotedYAMLString(input string) string {
    return escapeYAMLSingleQuoted(input)
}

A pass-through wrapper that only forwards to escapeYAMLSingleQuoted. Unless it exists for a naming-compat reason, inline the single delegate and drop the indirection.

Positive Observations (what's already done well)

  • Feature-per-file is consistently applied. The 395 files in pkg/workflow are cleanly themed (compiler_*, safe_outputs_*, mcp_*, expression_*, frontmatter_*, engine quadruples *_engine/_mcp/_logs/_tools).
  • Entity parsing is already DRY. create_entity_helpers.go, update_entity_helpers.go, and close_entity_helpers.go use generics (parseCreateEntityConfig[T], parseUpdateEntityConfigTyped[T]) and a registry (closeEntityRegistry) rather than copy-paste, and close_entity_helpers.go even documents why it groups vs. splits.
  • Utility packages are real and used. sliceutil (Filter, Map, Deduplicate, Any, MergeUnique, Exclude), stringutil, fileutil provide the common helpers; ad-hoc slice-contains reimplementations appear only in _test.go files.

Recommendations (prioritized)

  1. Decide on the Gemini/Antigravity duplication. Highest impact (~650 duplicated lines). Either consolidate behind shared helpers/base, or document that the duplication is deliberate for future divergence. Est. effort: 3–5h.
  2. Centralize indentation helpers into stringutil and route the 5 call sites through them. Est. effort: 1–2h.
  3. Replace uniqueSorted with sliceutil and remove the escapeSingleQuotedYAMLString pass-through. Est. effort: <1h.

Implementation Checklist

  • Maintainer decision on Gemini ↔ Antigravity engine consolidation
  • Extract parseSingleJSONResponseLogMetrics shared log parser (if consolidating)
  • Parameterize tools-core/settings-step generation by settings dir + convert script
  • Add stringutil.LeadingWhitespace / IndentLines; migrate getIndentation, indentationOf, indentLines, indentYAMLLines
  • Replace uniqueSorted with sliceutil.Deduplicate + sort; relocate out of slash-command file
  • Inline/remove escapeSingleQuotedYAMLString
  • Run go test ./... after each change

Analysis Metadata

  • Source files scanned: 879 (pkg/**, excluding _test.go)
  • Detection method: file/naming/purpose clustering + targeted body comparison via LSP-style search and direct reads
  • Confirmed duplicate clusters: 1 high-impact (Gemini/Antigravity engine), 2 minor (indentation helpers, uniqueSorted)
  • Outliers / redundant indirection: 2 low-severity
  • Analysis date: 2026-06-03

References: §26855728637

Generated by 🔧 Semantic Function Refactoring · opus48 2.5M · ◷

  • expires on Jun 5, 2026, 12:25 AM UTC
