Harden go-gh remote fetch callsites with escaped contents paths/refs and bounded REST clients#41036
Conversation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR hardens go-gh REST callsites in gh-aw by centralizing Contents API endpoint construction (with escaped refs) and standardizing REST client creation to include a default timeout, reducing failures on edge-case refs and slow/hung networks.
Changes:
- Added a shared
buildContentsAPIPath()helper and switched multiple Contents API callsites to use it. - Updated
createRESTClientForHost()and select CLI callsites to construct REST clients withconstants.DefaultHTTPClientTimeout. - Added unit tests for the new helpers and client option builders.
Show a summary per file
| File | Description |
|---|---|
| pkg/parser/remote_fetch.go | Centralizes Contents API endpoint building and uses a timeout-bounded REST client helper. |
| pkg/parser/remote_fetch_test.go | Adds unit coverage for ref escaping in Contents API paths. |
| pkg/cli/update_check.go | Uses a shared options helper for github.com REST client creation, including timeout. |
| pkg/cli/update_check_test.go | Tests the new github.com REST client options helper. |
| pkg/cli/secret_set_command.go | Routes REST client creation through a shared options helper with timeout (and host override). |
| pkg/cli/secret_set_command_test.go | Adds tests for the secret-set REST client options helper. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 6/6 changed files
- Comments generated: 3
| func buildContentsAPIPath(owner, repo, path, ref string) string { | ||
| return fmt.Sprintf("repos/%s/%s/contents/%s?ref=%s", owner, repo, path, url.QueryEscape(ref)) | ||
| } |
| func secretSetClientOptions(apiBase string) api.ClientOptions { | ||
| opts := api.ClientOptions{ | ||
| Timeout: constants.DefaultHTTPClientTimeout, | ||
| } | ||
| if apiBase != "" { | ||
| opts.Host = strings.TrimPrefix(strings.TrimPrefix(apiBase, "https://"), "http://") | ||
| } | ||
| return opts | ||
| } |
| func TestSecretSetClientOptions(t *testing.T) { | ||
| t.Run("defaults include timeout", func(t *testing.T) { | ||
| opts := secretSetClientOptions("") | ||
| if opts.Host != "" { | ||
| t.Fatalf("expected empty host, got %q", opts.Host) | ||
| } | ||
| if opts.Timeout != constants.DefaultHTTPClientTimeout { | ||
| t.Fatalf("expected timeout %s, got %s", constants.DefaultHTTPClientTimeout, opts.Timeout) | ||
| } | ||
| }) | ||
|
|
||
| t.Run("normalizes host when api-url is provided", func(t *testing.T) { | ||
| opts := secretSetClientOptions("https://ghe.example.com") | ||
| if opts.Host != "ghe.example.com" { | ||
| t.Fatalf("expected host ghe.example.com, got %q", opts.Host) | ||
| } | ||
| if opts.Timeout != constants.DefaultHTTPClientTimeout { | ||
| t.Fatalf("expected timeout %s, got %s", constants.DefaultHTTPClientTimeout, opts.Timeout) | ||
| } | ||
| }) | ||
| } |
|
✅ Test Quality Sentinel completed test quality analysis. |
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #41036 does not have the 'implementation' label and has 85 new lines (<=100 threshold) in business logic directories. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
🧪 Test Quality Sentinel Report✅ Test Quality Score: 90/100 — Excellent
📊 Metrics & Test Classification (3 tests analyzed)
Go: 3 (
|
There was a problem hiding this comment.
✅ Test Quality Sentinel: 90/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All three new tests verify observable behavioral contracts. One minor note: TestGitHubDotComRESTClientOptions uses bare assert.Equal calls without message arguments.
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /improve-codebase-architecture, /diagnose, /tdd, and /zoom-out — requesting changes on two correctness issues.
📋 Key Themes & Highlights
Blocking Issues
secretSetClientOptionshost extraction is fragile (secret_set_command.go:128) —strings.TrimPrefixretains any path component from the URL, which would silently misdirect secret writes on GHE instances using(ghe.example.com/redacted) Useurl.Parse` instead.- Missing test coverage for the above (
secret_set_command_test.go) — Neither the `(redacted) branch nor a URL-with-path case is tested, so the bug has no regression guard.
Non-blocking Observations
- Path segments unescaped in
buildContentsAPIPath(remote_fetch.go:962) — Onlyrefis escaped;pathis passed raw, inconsistent withfetchPublicGitHubContentsAPI. Edge case today but worth aligning. http.DefaultClientstill unbounded (remote_fetch.go) —fetchPublicGitHubContentsAPIandresolveRefToSHAViaPublicAPIusehttp.DefaultClient(no timeout), which are outside this PR's diff but directly undercut the "bounded REST clients" claim. Worth a follow-up or a scope note in the PR description.
Positive Highlights
- ✅ The
buildContentsAPIPathhelper cleanly consolidates five duplicatedfmt.Sprintfpaths and fixes the primaryurl.QueryEscape(ref)bug - ✅ The four inlined host-branching blocks collapsed into
createRESTClientForHostis a net −28 line improvement with no behaviour change - ✅ Each new helper comes with focused unit tests —
TestBuildContentsAPIPath,TestGitHubDotComRESTClientOptions, andTestSecretSetClientOptionsare well-structured - ✅ PR description is exceptionally clear, listing all three reliability gaps and mapping changes to each one
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 74.8 AIC · ⌖ 8.34 AIC · ⊞ 6.5K
| } | ||
|
|
||
| func buildContentsAPIPath(owner, repo, path, ref string) string { | ||
| return fmt.Sprintf("repos/%s/%s/contents/%s?ref=%s", owner, repo, path, url.QueryEscape(ref)) |
There was a problem hiding this comment.
[/improve-codebase-architecture] buildContentsAPIPath escapes ref (fixing the stated bug ✅) but leaves the path segment unencoded — creating an inconsistency with fetchPublicGitHubContentsAPI, which encodes each path segment via url.PathEscape.
💡 Suggestion: align path encoding with fetchPublicGitHubContentsAPI
File names containing %, #, or spaces are rare on GitHub, but the two helpers now use different encoding strategies for the same endpoint. Aligning them removes a latent gap:
func buildContentsAPIPath(owner, repo, path, ref string) string {
segments := strings.Split(path, "/")
for i, s := range segments {
segments[i] = url.PathEscape(s)
}
return fmt.Sprintf("repos/%s/%s/contents/%s?ref=%s",
owner, repo, strings.Join(segments, "/"), url.QueryEscape(ref))
}A test case like buildContentsAPIPath("o", "r", "dir/file with spaces.md", "main") would also serve as a specification even if the fix is deferred.
| Timeout: constants.DefaultHTTPClientTimeout, | ||
| } | ||
| if apiBase != "" { | ||
| opts.Host = strings.TrimPrefix(strings.TrimPrefix(apiBase, "https://"), "http://") |
There was a problem hiding this comment.
[/diagnose] strings.TrimPrefix strips the scheme but silently retains any path component. A URL like (ghe.example.com/redacted) would set Host = "ghe.example.com/api/v3"` — likely sending secrets to the wrong host without an error.
💡 Use url.Parse for robust extraction
if apiBase != "" {
u, err := url.Parse(apiBase)
if err != nil || u.Host == "" {
// fall back to the raw string to preserve existing behaviour
opts.Host = strings.TrimPrefix(strings.TrimPrefix(apiBase, "https://"), "(redacted)
} else {
opts.Host = u.Host
}
}url.Parse cleanly separates host from scheme and path, which is how go-gh expects the Host field to be populated. The fallback preserves backward compatibility for callers that already pass a bare hostname.
| } | ||
| if opts.Timeout != constants.DefaultHTTPClientTimeout { | ||
| t.Fatalf("expected timeout %s, got %s", constants.DefaultHTTPClientTimeout, opts.Timeout) | ||
| } |
There was a problem hiding this comment.
[/tdd] The (redacted) scheme branch in secretSetClientOptionsis untested, and there is no test for a URL that includes a path component (e.g.(ghe.example.com/redacted), which is the exact shape that exposes the TrimPrefix host-extraction bug described on line 128.
💡 Suggested additional test cases
t.Run("strips http scheme", func(t *testing.T) {
opts := secretSetClientOptions("(ghe.example.com/redacted)
if opts.Host != "ghe.example.com" {
t.Fatalf("expected host ghe.example.com, got %q", opts.Host)
}
})
t.Run("URL with path component", func(t *testing.T) {
// (ghe.example.com/redacted) should yield host=ghe.example.com
opts := secretSetClientOptions("(ghe.example.com/redacted)
if opts.Host != "ghe.example.com" {
t.Fatalf("expected host ghe.example.com, got %q", opts.Host)
}
})The second case currently fails with Host = "ghe.example.com/api/v3", making it a useful regression guard for the fix suggested above.
| opts.Host = host | ||
| } | ||
| return api.DefaultRESTClient() | ||
| return api.NewRESTClient(opts) |
There was a problem hiding this comment.
[/zoom-out] The PR's headline goal is "bounded REST clients", but fetchPublicGitHubContentsAPI (used as a last-resort fallback throughout this file) still calls http.DefaultClient.Do(req) — an unbounded client. Two unauthenticated fallback paths are therefore still at risk of hanging indefinitely on slow or stuck networks.
💡 In scope or follow-up?
These callsites (fetchPublicGitHubContentsAPI and resolveRefToSHAViaPublicAPI) are not touched by this PR, so this is a pre-existing gap. It would be worth tracking as a follow-up, or noting explicitly in the PR description that the public-API fallback paths are intentionally out of scope.
A minimal fix would be to replace http.DefaultClient with a package-level client that includes constants.DefaultHTTPClientTimeout:
var defaultHTTPClient = &http.Client{Timeout: constants.DefaultHTTPClientTimeout}There was a problem hiding this comment.
REQUEST_CHANGES — Two high-severity correctness issues must be fixed before merge.
🔴 Blocking issues
1. buildContentsAPIPath leaves the path segment unescaped
url.QueryEscape is applied only to ref; the path argument is interpolated raw via %s. The existing fetchPublicGitHubContentsAPI has an explicit comment explaining why per-segment url.PathEscape is required for nested paths. The new helper regresses that — paths with spaces or reserved characters will produce malformed API requests across every refactored callsite. See inline comment on remote_fetch.go.
2. secretSetClientOptions produces malformed hosts for GitHub Enterprise
strings.TrimPrefix strips only the scheme; a URL like (ghe.example.com/redacted) (the canonical $GITHUB_API_URLshape for GHE) producesHost: "ghe.example.com/api/v3", which go-ghwill not normalise further. The resulting API base URL will be invalid, silently breaking secret-set operations against any GHE instance. See inline comment onsecret_set_command.go`.
i️ Additional finding (non-blocking)
TestBuildContentsAPIPath uses only URL-safe paths, so the test suite provides zero coverage for the path-encoding bug — see inline comment on remote_fetch_test.go.
Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
proxy.golang.org
To allow these domains, add them to the
network.allowedlist in your workflow frontmatter:
network:
allowed:
- defaults
- "proxy.golang.org"See Network Configuration for more information.
🔎 Code quality review by PR Code Quality Reviewer · 96 AIC · ⌖ 7.63 AIC · ⊞ 5.1K
|
|
||
| func TestBuildContentsAPIPath(t *testing.T) { | ||
| t.Run("escapes refs with reserved query chars", func(t *testing.T) { | ||
| got := buildContentsAPIPath("owner", "repo", ".github/workflows/demo.md", "release+candidate#1") |
There was a problem hiding this comment.
The path argument in both test cases is .github/workflows/demo.md — a string with zero URL-reserved characters — so the tests provide no coverage for path-segment encoding.
💡 Why this matters / suggested addition
The companion review comment on buildContentsAPIPath notes that path is interpolated raw while ref is url.QueryEscaped. These tests would pass identically whether path is escaped or not, leaving that gap completely invisible in CI.
A minimal failing case that documents the gap:
t.Run("does not escape path segments (known gap)", func(t *testing.T) {
got := buildContentsAPIPath("owner", "repo", "path with spaces/file.md", "main")
// Currently: "repos/owner/repo/contents/path with spaces/file.md?ref=main"
// Expected after fix: "repos/owner/repo/contents/path%20with%20spaces/file.md?ref=main"
want := "repos/owner/repo/contents/path%20with%20spaces/file.md?ref=main"
if got != want {
t.Fatalf("buildContentsAPIPath() = %q, want %q", got, want)
}
})Adding this (or a similar case) makes the path-escaping regression test-visible and gives reviewers a concrete failing test to point to.
|
@copilot run pr-finisher skill |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Completed a pr-finisher pass for PR #41036. Addressed the blocking review feedback in commits |
go-ghusage ingh-awhad three reliability gaps: unsafe Contents API path construction, duplicated host-aware REST client creation, and unbounded REST clients in key callsites. This change applies a focused mechanical cleanup to make request construction and client behavior consistent and safer under edge-case refs/paths and slow or hung networks.Contents API path correctness
buildContentsAPIPath(owner, repo, path, ref)helper inpkg/parser/remote_fetch.go.url.PathEscape(per segment)refquery values viaurl.QueryEscapeRemote fetch client deduplication
createRESTClientForHost(host)at previously inlined callsites inremote_fetch.go(ListWorkflowFiles*,ListDirAllFiles*, recursive listing, and subdir listing).Bounded REST clients
createRESTClientForHostto setapi.ClientOptions.Timeout = constants.DefaultHTTPClientTimeout.pkg/cli/update_check.goto construct its github.com REST client via a shared options helper that includes timeout.pkg/cli/secret_set_command.goto build client options through a helper that preserves host override behavior (--api-url), strips API path components (for example/api/v3), normalizesapi.github.comtogithub.com, and sets timeout.Focused unit coverage
https://ghe.example.com/api/v3https://api.github.com→github.com