Skip to content

Commit 075b15c

Browse files
authored
feat(workflow): cache repository owner-type API call once per compilation run (#40258)
1 parent f33cd25 commit 075b15c

6 files changed

Lines changed: 163 additions & 7 deletions
Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,39 @@
1+
# ADR-40258: Cache Repository Owner-Type Lookup Per Compilation Run
2+
3+
**Date**: 2026-06-19
4+
**Status**: Draft
5+
6+
## Context
7+
8+
When determining whether to suppress the `copilot-requests: write` permission tip for individual-user repositories, the compiler calls `gh api /users/<owner>` ("Checking repository owner type...") to classify the owner as a `User` or `Organization`. This classification depends only on the owner login, not on the individual workflow file. A single `gh aw compile` run compiles every workflow in the repository on one shared `Compiler` instance, so a repository with N Copilot workflows previously issued N identical `/users/<owner>` round-trips — wasted latency and API quota for a value that never changes within a run. An owner-type cache (`ownerTypeCache map[string]string`, keyed by owner login) was introduced to deduplicate these lookups, and this PR finalizes that strategy by making the cache's initialization eager and unconditional.
9+
10+
## Decision
11+
12+
We will cache the repository owner-type lookup once per owner login on the `Compiler` instance, so that `repositoryOwnerIsIndividualUser` performs at most one `gh api /users/<owner>` call per owner for the lifetime of a compilation run. We will initialize `ownerTypeCache` eagerly in `NewCompiler()` — alongside the existing `actionPinWarnings` and `priorManifests` caches — rather than lazily on first use. This makes the always-initialized invariant explicit at construction time and lets `repositoryOwnerIsIndividualUser` drop its defensive nil-guard, keeping the lookup path linear and free of incidental initialization branches.
13+
14+
## Alternatives Considered
15+
16+
### Alternative 1: Lazy initialization with a nil-guard at the call site
17+
Keep `ownerTypeCache` zero-valued (nil) and initialize it on first access inside `repositoryOwnerIsIndividualUser`. This was the prior behavior. Rejected because it spreads the cache's lifecycle across two locations, forces every reader to re-assert the invariant, and is inconsistent with the sibling caches that are already initialized in `NewCompiler()`.
18+
19+
### Alternative 2: No caching — look up owner type on every call
20+
Issue `gh api /users/<owner>` each time the permission tip is evaluated. Rejected because it reintroduces N redundant identical API calls per multi-workflow compile run, adding network latency and consuming API rate limit for a value that is constant within the run.
21+
22+
## Consequences
23+
24+
### Positive
25+
- At most one owner-type API call per owner per compilation run, eliminating redundant round-trips for multi-workflow repositories.
26+
- The always-initialized cache invariant is established in one place (`NewCompiler()`), consistent with the other compiler caches.
27+
- The lookup path in `repositoryOwnerIsIndividualUser` is simpler without the nil-guard branch.
28+
29+
### Negative
30+
- The cache lives for the full lifetime of the `Compiler` instance; a long-lived or reused compiler would not observe an owner's type changing between `User` and `Organization` mid-process (an acceptable trade-off given owner type is effectively stable).
31+
- A negative result (empty string from a prior API error) is also cached, so a transient `gh api` failure suppresses retries for that owner for the rest of the run.
32+
33+
### Neutral
34+
- The cache is keyed by owner login only, so two repositories under the same owner share one entry.
35+
- Correctness of the eager-init refactor is covered by `compiler_owner_type_cache_test.go`, which exercises cache-hit, malformed-slug, cross-compilation reuse, and non-nil-map invariants.
36+
37+
---
38+
39+
*This is a DRAFT ADR generated by the [Design Decision Gate](https://github.com/github/gh-aw/actions/runs/27828047748) workflow. The PR author must review, complete, and finalize this document before the PR can merge.*
Lines changed: 116 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,116 @@
1+
//go:build !integration
2+
3+
package workflow
4+
5+
import (
6+
"testing"
7+
)
8+
9+
// TestRepositoryOwnerIsIndividualUser_CacheHit verifies that repositoryOwnerIsIndividualUser
10+
// returns the cached owner type without making a network call when the owner is already
11+
// in the cache. This ensures the owner-type check is performed at most once per repo
12+
// during a single compilation run.
13+
func TestRepositoryOwnerIsIndividualUser_CacheHit(t *testing.T) {
14+
tests := []struct {
15+
name string
16+
cachedType string
17+
expectedResult bool
18+
}{
19+
{
20+
name: "cached User type returns true",
21+
cachedType: "User",
22+
expectedResult: true,
23+
},
24+
{
25+
name: "cached Organization type returns false",
26+
cachedType: "Organization",
27+
expectedResult: false,
28+
},
29+
{
30+
name: "cached empty string (API error) returns false",
31+
cachedType: "",
32+
expectedResult: false,
33+
},
34+
}
35+
36+
for _, tt := range tests {
37+
t.Run(tt.name, func(t *testing.T) {
38+
c := NewCompiler()
39+
c.SetRepositorySlug("myowner/myrepo")
40+
41+
// Pre-populate the cache so no network call is made.
42+
// If the cache is not consulted, RunGH would be called without a real gh binary
43+
// available in unit tests, causing the function to return false even for the
44+
// "User" case — the test would then fail on that case.
45+
c.ownerTypeCache["myowner"] = tt.cachedType
46+
47+
got := c.repositoryOwnerIsIndividualUser()
48+
if got != tt.expectedResult {
49+
t.Errorf("repositoryOwnerIsIndividualUser() = %v, want %v (cached owner type %q)",
50+
got, tt.expectedResult, tt.cachedType)
51+
}
52+
})
53+
}
54+
}
55+
56+
// TestRepositoryOwnerIsIndividualUser_MalformedSlug verifies that the function
57+
// returns false immediately when repositorySlug is missing or malformed, without
58+
// consulting the cache or making a network call.
59+
func TestRepositoryOwnerIsIndividualUser_MalformedSlug(t *testing.T) {
60+
tests := []struct {
61+
name string
62+
slug string
63+
}{
64+
{name: "empty slug", slug: ""},
65+
{name: "no slash", slug: "owneronly"},
66+
{name: "trailing slash only", slug: "/"},
67+
{name: "missing owner", slug: "/repo"},
68+
{name: "missing repo", slug: "owner/"},
69+
}
70+
71+
for _, tt := range tests {
72+
t.Run(tt.name, func(t *testing.T) {
73+
c := NewCompiler()
74+
c.SetRepositorySlug(tt.slug)
75+
76+
got := c.repositoryOwnerIsIndividualUser()
77+
if got {
78+
t.Errorf("repositoryOwnerIsIndividualUser() = true for malformed slug %q, want false", tt.slug)
79+
}
80+
})
81+
}
82+
}
83+
84+
// TestRepositoryOwnerIsIndividualUser_CacheSharedAcrossCompilations verifies that
85+
// the owner-type cache persists across multiple calls on the same Compiler instance,
86+
// which represents multiple workflow files compiled in a single `gh aw compile` run.
87+
func TestRepositoryOwnerIsIndividualUser_CacheSharedAcrossCompilations(t *testing.T) {
88+
c := NewCompiler()
89+
c.SetRepositorySlug("myorg/repo-a")
90+
91+
// Seed the cache as if a previous call already resolved the owner type.
92+
c.ownerTypeCache["myorg"] = "Organization"
93+
94+
// Simulate compiling a second workflow in the same repo (different repo name, same owner).
95+
c.SetRepositorySlugIfUnlocked("myorg/repo-b")
96+
97+
// The cache entry for "myorg" must be reused — no network call is made.
98+
got := c.repositoryOwnerIsIndividualUser()
99+
if got {
100+
t.Error("repositoryOwnerIsIndividualUser() = true for Organization owner, want false")
101+
}
102+
103+
// The cache should still hold the original value (not been cleared or overwritten).
104+
if val := c.ownerTypeCache["myorg"]; val != "Organization" {
105+
t.Errorf("ownerTypeCache[myorg] = %q after second call, want %q", val, "Organization")
106+
}
107+
}
108+
109+
// TestRepositoryOwnerIsIndividualUser_CacheInitializedByNewCompiler verifies that
110+
// NewCompiler initializes ownerTypeCache so callers never encounter a nil map panic.
111+
func TestRepositoryOwnerIsIndividualUser_CacheInitializedByNewCompiler(t *testing.T) {
112+
c := NewCompiler()
113+
if c.ownerTypeCache == nil {
114+
t.Fatal("NewCompiler() left ownerTypeCache nil; expected an initialized map")
115+
}
116+
}

pkg/workflow/compiler_types.go

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -107,7 +107,7 @@ type Compiler struct {
107107
requireDocker bool // If true, fail validation when Docker is not available instead of silently skipping
108108
ghesCompatFromCLI bool // If true, GHES compat was requested via --ghes CLI flag (takes precedence over aw.json)
109109
ghesArtifactCompat bool // If true, emit GHES-compatible v3.x pins for artifact actions instead of the latest v7/v8
110-
ownerTypeCache map[string]string // Cached GitHub owner type ("User"/"Organization"/"") keyed by owner login
110+
ownerTypeCache map[string]string // Cached GitHub owner type ("User"/"Organization"/"") keyed by owner login; not goroutine-safe (Compiler is used sequentially)
111111
}
112112

113113
// NewCompiler creates a new workflow compiler with functional options.
@@ -136,7 +136,8 @@ func NewCompiler(opts ...CompilerOption) *Compiler {
136136
artifactManager: NewArtifactManager(),
137137
actionPinWarnings: make(map[string]bool), // Initialize warning cache
138138
priorManifests: make(map[string]*GHAWManifest),
139-
gitRoot: gitRoot, // Auto-detected git root
139+
ownerTypeCache: make(map[string]string), // Initialize owner-type cache (keyed by owner login)
140+
gitRoot: gitRoot, // Auto-detected git root
140141
}
141142

142143
// Apply functional options

pkg/workflow/permissions_compiler_validator.go

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -186,15 +186,15 @@ func (c *Compiler) repositoryOwnerIsIndividualUser() bool {
186186
return false
187187
}
188188

189-
if c.ownerTypeCache == nil {
190-
c.ownerTypeCache = make(map[string]string)
191-
}
192189
ownerType, cached := c.ownerTypeCache[owner]
193190
if !cached {
194191
workflowLog.Printf("Checking owner type for: %s", owner)
195192
output, err := RunGH("Checking repository owner type...", "api", "/users/"+owner, "--jq", ".type")
196193
if err != nil {
197194
workflowLog.Printf("Could not determine owner type for %q: %v", owner, err)
195+
// Cache the empty string so subsequent calls for the same owner also return false
196+
// without retrying. This is intentional: fail-safe means "show the tip when uncertain"
197+
// and avoids N retry round-trips per run.
198198
c.ownerTypeCache[owner] = ""
199199
return false
200200
}

pkg/workflow/safe_jobs.go

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -325,7 +325,7 @@ func extractSafeJobsFromFrontmatter(frontmatter map[string]any) map[string]*Safe
325325
if safeOutputsMap, ok := safeOutputs.(map[string]any); ok {
326326
if jobs, exists := safeOutputsMap["jobs"]; exists {
327327
if jobsMap, ok := jobs.(map[string]any); ok {
328-
c := &Compiler{} // Create a temporary compiler instance for parsing
328+
c := NewCompiler() // Create a temporary compiler instance for parsing
329329
return c.parseSafeJobsConfig(jobsMap)
330330
}
331331
}

pkg/workflow/safe_outputs_config.go

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -722,7 +722,7 @@ func (c *Compiler) extractSafeOutputsConfig(frontmatter map[string]any) *SafeOut
722722
// Handle jobs (safe-jobs must be under safe-outputs)
723723
if jobs, exists := outputMap["jobs"]; exists {
724724
if jobsMap, ok := jobs.(map[string]any); ok {
725-
c := &Compiler{} // Create a temporary compiler instance for parsing
725+
c := NewCompiler() // Create a temporary compiler instance for parsing
726726
config.Jobs = c.parseSafeJobsConfig(jobsMap)
727727
}
728728
}

0 commit comments

Comments
 (0)