Skip to content

Fix fork-aware stack detection#554

Merged
mariusvniekerk merged 14 commits into
mainfrom
fix-stacked-pr-display
Jun 20, 2026
Merged

Fix fork-aware stack detection#554
mariusvniekerk merged 14 commits into
mainfrom
fix-stacked-pr-display

Conversation

@mariusvniekerk

@mariusvniekerk mariusvniekerk commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator

Stack detection was using branch names as identity, which lets fork PRs that reuse upstream branch names hide real same-repo stacks. This updates detection to use repo-aware branch keys, keeps true same-repo self-edges from poisoning stack roots, and makes GitLab fork identity explicit instead of silently collapsing unresolved fork sources to the target repo.

Validation: focused stack, GitLab, and server sync/API tests passed; make test-short passed.

mariusvniekerk and others added 2 commits June 19, 2026 13:21
Historical PR rows can have the same head and base branch, including main -> main. Treating those rows as stack parents made the detector believe the default branch was itself a PR head, so real open chains targeting main never became stack roots and the stack API returned 404 for valid stacked PRs.

Ignore self-targeting PR rows during chain detection because they do not encode a dependency edge and can only obscure real branch relationships.

Generated with Codex
Co-authored-by: Codex <codex@openai.com>
The stack detector intentionally ignores rows where head and base branch are identical, but the filter looked like a mysterious deletion without the historical provider-row context. Make the default-branch failure mode explicit at the call site so future maintainers do not remove the guard while cleaning up the detector.

Validation: go test ./internal/stacks -shuffle=on

Generated with Codex
Co-authored-by: Codex <codex@openai.com>
@roborev-ci

roborev-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown

roborev: Combined Review (34d038f)

Medium confidence: one medium test coverage gap should be addressed before merge.

Medium

  • internal/stacks/detect.go:17 - The stack-detection fix only has unit coverage at the DetectChains layer. The user-visible path is sync/SQLite -> RunDetection -> /stacks and PR stack APIs, so the regression where a self-targeting default-branch PR hides a real stack is not pinned through the full stack required by project guidelines.

    Suggested fix: Add an API/e2e test that seeds or syncs a merged main -> main PR plus an open chain rooted on main, runs production detection, and asserts /stacks or /pulls/{...}/stack returns the expected stack members.


Panel: ci_default_security | Synthesis: codex, 10s | Members: codex_default (codex/default, done, 1m40s), codex_security (codex/security, done, 10s) | Total: 2m0s

mariusvniekerk and others added 7 commits June 19, 2026 13:31
The previous comment could read as if the broken PR itself was self-targeting. The real failure was an unrelated historical self-targeting row poisoning the branch head map and preventing valid roots targeting that branch from being discovered.

Spell out that mechanism at the filter so the guard is understandable without reconstructing the detector state.

Validation: go test ./internal/stacks -shuffle=on

Generated with Codex
Co-authored-by: Codex <codex@openai.com>
The stack filter comment should explain the detector invariant without implying an impossible provider-level PR shape. Describe the same-branch row as unable to form a parent-child edge and explain how it would poison the branch map.

Validation: go test ./internal/stacks -shuffle=on

Generated with Codex
Co-authored-by: Codex <codex@openai.com>
The stack detector regression test should cover malformed same-branch rows without implying that a default branch can be a valid self-targeting PR. Use a neutral legacy branch fixture and name the test around the malformed row behavior.

Validation: go test ./internal/stacks -shuffle=on

Generated with Codex
Co-authored-by: Codex <codex@openai.com>
Stack detection was comparing only branch names. A fork PR such as fork:main -> upstream:main or fork:feature -> upstream:main could therefore make the upstream branch appear to be a PR head and hide real stacks rooted on that branch.

Use repo+branch graph keys instead: head nodes come from head_repo_clone_url plus head_branch, and base nodes come from the tracked repository clone URL plus base_branch. This models fork PRs as separate graph nodes while keeping same-repo stacked PRs connected.

Validation: go test ./internal/stacks -shuffle=on; go test ./internal/server -run 'TestAPI(GetPullDetailIncludesStackContext|GetStackForPR|Stacks_DetectionViaSyncHook)' -shuffle=on\n\nGenerated with Codex\nCo-authored-by: Codex <codex@openai.com>
Repo-aware stack detection still needs to reject rows whose head and base resolve to the exact same repository branch. Those rows cannot describe a parent-child dependency, and letting them into the head map can hide real stacks rooted on that branch.

Keep fork PRs such as fork:main -> upstream:main because their repo-aware keys differ, while filtering only true same-repo self-edges.

Validation: go test ./internal/stacks -shuffle=on; go test ./internal/server -run 'TestAPI(GetPullDetailIncludesStackContext|GetStackForPR|Stacks_DetectionViaSyncHook)' -shuffle=on; make test-short

Generated with Codex
Co-authored-by: Codex <codex@openai.com>
Stack detection relies on repo identity as much as branch identity. The repo-aware graph needs normalized clone URL keys, provider metadata for GitLab fork sources, and full sync-to-API coverage so provider normalization and persistence cannot silently regress the stack UI path.

This keeps fork branch names from shadowing same-repo stacks across the production sync hook while preserving the same-repo self-edge guard for imported or malformed rows.

Validation: go test ./internal/stacks -shuffle=on; go test ./internal/platform/gitlab -run 'TestClientListOpenMergeRequestsPopulatesForkHeadRepoCloneURL|TestClientLooksUpProjectByRawPathAndUsesNumericIDAfterLookup' -shuffle=on; go test ./internal/server -run 'TestAPIStacks_DetectionViaSyncHook(IgnoresForkHeadBranchCollision)?$|TestAPI(GetPullDetailIncludesStackContext|GetStackForPR)' -shuffle=on; git diff --check; make test-short

Generated with Codex
Co-authored-by: Codex <codex@openai.com>
GitLab fork source metadata must not collapse to the target repo when source-project lookup fails. Returning an empty head repo identity would let stack detection fall back to the tracked repo and reintroduce the branch-shadowing bug for inaccessible or deleted fork sources.

Surface that lookup failure as an explicit sync/detail error, and cover both the failure path and the same-repo self-edge sync/API regression requested by roborev.

Validation: go test ./internal/platform/gitlab -run 'TestClientListOpenMergeRequests(PopulatesForkHeadRepoCloneURL|FailsWhenForkHeadRepoLookupFails)|TestClientLooksUpProjectByRawPathAndUsesNumericIDAfterLookup' -shuffle=on; go test ./internal/stacks -shuffle=on; go test ./internal/server -run 'TestAPIStacks_DetectionViaSyncHook(IgnoresForkHeadBranchCollision|IgnoresSameRepoSelfEdge)?$' -shuffle=on; git diff --check; make test-short

Generated with Codex
Co-authored-by: Codex <codex@openai.com>
@mariusvniekerk mariusvniekerk changed the title Ignore self-targeting PRs when detecting stack chains Fix fork-aware stack detection Jun 19, 2026
@roborev-ci

roborev-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown

roborev: Combined Review (79d6d91)

GitLab MR sync can be blocked by inaccessible fork metadata; one Medium finding.

Medium

  • internal/platform/gitlab/client.go:319
    GitLab MR listing/detail now fails the entire request when fetching /projects/{source_project_id} for a fork MR fails. A fork source project can be private, deleted, or otherwise inaccessible while the target MR remains visible, so one bad fork state can prevent all GitLab MR data for that repo from refreshing and leave the maintainer console stale.

    Fix: make fork source-project clone URL enrichment non-fatal. Preserve fork/source identity for stack detection with stable non-clone metadata or skip only unsafe stack edges, while still returning the MR data.


Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 4m38s), codex_security (codex/security, done, 1m58s) | Total: 6m42s

GitLab can expose a target merge request even when its fork source project is private, deleted, or otherwise unreadable. Treating the source-project clone URL lookup as required made one bad fork block the whole repo refresh.

Keep that lookup as optional enrichment while preserving cancellation behavior. When the fork clone URL is unavailable, stack detection prefers candidates with explicit head repo identity so an unknown fork branch cannot shadow a known same-repo stack branch.

Validation: go test ./internal/platform/gitlab ./internal/stacks -shuffle=on; go test ./internal/server -run 'TestAPIStacks_DetectionViaSyncHook(IgnoresForkHeadBranchCollision|IgnoresSameRepoSelfEdge)?$' -shuffle=on; git diff --check

Generated with Codex
Co-authored-by: Codex <codex@openai.com>
@roborev-ci

roborev-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown

roborev: Combined Review (778c60d)

Summary verdict: One medium correctness issue remains; no security findings were reported.

Medium

  • internal/platform/gitlab/client.go:366 - GitLab fork MRs whose source project lookup fails can end up with an empty HeadRepoCloneURL. Stack detection treats an empty head repo as the target repo, so an unresolved fork MR may be incorrectly chained with same-repo PRs when branch names collide.

    Suggested fix: preserve a distinct unresolved-fork identity or explicitly exclude unresolved fork heads from stack edges instead of collapsing them to empty/target repo. Add a regression test for the GitLab failed-lookup collision case.


Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 4m42s), codex_security (codex/security, done, 2m17s) | Total: 7m6s

GitLab can now return fork MRs without a readable source-project clone URL. Those rows must still sync, but using an empty head repo as if it were the target repo can create false stack edges when branch names collide.

Require stack graph edges to have a known head repo identity. Same-repo provider fixtures now carry the target clone URL explicitly, while unknown fork heads remain visible as MRs but do not participate in stack detection until their source identity is known.

Validation: go test ./internal/platform/gitlab ./internal/stacks -shuffle=on; go test ./internal/server -run 'TestAPIStacks_DetectionViaSyncHook(IgnoresForkHeadBranchCollision|IgnoresSameRepoSelfEdge)?$' -shuffle=on; go test ./internal/server -run 'TestAPI(ListStacks|GetStackForPR|GetPullDetailIncludesStackContext|GetStackForPR_DraftNotBaseReady|GetStackForPR_BaseBranchNotMain|GetStackForPR_SingleFailingIsInProgress)$' -shuffle=on; git diff --check; make test-short

Generated with Codex
Co-authored-by: Codex <codex@openai.com>
@roborev-ci

roborev-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown

roborev: Combined Review (52fc5d9)

Review verdict: changes need fixes before merge due to one high-severity stack detection regression and one medium-severity GitLab sync error-handling issue.

High

  • internal/stacks/detect.go:241: Stack detection now depends on middleman_repos.clone_url, but that value can still be empty on the normal GitHub sync path when repo identity is already known. Same-repo head keys can use the PR clone URL while base keys use "", so valid stacks are not detected and existing stacks may be deleted after sync.
    • Fix: Ensure repo clone metadata is persisted before stack detection for all sync paths, or pass/fallback to the configured/resolved repo clone URL when DB metadata is empty. Add coverage for a pre-resolved repo with PlatformExternalID on a fresh DB.

Medium

  • internal/platform/gitlab/client.go:359: optionalHeadRepoCloneURL suppresses every source-project lookup error except context cancellation, including rate-limit and transient server errors. Those failures become an empty HeadRepoCloneURL, can overwrite previously known identity during sync, and cause stack detection to drop affected MRs while the sync is reported as successful.
    • Fix: Only suppress expected inaccessible-source cases such as 404/403, and propagate retryable/rate-limit/server errors or preserve the previously stored non-empty head repo identity.

Panel: ci_default_security | Synthesis: codex, 9s | Members: codex_default (codex/default, done, 4m54s), codex_security (codex/security, done, 13s) | Total: 5m16s

The GitLab fork fix relies on preserving MRs whose source project metadata is inaccessible while refusing to infer stack edges from an unknown head repo. A unit-level detector check was not enough evidence for the PR review because regressions can happen across provider sync, persistence, and API stack decoration.

Add a full sync-hook/API regression that syncs a GitLab MR with an empty head clone URL alongside a known same-repo stack. The unknown fork MR remains visible through list/detail but is absent from both per-PR and list stack responses.

Validation: go test ./internal/server -run 'TestAPIStacks_GitLabUnknownForkHeadSyncsButSkipsStackEdges$' -shuffle=on; go test ./internal/platform/gitlab ./internal/stacks -shuffle=on; go test ./internal/server -run 'TestAPIStacks_(GitLabUnknownForkHeadSyncsButSkipsStackEdges|DetectionViaSyncHook|DetectionViaSyncHookIgnoresForkHeadBranchCollision|DetectionViaSyncHookIgnoresSameRepoSelfEdge)$' -shuffle=on; git diff --check. make test-short was also run and failed in unrelated internal/server/e2etest TestFleetSnapshotEmptyTmuxServerE2E after tmux list-sessions reported killed/deadline-exceeded operations.\n\nGenerated with Codex\nCo-authored-by: Codex <codex@openai.com>
@roborev-ci

roborev-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown

roborev: Combined Review (b68234e)

High: stack detection can disappear for normal GitHub repos

  • Location: internal/stacks/detect.go:241
  • Problem: Stack detection now keys base branches with repo.CloneURL, but the normal config-resolved GitHub sync path has RepoRef.PlatformRepoID set, so syncRepoIdentity skips returning a resolved repo. The GitHub refreshRepoSettings branch only writes merge settings, not clone_url, leaving middleman_repos.clone_url empty. Same-repo PR heads have non-empty HeadRepoCloneURL, so base/head keys never match and stacks disappear for normal GitHub repos.
  • Fix: Persist GitHub provider metadata (clone_url, web URL, default branch, provider ID) in the GitHub refresh path before stack detection, and add coverage using a pre-resolved GitHub RepoRef with PlatformRepoID.

Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 5m27s), codex_security (codex/security, done, 2m15s) | Total: 7m49s

The full-stack e2e seed data needs to represent the same-repo stack with the same stable identity the stack detector now requires. After unsafe fork edges stopped participating in stack detection, the seeded acme/tools rows had empty clone identity and the browser stack-status tests stopped rendering stack context.

Keep the fixture data aligned with the production invariant so the e2e suite exercises a known same-repo stack instead of an unknown head repository.

Validation: cd frontend && ../node_modules/.bin/vp exec -- playwright test --config=playwright-e2e.config.ts --project=chromium tests/e2e-full/stack-status.spec.ts; cd frontend && ../node_modules/.bin/vp exec -- playwright test --config=playwright-e2e.config.ts --project=firefox tests/e2e-full/stack-status.spec.ts; go test ./internal/testutil -run 'TestSeedFixtures' -shuffle=on; cd frontend && CI=1 ../node_modules/.bin/vp exec -- playwright test --config=playwright-e2e.config.ts --project=chromium; cd frontend && CI=1 ../node_modules/.bin/vp exec -- playwright test --config=playwright-e2e.config.ts --project=firefox; git diff --check

Generated with Codex
Co-authored-by: Codex <codex@openai.com>
@roborev-ci

roborev-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown

roborev: Combined Review (8101d80)

Medium-risk issues remain in stack detection metadata handling; no security findings were reported.

Medium

  • internal/platform/gitlab/client.go:359
    GitLab source-project lookup failures are all treated as “unknown head repo.” A transient 429/5xx/network error can return an empty HeadRepoCloneURL, and the normal MR upsert may overwrite a previously known fork identity, causing stack detection to skip or delete affected stacks.
    Fix: Only suppress expected unavailable-source cases like 404/403, and propagate transient/rate-limit errors or preserve the existing non-empty head repo URL when lookup fails.

  • internal/stacks/detect.go:27
    Stack detection now drops every PR with an empty HeadRepoCloneURL. Providers that do not populate this field on normal PR list responses, such as the gitealike path when head.repository is absent, will produce no detected stacks and RunDetection may delete existing stack rows.
    Fix: Ensure each provider fills same-repo head clone URLs before persistence, or make RunDetection skip stale deletion when required head/target repo identity is unavailable.


Panel: ci_default_security | Synthesis: codex, 8s | Members: codex_default (codex/default, done, 5m40s), codex_security (codex/security, done, 1m46s) | Total: 7m34s

GitLab source-project lookup is optional only when the source project is actually unavailable to the token, such as 403 or 404. Treating rate limits and upstream failures the same way would persist an empty head clone URL and make an existing fork head look like an invisible foreign head until the next successful sync.

Keep unavailable fork heads out of stack edges, but fail transient lookup errors so sync does not overwrite better provider data with a downgraded row.

Container evidence: Gitea 1.24.6 and Forgejo 12 same-repo pull requests both returned head.repo_id and head.repo.clone_url; GitLab CE 18.9.5 same-repo merge requests returned source_project_id == target_project_id and a resolvable project http_url_to_repo.

Validation: go test ./internal/platform/gitlab -run 'TestClient(ListOpenMergeRequestsPropagatesTransientForkHeadRepoLookupFailures|GetMergeRequestPropagatesTransientForkHeadRepoLookupFailures|ListOpenMergeRequestsContinuesWhenForkHeadRepoLookupFails|GetMergeRequestContinuesWhenForkHeadRepoLookupFails)$' -shuffle=on; go test ./internal/stacks ./internal/platform/gitea ./internal/platform/forgejo ./internal/platform/gitealike -shuffle=on; go test ./internal/server -run 'TestAPIStacks_GitLabUnknownForkHeadSyncsButSkipsStackEdges$' -shuffle=on; go test ./internal/platform/gitlab -shuffle=on; git diff --check

Generated with Codex
Co-authored-by: Codex <codex@openai.com>
@roborev-ci

roborev-ci Bot commented Jun 20, 2026

Copy link
Copy Markdown

roborev: Combined Review (d2a3f2e)

No issues found.


Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 10m38s), codex_security (codex/security, done, 40s) | Total: 11m18s

@mariusvniekerk mariusvniekerk merged commit b29a71c into main Jun 20, 2026
16 checks passed
@mariusvniekerk mariusvniekerk deleted the fix-stacked-pr-display branch June 20, 2026 00:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant