Skip to content

Fix: propagate request context on allocation#4415

Merged
markmandel merged 4 commits intoagones-dev:mainfrom
lacroixthomas:bugfixes/propagate-request-context-in-allocation
Mar 26, 2026
Merged

Fix: propagate request context on allocation#4415
markmandel merged 4 commits intoagones-dev:mainfrom
lacroixthomas:bugfixes/propagate-request-context-in-allocation

Conversation

@lacroixthomas
Copy link
Copy Markdown
Collaborator

@lacroixthomas lacroixthomas commented Jan 11, 2026

What type of PR is this?
/kind bug

What this PR does / Why we need it:

Added the request context to the allocator allocation instead of the main context

Which issue(s) this PR fixes:

Closes #4408

Special notes for your reviewer:

@github-actions github-actions Bot added kind/bug These are bugs. size/S labels Jan 11, 2026
@lacroixthomas lacroixthomas marked this pull request as ready for review January 11, 2026 00:35
@agones-bot
Copy link
Copy Markdown
Collaborator

Build Succeeded 🥳

Build Id: 57c4273e-9add-4eef-b34f-48073fd532e7

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

git fetch https://github.com/googleforgames/agones.git pull/4415/head:pr_4415 && git checkout pr_4415
helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.55.0-dev-9460a83

@lacroixthomas lacroixthomas marked this pull request as draft January 13, 2026 00:04
@agones-bot
Copy link
Copy Markdown
Collaborator

Build Failed 😭

Build Id: e4e18316-db33-4eff-80fd-c949dcd750cd

Status: FAILURE

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@lacroixthomas
Copy link
Copy Markdown
Collaborator Author

Keeping it in draft, will check if this could be cleaner and also need to add unit tests around it

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Failed 😭

Build Id: 8d300a24-b7b5-4ce2-bb25-c5f84bc49da6

Status: FAILURE

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Succeeded 🥳

Build Id: 31632b0b-55a5-43fd-9cff-b75b42e07691

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

git fetch https://github.com/googleforgames/agones.git pull/4415/head:pr_4415 && git checkout pr_4415
helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.56.0-dev-505500d

Comment on lines +465 to +469
// Avoid pushing request to the queue if the context is already done to prevent
// the pendingRequests reaching the max buffer size and blocking further allocations
if ctx.Err() != nil {
return nil, ErrTotalTimeoutExceeded
}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be overkill 🤔

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Succeeded 🥳

Build Id: b050e460-8da3-4012-b998-1d3737ef3231

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

git fetch https://github.com/googleforgames/agones.git pull/4415/head:pr_4415 && git checkout pr_4415
helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.56.0-dev-0f1cce9

@markmandel
Copy link
Copy Markdown
Member

@Reasonably - just checking if you had a chance to have a look at this implementation?

Comment thread pkg/gameserverallocations/allocator.go Outdated
case req := <-c.pendingRequests:
// Ensure the request context is still active before processing to avoid allocating unnecessarily
if req.ctx.Err() != nil {
req.response <- response{request: req, gs: nil, err: ErrTotalTimeoutExceeded}
Copy link
Copy Markdown

@Reasonably Reasonably Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If allocate() already returned via ctx.Done(), nobody is reading req.response. Since the channel is unbuffered, wouldn't this block forever?

Should this be a non-blocking send? Or how about using a buffered channel?

  if req.ctx.Err() != nil {                                                                                                                                                                                                                                                                                                                                                  
      select {                                                                                                                                                                                                                                                                                                                                                               
      case req.response <- response{request: req, gs: nil, err: ErrTotalTimeoutExceeded}:                                                                                                                                                                                                                                                                                    
      default:                                                                                                                                                                                                                                                                                                                                                               
      }                                                                                                                                                                                                                                                                                                                                                                      
      continue                                                                                                                                                                                                                                                                                                                                                               
  }  

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's a good point, nice catch ! - will try to have a look at that later this week

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the commit that changed to use a buffered channel.
LGTM 👍

@lacroixthomas
Copy link
Copy Markdown
Collaborator Author

Didn't forget this one - will take some time tomorrow evening to address the comment, been quite busy recently

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Succeeded 🥳

Build Id: 227a4c76-2efe-4448-8f4f-3d8bf7ee562b

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

git fetch https://github.com/googleforgames/agones.git pull/4415/head:pr_4415 && git checkout pr_4415
helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.56.0-dev-95c9d90

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Failed 😭

Build Id: 9a70c7b2-856c-46b2-8507-55ef4e47abe0

Status: FAILURE

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@lacroixthomas
Copy link
Copy Markdown
Collaborator Author

/gcbrun

Will double check the error, ive seen some timeout / cert issue maybe ? (On my phone, will look deeper once on my laptop)

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Failed 😭

Build Id: b1cf7ec4-413f-46d7-a17c-80ede018d6fe

Status: FAILURE

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@lacroixthomas
Copy link
Copy Markdown
Collaborator Author

lacroixthomas commented Feb 5, 2026

/gcbrun

            	Test:       	TestControllerCreateGameServerPod/forbidden_pods_creation
            	Error:      	Max difference between 2026-02-01 00:21:32.000059604 +0000 UTC m=+12.717592485 and 2026-02-01 00:21:31 +0000 UTC allowed is 1s, but difference was 1.000059604s
            	Error Trace:	/go/src/agones.dev/agones/pkg/gameservers/controller_test.go:1437

Sadness, so close... 😄 0.000059604s too late unlucky

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Failed 😭

Build Id: 36f881e0-b229-4020-b7bc-5b1ad08c6ba9

Status: FAILURE

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@lacroixthomas
Copy link
Copy Markdown
Collaborator Author

Only failing for gke-autopilot-e2e-test-cluster-1-34 : TestAllocatorCrossNamespace ERROR rerun aborted because previous run had a suspected panic and some test may not have run
/gcbrun

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Succeeded 🥳

Build Id: 6a02a636-4478-4082-a463-0d1b5b5b8e10

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

git fetch https://github.com/googleforgames/agones.git pull/4415/head:pr_4415 && git checkout pr_4415
helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.56.0-dev-3450bb4

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Succeeded 🥳

Build Id: 1925ccbd-2e0d-4d5e-85fa-cf6b4faf14e4

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

git fetch https://github.com/googleforgames/agones.git pull/4415/head:pr_4415 && git checkout pr_4415
helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.56.0-dev-ab9e599

@Reasonably
Copy link
Copy Markdown

Are there any other actions I need to take besides approving with LGTM?

@markmandel
Copy link
Copy Markdown
Member

I'm at GDC but lemme know if this is ready for review 👍🏻

@Reasonably but would definitely like your LGTM before I review 👍🏻

@markmandel
Copy link
Copy Markdown
Member

@Reasonably just checking if you had a chance to review - would definitely save me some time 😄

@Reasonably
Copy link
Copy Markdown

@Reasonably just checking if you had a chance to review - would definitely save me some time 😄

LGTM 😄


reqCtx, reqCancel := context.WithTimeout(context.Background(), time.Millisecond)
reqCancel()
j1 := request{gsa: gsa.DeepCopy(), response: make(chan response), ctx: reqCtx}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
j1 := request{gsa: gsa.DeepCopy(), response: make(chan response), ctx: reqCtx}
j1 := request{gsa: gsa.DeepCopy(), response: make(chan response, 1), ctx: reqCtx}

Since the implementation uses a buffered channel, we should here too.

Since tryRespondWithError does a non-blocking send. If the ListenAndAllocate goroutine happens to process the request before the main goroutine reaches res1 := <-j1.response, the non-blocking send sees no reader, goes to default:, and the test hangs forever on the receive.

In practicethe goroutine scheduling probably saves you here, but it's a race. It should be make(chan response, 1) to be consistent with the actual allocate() implementation and to make the test deterministically safe.

Otherwise, LGTM!

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Failed 😭

Build Id: f33d428c-e828-4227-8cbe-b2097de54b7e

Status: FAILURE

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Copy link
Copy Markdown
Member

/gcbrun

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Failed 😭

Build Id: 3e416d44-8240-47d0-b664-cc0623c4565a

Status: FAILURE

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@lacroixthomas lacroixthomas force-pushed the bugfixes/propagate-request-context-in-allocation branch 2 times, most recently from b37f934 to 815b0ab Compare March 26, 2026 07:33
fix: check for context cancelled around allocation

fix: add context on unit tests

fix: add unit tests

fix: ensure chan error are not blocking on context cancelled

fix: ensure there is no race condition on unit test
Signed-off-by: Thomas Lacroix <thomas.lacroix@epitech.eu>
@lacroixthomas lacroixthomas force-pushed the bugfixes/propagate-request-context-in-allocation branch from 815b0ab to 3dc9dbc Compare March 26, 2026 07:41
@agones-bot
Copy link
Copy Markdown
Collaborator

Build Failed 😭

Build Id: 04563415-a8d6-4a83-9a70-f7e90bb32342

Status: FAILURE

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Succeeded 🥳

Build Id: b7888990-042a-4807-9b81-76ff7e52dc56

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

git fetch https://github.com/googleforgames/agones.git pull/4415/head:pr_4415 && git checkout pr_4415
helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.57.0-dev-dfaf108


// line up 3 in a batch
j1 := request{gsa: gsa.DeepCopy(), response: make(chan response)}
j1 := request{gsa: gsa.DeepCopy(), response: make(chan response), ctx: context.Background()}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw there were changes, but shouldn't these all these channel creations in this test be make(chan response, 1) to match what happens in the implementing code?

Suggested change
j1 := request{gsa: gsa.DeepCopy(), response: make(chan response), ctx: context.Background()}
j1 := request{gsa: gsa.DeepCopy(), response: make(chan response, 1), ctx: context.Background()}

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I thought i updated them at the same time, might have dis something wrong with my rebase/squash, will update them as well

Signed-off-by: Thomas Lacroix <thomas.lacroix@epitech.eu>
@agones-bot
Copy link
Copy Markdown
Collaborator

Build Succeeded 🥳

Build Id: d06cf03e-5b8d-43b4-b8e9-3c6bbc0e2d3f

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

git fetch https://github.com/googleforgames/agones.git pull/4415/head:pr_4415 && git checkout pr_4415
helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.57.0-dev-22edb6b

@markmandel markmandel enabled auto-merge (squash) March 26, 2026 17:05
@agones-bot
Copy link
Copy Markdown
Collaborator

Build Failed 😭

Build Id: 0c6c31f2-ac56-450e-bcaa-61580281b27c

Status: FAILURE

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Copy link
Copy Markdown
Member

/gcbrun

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Failed 😭

Build Id: 929dee51-4359-4af6-9fd0-a39c3f72f37c

Status: FAILURE

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Copy link
Copy Markdown
Member

/gcbrun

This is all autopilot flakiness

@agones-bot
Copy link
Copy Markdown
Collaborator

Build Succeeded 🥳

Build Id: 96a11a0b-74ff-44ac-8bda-28a9a21849a4

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

git fetch https://github.com/googleforgames/agones.git pull/4415/head:pr_4415 && git checkout pr_4415
helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.57.0-dev-fa4a285

@markmandel markmandel merged commit 92571cb into agones-dev:main Mar 26, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Request context not propagated to allocator in allocation service

4 participants