Add cancellation support in TransportGetAllocationStatsAction #127371

Open: wants to merge 5 commits into base: main

Contributor

@JeremyDahlgren JeremyDahlgren commented Apr 25, 2025

Replaces the use of a SingleResultDeduplicator by refactoring the cache as a subclass of CancellableSingleObjectCache. Refactored the AllocationStatsService and NodeAllocationStatsAndWeightsCalculator to accept the Runnable used to test for cancellation.

Closes #123248
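
As a rough illustration of the description above, passing a `Runnable` into a computation so it can test for cancellation might look like the following. This is a minimal sketch, not the code in this PR: `StatsCalculatorSketch`, `totalShardSize`, and `ensureNotCancelled` are hypothetical names, and the assumption is that the `Runnable` throws once the caller has been cancelled.

```java
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a stats computation; not Elasticsearch code.
final class StatsCalculatorSketch {
    /**
     * Sums the shard sizes, running the cancellation check before each element.
     * The Runnable is assumed to throw if the caller has been cancelled.
     */
    static long totalShardSize(List<Long> shardSizes, Runnable ensureNotCancelled) {
        long total = 0;
        for (long size : shardSizes) {
            ensureNotCancelled.run(); // throws to abort the loop early
            total += size;
        }
        return total;
    }
}

final class CancellationCheckDemo {
    public static void main(String[] args) {
        AtomicBoolean cancelled = new AtomicBoolean(false);
        Runnable ensureNotCancelled = () -> {
            if (cancelled.get()) {
                throw new CancellationException("task cancelled");
            }
        };

        // Not cancelled: the computation runs to completion.
        System.out.println(StatsCalculatorSketch.totalShardSize(List.of(10L, 20L, 30L), ensureNotCancelled));

        // Cancelled: the first check throws and no further work is done.
        cancelled.set(true);
        try {
            StatsCalculatorSketch.totalShardSize(List.of(10L, 20L, 30L), ensureNotCancelled);
        } catch (CancellationException e) {
            System.out.println("stopped early: " + e.getMessage());
        }
    }
}
```

Keeping the check as a plain `Runnable` means the calculator needs no knowledge of tasks or listeners; it only has to call the check often enough for cancellation to take effect promptly.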

@JeremyDahlgren added the >feature, :Distributed Coordination/Allocation, and Team:Distributed Coordination labels Apr 25, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@elasticsearchmachine
Collaborator

Hi @JeremyDahlgren, I've created a changelog YAML for you.

Contributor

@DaveCTurner DaveCTurner left a comment

My feeling is that there has to be a simpler way to achieve what we want than adding all this bare-handed locking etc. I would have expected something based on ref-counting would be neater: each waiting listener should hold a ref, releasing it on cancellation, and an ongoing computation stops early if the number of waiting refs drops to zero.
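
The ref-counting idea in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions, not the reviewer's or the PR's implementation: `WaiterRefsSketch` and `RefCountedComputationSketch` are hypothetical names, and the sketch ignores the case where a new listener arrives after the count has already dropped to zero.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical counter of listeners still waiting for the result.
final class WaiterRefsSketch {
    private final AtomicInteger waiters = new AtomicInteger();

    /** Registers a waiting listener; the returned Runnable releases its ref at most once. */
    Runnable acquire() {
        waiters.incrementAndGet();
        AtomicBoolean released = new AtomicBoolean();
        return () -> {
            if (released.compareAndSet(false, true)) {
                waiters.decrementAndGet();
            }
        };
    }

    boolean hasWaiters() {
        return waiters.get() > 0;
    }
}

final class RefCountedComputationSketch {
    /** Performs up to totalSteps steps, stopping early once no waiter remains. */
    static int run(WaiterRefsSketch refs, int totalSteps, Runnable afterEachStep) {
        int done = 0;
        while (done < totalSteps && refs.hasWaiters()) {
            done++;
            afterEachStep.run(); // hook used here to simulate cancellations mid-run
        }
        return done;
    }
}

final class RefCountDemo {
    public static void main(String[] args) {
        WaiterRefsSketch refs = new WaiterRefsSketch();
        Runnable releaseFirst = refs.acquire();
        Runnable releaseSecond = refs.acquire();
        int[] step = {0};

        int done = RefCountedComputationSketch.run(refs, 10, () -> {
            step[0]++;
            if (step[0] == 2) releaseFirst.run();  // first listener cancelled
            if (step[0] == 4) releaseSecond.run(); // last listener cancelled
        });

        System.out.println("steps performed: " + done);
    }
}
```

In the demo the computation stops after the fourth of ten steps, because that is when the last waiting listener releases its ref.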

@JeremyDahlgren
Contributor Author

> My feeling is that there has to be a simpler way to achieve what we want than adding all this bare-handed locking etc. I would have expected something based on ref-counting would be neater: each waiting listener should hold a ref, releasing it on cancellation, and an ongoing computation stops early if the number of waiting refs drops to zero.

I applied your suggested change to call onFailure() on the listener as soon as cancellation is detected in CancellableSingleObjectCache, and refactored TransportGetAllocationStatsAction to use it. The changes are less complex now.
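
Failing a listener as soon as cancellation is detected might look something like the sketch below. It is an assumption-laden illustration rather than the actual change: `ListenerSketch` is a hypothetical two-method interface with `onResponse`/`onFailure`, and `CancellableWaiterSketch` is a hypothetical wrapper, not `CancellableSingleObjectCache` itself.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical minimal listener interface used only for this sketch.
interface ListenerSketch<T> {
    void onResponse(T value);

    void onFailure(Exception e);
}

// Wraps a listener so that it is completed exactly once: either by
// cancellation (immediately) or by the eventual result of the computation.
final class CancellableWaiterSketch<T> implements ListenerSketch<T> {
    private final ListenerSketch<T> delegate;
    private final AtomicBoolean completed = new AtomicBoolean();

    CancellableWaiterSketch(ListenerSketch<T> delegate) {
        this.delegate = delegate;
    }

    /** Called when cancellation is detected; fails the listener right away. */
    void cancel() {
        if (completed.compareAndSet(false, true)) {
            delegate.onFailure(new CancellationException("task cancelled"));
        }
    }

    @Override
    public void onResponse(T value) {
        if (completed.compareAndSet(false, true)) {
            delegate.onResponse(value);
        }
    }

    @Override
    public void onFailure(Exception e) {
        if (completed.compareAndSet(false, true)) {
            delegate.onFailure(e);
        }
    }
}

final class CancellableWaiterDemo {
    public static void main(String[] args) {
        CancellableWaiterSketch<String> waiter = new CancellableWaiterSketch<>(new ListenerSketch<String>() {
            @Override
            public void onResponse(String value) {
                System.out.println("response: " + value);
            }

            @Override
            public void onFailure(Exception e) {
                System.out.println("failure: " + e.getMessage());
            }
        });

        waiter.cancel();            // the listener fails immediately
        waiter.onResponse("stats"); // the late result is ignored
    }
}
```

The compare-and-set guard ensures the wrapped listener is notified only once, so a result that arrives after cancellation is dropped silently.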

Labels
:Distributed Coordination/Allocation, >feature, Team:Distributed Coordination, v9.1.0
Development

Successfully merging this pull request may close these issues.

TransportGetAllocationStatsAction should be cancellable
3 participants