@mergify mergify bot commented Oct 6, 2025

A quorum queue tries to repair its record in a tick handler. This can happen during a network partition, when the metadata store may itself be unavailable, so the update is likely to time out.

The default metadata store timeout is usually higher than the tick interval, so the tick handler may stay blocked for several ticks. Updating the record can then take as long as the timeout plus one tick interval (30 + 5 seconds by default), significantly longer than the metadata store needs to trigger an election and recover.
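The arithmetic above can be sketched as a small calculation. This is only an illustration of the numbers quoted in the description (30-second timeout, 5-second tick interval); the function names are hypothetical and do not come from the RabbitMQ code base.

```python
import math


def worst_case_update_delay_s(store_timeout_s: float, tick_interval_s: float) -> float:
    """Delay quoted in the description: a blocked update runs out its
    timeout, then the retry waits for the next tick."""
    return store_timeout_s + tick_interval_s


def ticks_spanned_by_timeout(store_timeout_s: float, tick_interval_s: float) -> int:
    """Number of tick intervals covered while a single update call is blocked."""
    return math.ceil(store_timeout_s / tick_interval_s)


if __name__ == "__main__":
    # Defaults mentioned in the description: 30 s timeout, 5 s tick interval.
    print(worst_case_update_delay_s(30, 5))
    print(ticks_spanned_by_timeout(30, 5))
```

With the defaults, one blocked call covers six tick intervals, which is why the record can lag well behind the metadata store's own recovery.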

Client applications may rely on the quorum queue topology to connect to an appropriate node, so making the system reflect the actual topology faster is important to them.

This commit makes the record update operations use a timeout 1 second lower than the tick interval. The tick handler process should then finish earlier when the metadata store is unavailable, and updating the record should take no more than a couple of ticks once the store is available again.
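A minimal sketch of the rule described above, assuming the tick interval is expressed in milliseconds. The function name, the `margin_ms` parameter, and the lower bound for very short tick intervals are assumptions made for illustration; the commit description only states "1 second lower than the tick interval".

```python
def record_update_timeout_ms(tick_interval_ms: int, margin_ms: int = 1000,
                             floor_ms: int = 1) -> int:
    """Timeout for the record update: the tick interval minus a margin.

    The floor is an assumption of this sketch, so that a tick interval of
    one second or less never yields a zero or negative timeout.
    """
    return max(tick_interval_ms - margin_ms, floor_ms)


if __name__ == "__main__":
    # Default tick interval from the description: 5 seconds.
    print(record_update_timeout_ms(5000))
```

Keeping the timeout below the tick interval means a failed update returns before the next tick fires, so each tick gets a fresh attempt instead of queueing behind a blocked one.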


This is an automatic backport of pull request #14672 done by Mergify.

(cherry picked from commit 8387d73)
@acogoluegnes acogoluegnes added this to the 4.2.0 milestone Oct 6, 2025
@acogoluegnes acogoluegnes merged commit 4100450 into v4.2.x Oct 6, 2025
291 checks passed
@acogoluegnes acogoluegnes deleted the mergify/bp/v4.2.x/pr-14672 branch October 6, 2025 10:03