Improve README with extra info about concurrency #520

cedricpim · 2025-02-21T11:01:54Z

Add two notes to the README about concurrency controls, covering how blocked jobs are unblocked and how long they can stay blocked.

Related to what was discussed in #502

rosa · 2025-02-24T17:52:02Z

README.md

@@ -426,7 +426,7 @@ class MyJob < ApplicationJob

 When a job includes these controls, we'll ensure that, at most, the number of jobs (indicated as `to`) that yield the same `key` will be performed concurrently, and this guarantee will last for `duration` for each job enqueued. Note that there's no guarantee about _the order of execution_, only about jobs being performed at the same time (overlapping).

-The concurrency limits use the concept of semaphores when enqueuing, and work as follows: when a job is enqueued, we check if it specifies concurrency controls. If it does, we check the semaphore for the computed concurrency key. If the semaphore is open, we claim it and we set the job as _ready_. Ready means it can be picked up by workers for execution. When the job finishes executing (be it successfully or unsuccessfully, resulting in a failed execution), we signal the semaphore and try to unblock the next job with the same key, if any. Unblocking the next job doesn't mean running that job right away, but moving it from _blocked_ to _ready_. Since something can happen that prevents the first job from releasing the semaphore and unblocking the next job (for example, someone pulling a plug in the machine where the worker is running), we have the `duration` as a failsafe. Jobs that have been blocked for more than duration are candidates to be released, but only as many of them as the concurrency rules allow, as each one would need to go through the semaphore dance check. This means that the `duration` is not really about the job that's enqueued or being run, it's about the jobs that are blocked waiting.
+The concurrency limits use the concept of semaphores when enqueuing, and work as follows: when a job is enqueued, we check if it specifies concurrency controls. If it does, we check the semaphore for the computed concurrency key. If the semaphore is open, we claim it and we set the job as _ready_. Ready means it can be picked up by workers for execution. When the job finishes executing (be it successfully or unsuccessfully, resulting in a failed execution), we signal the semaphore and try to unblock the next job with the same key, if any. Unblocking the next job doesn't mean running that job right away, but moving it from _blocked_ to _ready_. Since something can happen that prevents the first job from releasing the semaphore and unblocking the next job (for example, someone pulling a plug in the machine where the worker is running), we have the `duration` as a failsafe. Jobs that have been blocked for more than duration are candidates to be released, but only as many of them as the concurrency rules allow, as each one would need to go through the semaphore dance check. This means that the `duration` is not really about the job that's enqueued or being run, it's about the jobs that are blocked waiting. When there are multiple jobs unblocked, it is important to note that, as a job finishes and the next jobs are unblocked, the `duration` timer for the still blocked jobs is reset (this happens indirectly via the expiration time of the semaphore).


I'd rephrase this, as it's not really necessary that there are multiple jobs unblocked (because concurrency limit > 1). The behaviour is always the same.

Instead of:

> When there are multiple jobs unblocked, it is important to note that, as a job finishes and the next jobs are unblocked, the `duration` timer for the still blocked jobs is reset (this happens indirectly via the expiration time of the semaphore).

I'd write this as:

> It's important to note that after one or more candidate jobs are unblocked (either because a job finishes or because `duration` expires and a semaphore is released), the `duration` timer for the still blocked jobs is reset. This happens indirectly via the expiration time of the semaphore, which is updated.

Much nicer 👍
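
For readers following the thread, the behaviour under discussion can be sketched with a small toy model. This is not Solid Queue's code: `ToyQueue`, its methods, and the hand-advanced integer clock are all hypothetical, chosen only to make the effect visible. It assumes, as the README text says, that every claim or signal pushes the semaphore's expiration forward, and that blocked jobs can only get through once a slot is free or the semaphore has expired.

```ruby
# Toy model only (hypothetical names, not Solid Queue's implementation).
# Time is an integer advanced by hand so the run is deterministic.
class ToyQueue
  Semaphore = Struct.new(:value, :expires_at)
  attr_reader :ready, :blocked

  def initialize(limit:, duration:)
    @limit = limit
    @duration = duration
    @now = 0
    @semaphores = {}                                # concurrency key => Semaphore
    @ready = []                                     # jobs workers may pick up
    @blocked = Hash.new { |hash, key| hash[key] = [] } # key => waiting jobs
  end

  def advance(seconds)
    @now += seconds
  end

  # Enqueue: claim the semaphore if it is open, otherwise block the job.
  def enqueue(job, key)
    claim(key) ? @ready << job : @blocked[key] << job
  end

  # A job finished (successfully or not): signal, then unblock the next job.
  def finish(job, key)
    @ready.delete(job)
    signal(key)
    unblock_next(key)
  end

  # Failsafe: drop semaphores whose expiry has passed, then let blocked jobs
  # go through the semaphore check again, as many as the limit allows.
  def run_maintenance
    @semaphores.delete_if { |_key, semaphore| semaphore.expires_at <= @now }
    @blocked.each_key { |key| loop { break unless unblock_next(key) } }
  end

  def semaphore_expires_at(key)
    @semaphores[key]&.expires_at
  end

  private

  def claim(key)
    semaphore = @semaphores[key]
    if semaphore.nil?
      @semaphores[key] = Semaphore.new(@limit - 1, @now + @duration)
      true
    elsif semaphore.value.positive?
      semaphore.value -= 1
      semaphore.expires_at = @now + @duration # expiry is pushed forward
      true
    else
      false
    end
  end

  def signal(key)
    semaphore = @semaphores[key]
    return if semaphore.nil? || semaphore.value >= @limit

    semaphore.value += 1
    semaphore.expires_at = @now + @duration # expiry is pushed forward
  end

  # Moves at most one job from blocked to ready; it does not run the job.
  def unblock_next(key)
    return if @blocked[key].empty?

    @ready << @blocked[key].shift if claim(key)
  end
end

queue = ToyQueue.new(limit: 1, duration: 10)
%w[a b c].each { |job| queue.enqueue(job, "k") } # "a" is ready, "b" and "c" block
queue.advance(4)
queue.finish("a", "k")  # "b" becomes ready; the semaphore now expires at 4 + 10
queue.advance(9)
queue.run_maintenance   # t = 13: semaphore not expired yet, "c" stays blocked
queue.advance(1)
queue.run_maintenance   # t = 14: semaphore expired, "c" becomes ready
```

In this run, job "c" was enqueued at time 0 with a `duration` of 10, yet it is only released at time 14, because the semaphore's expiration moved when "a" finished and "b" was unblocked. That is the reset the suggested wording describes.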

rosa · 2025-02-24T17:53:14Z

Thanks a lot @cedricpim! I just left a small comment about one of the changes 🙏

cedricpim · 2025-03-18T20:31:21Z

@rosa sorry for the delay on this. Just updated the PR.

rosa · 2025-03-24T09:55:35Z

Nice, thanks a lot!

Improve README with extra info about concurrency

25b58e1

Add two notes to the README about concurrency controls, covering how blocked jobs are unblocked and how long they can stay blocked.

rosa reviewed Feb 24, 2025

Update text with suggestion

fc96e8a

rosa merged commit 9161da0 into rails:main Mar 24, 2025
33 of 34 checks passed

cedricpim deleted the add-notes-on-concurrency branch March 24, 2025 21:06
