spin lock #605
Comments
Will you be able to get by with a pure spinlock for locking? Without putting the thread to sleep after some spinning? |
Likely. All structural mutation operations should be very fast. And we can
allocate before we lock. Allocation is much of the time unless we have a
lightweight pool.
We generally do something more like ROWEX and I can help work through what
that might look like as we get further along, though it offered no benefit
in their testing.
Templating the lock strategy and the key structure would both seem to
increase flexibility.
Bryan
|
The referenced spinlock implementation does not carry over directly to this setting, and the current implementation already addresses the points made there. The read lock operation is a pure read. The write lock is a single CAS attempt after a read, which does not retry if it fails. So in a way this spinlock is already TTAS (test-and-test-and-set), although the correspondence is not 1:1. The memory barriers are as weak as possible too. What needs improving, however, is the spin loop, which calls _mm_pause on every iteration |
Copying Umit.
What is the motivation behind _mm_pause()?
If I recall what you had said before, it puts the thread to sleep to allow
context switching. But why would we be spinning long enough for that to
make sense?
I have not looked at the details of the write lock path. Is the allocation
happening before or after the lock is taken?
The bw-tree originally (I think it was the first paper on this, but maybe
there was other work) let the different threads simply race until the CAS
to install the new node version. So both threads would allocate and both
would gather up the data and produce a consolidated new node. Without
question the loser did work that had to be thrown away, and the code
reflected this with status returns that included codes indicating that
operations should be retried, and with loops to race to outcomes.
Taking a lock ensures that only one thread does the work. So it should be
more efficient in cases where it is likely for a race to occur.
But why park a thread vs reduce the time with the lock held (allocate
before lock) and let the thread spin? The ART nodes are all very small and
consolidation time while holding the lock should be small as well.
Just trying to figure out the rationale here.
Thanks,
Bryan
|
_mm_pause won’t yield the thread! It keeps the user thread running and just pauses it for ~30 clocks (if memory serves). It doesn’t context switch. Without it, there could be excessive cache line contention, causing unnecessary memory transfers.
These really are spin locks. If we allowed a context switch immediately, it would be much more expensive.
Umit
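To illustrate the point above, here is a minimal sketch of a spin-wait hint wrapper. The names cpu_relax and spin_until_set are hypothetical and not taken from the project, and the AArch64 branch assumes GCC or Clang inline assembly. The loop only issues a CPU hint between polls and never asks the OS scheduler to yield or sleep.

```cpp
#include <atomic>
#include <cassert>
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>  // _mm_pause
#define SKETCH_HAVE_X86_PAUSE 1
#endif

// Hypothetical helper: emit the CPU's spin-wait hint, if one is known.
// This is a hint to the core only; it is not a system call and does not
// deschedule the calling thread.
inline void cpu_relax() noexcept {
#if defined(SKETCH_HAVE_X86_PAUSE)
  _mm_pause();  // compiles to the x86 PAUSE instruction
#elif defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
  __asm__ __volatile__("yield" ::: "memory");  // assumption: GCC/Clang on AArch64
#else
  // No known hint on this target: fall back to a plain busy loop.
#endif
}

// Poll `flag` up to `max_spins` times, issuing the hint between polls.
// Returns true if the flag was observed set.
inline bool spin_until_set(const std::atomic<bool>& flag, unsigned max_spins) noexcept {
  assert(max_spins > 0 && "at least one poll is required");
  for (unsigned i = 0; i < max_spins; ++i) {
    if (flag.load(std::memory_order_acquire)) {
      return true;
    }
    cpu_relax();
  }
  return false;
}
```

The bounded loop is only there to keep the sketch easy to exercise; a real lock would normally spin until the condition holds.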
|
Ok. What is the problem that we are trying to solve then? |
See https://rigtorp.se/spinlock/
Umit
|
_mm_pause does not put the thread to sleep for the OS scheduler; it compiles to the CPU instruction PAUSE, which apparently enables the CPU to save memory traffic and power.
Re. write lock path, what allocation are you referring to? New tree node allocation? If so, that happens before the lock. The allocated nodes are cached thread-locally in case of restarts to avoid repeated allocs-deallocs.
Re. parking threads, with OLC ART it only happens in the read lock path.
I'd imagine the design as _mm_pause for ~5 iterations (maybe even 1-2 busy-wait iterations first), then ~10 iterations with a random short sleep, repeat |
Re. parking threads, with OLC ART it only happens in the read lock path.
^ Do you mean "write lock path"?
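The backoff design proposed in this exchange (1-2 plain busy-wait iterations, ~5 iterations with _mm_pause, then ~10 iterations with a random short sleep, repeated) could be sketched roughly as below. The class name backoff, the helper cpu_relax, and the 1-50 microsecond sleep range are assumptions made for illustration; only the iteration counts come from the comment.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <random>
#include <thread>
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>  // _mm_pause
#define SKETCH_HAVE_X86_PAUSE 1
#endif

// Hypothetical helper: CPU spin-wait hint where available, no-op otherwise.
inline void cpu_relax() noexcept {
#if defined(SKETCH_HAVE_X86_PAUSE)
  _mm_pause();
#endif
}

// Hypothetical staged backoff: busy wait, then PAUSE, then random short sleeps,
// after which the cycle repeats.
class backoff {
 public:
  // Call once after each failed attempt to observe the lock as free.
  void wait() {
    assert(step_ < cycle_len);
    if (step_ < busy_iters) {
      // Stage 1: plain busy wait, nothing to do.
    } else if (step_ < busy_iters + pause_iters) {
      cpu_relax();  // Stage 2: spin-wait hint.
    } else {
      // Stage 3: random short sleep (the range is an assumption).
      std::uniform_int_distribution<int> dist(1, max_sleep_us);
      std::this_thread::sleep_for(std::chrono::microseconds(dist(rng_)));
    }
    if (++step_ == cycle_len) {
      step_ = 0;  // "repeat"
    }
  }

  unsigned step() const noexcept { return step_; }

 private:
  static constexpr unsigned busy_iters = 2;
  static constexpr unsigned pause_iters = 5;
  static constexpr unsigned sleep_iters = 10;
  static constexpr unsigned cycle_len = busy_iters + pause_iters + sleep_iters;
  static constexpr int max_sleep_us = 50;

  std::minstd_rand rng_{std::random_device{}()};
  unsigned step_ = 0;
};

// Example use: wait until a lock word is observed as unlocked.
inline void wait_until_unlocked(const std::atomic<bool>& locked) {
  backoff b;
  while (locked.load(std::memory_order_relaxed)) {
    b.wait();
  }
}
```

Whether the sleep stage pays off depends on how long locks are actually held; the thread suggests hold times are short, so the counts here would need measuring.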
|
Re. write lock path, what allocation are you referring to? New tree node
allocation? If so, that happens before the lock. The allocated nodes are
cached thread-locally in case of restarts to avoid repeated allocs-deallocs.
^ Yes, that is what I meant. Allocation before you take the write lock.
|
All algorithms look like this:
reads: read lock (spinning if needed), do things, see if read lock is still valid, if not restart
writes: read lock (spinning if needed), do things, try to upgrade read lock to write lock with a single CAS, if failed, restart.
So both readers and writers could park threads if the spinlock is replaced with a blocking lock |
I keep forgetting that this is not what they call ROWEX. Ok. So the
reader needs to read the version tag when the write lock is not held. It
can spin if the write lock is held. Otherwise it gets the version tag.
I assume this is done with a load-acquire against the header of the ART
node?
The write lock sets a bit in that 64-bit word in the ART node header which
is the spin lock. So it has exclusive access. Readers and would-be
writers now spin.
The reader checks the post condition to make sure the version tag has not
been modified asynchronously and restarts if it has been modified.
A write/write conflict is mediated by the exclusive lock.
Is that correct?
|
All correct |
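The protocol confirmed above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the class name optimistic_lock, its method names, and the word layout (bit 0 as the write-lock bit, the remaining bits as the version) are hypothetical. It shows the three pieces discussed: a read lock that is a pure read, a validation step, and a write lock taken by a single CAS with no retry.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical optimistic lock word: bit 0 = write-locked, upper bits = version.
class optimistic_lock {
 public:
  using version_t = std::uint64_t;

  // "Read lock": a pure read. Spins while a writer holds the lock,
  // then returns the observed version.
  version_t read_lock() const noexcept {
    for (;;) {
      const version_t v = word_.load(std::memory_order_acquire);
      if ((v & locked_bit) == 0) {
        return v;
      }
      // A CPU spin-wait hint such as _mm_pause() would go here.
    }
  }

  // Post-condition check: true if no writer took the lock since `v` was read.
  // If this returns false, the caller restarts its operation.
  bool check(version_t v) const noexcept {
    std::atomic_thread_fence(std::memory_order_acquire);
    return word_.load(std::memory_order_relaxed) == v;
  }

  // Upgrade to a write lock with a single CAS. On failure the caller restarts;
  // there is no retry loop here.
  bool try_upgrade(version_t v) noexcept {
    return word_.compare_exchange_strong(v, v | locked_bit,
                                         std::memory_order_acquire,
                                         std::memory_order_relaxed);
  }

  // Release the write lock. Adding 1 to a word with bit 0 set clears the lock
  // bit and carries into the version bits, so stale readers fail check().
  void write_unlock() noexcept {
    const version_t v = word_.load(std::memory_order_relaxed);
    assert((v & locked_bit) != 0 && "write_unlock without holding the lock");
    word_.store(v + 1, std::memory_order_release);
  }

 private:
  static constexpr version_t locked_bit = 1;
  std::atomic<version_t> word_{0};
};
```

One caveat for this sketch: data read between read_lock() and check() can be modified concurrently, so in C++ those reads need to be atomic (relaxed is enough) to avoid a data race.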
I am going to close this one for now. There is now a build time option to suppress the use of _mm_pause(). There might be something else to be done here, but it probably warrants its own issue. |
You mentioned an issue with the use of _mm_pause in the spin lock implementation here. We've been using a variant of this spin lock:

    // spin_lock code is taken from: https://rigtorp.se/spinlock/
    // modified to handle ARM
    // Note: cpu_acquiesce() is defined elsewhere and is not shown here;
    // presumably it wraps the CPU's spin-wait hint.
    #include <atomic>

    struct spin_lock_t {
      std::atomic<bool> lock_ = {false};

      void lock() noexcept {
        for (;;) {
          // Optimistically assume the lock is free on the first try
          if (!lock_.exchange(true, std::memory_order_acquire)) {
            return;
          }
          // Wait for lock to be released without generating cache misses
          while (lock_.load(std::memory_order_relaxed)) {
            cpu_acquiesce();
          }
        }
      }

      bool try_lock() noexcept {
        // First do a relaxed load to check if lock is free in order to prevent
        // unnecessary cache misses if someone does while(!try_lock())
        return !lock_.load(std::memory_order_relaxed) &&
               !lock_.exchange(true, std::memory_order_acquire);
      }

      void unlock() noexcept {
        lock_.store(false, std::memory_order_release);
      }
    }; // spin_lock_t