Issue #22: Fix concurrent bulk generation #23

anubhav-pandey1 · 2024-02-26T16:39:26Z

This PR attempts to fix issue #22.

The root cause of the duplicate IDs during concurrent bulk generation seems to lie in how the sequence variable is managed within the Snowflake struct. I think the problem arises from the lack of synchronisation around reads and updates of shared state (in this case, the sequence and last_timestamp fields of the Snowflake struct) when they are accessed by multiple threads.

Why Does This Happen?
In a concurrent environment, multiple threads might call the get_unique_id method on the same Snowflake instance within the same microsecond. Since the current implementation does not include any form of locking or synchronisation, there is a race condition on the sequence field: multiple threads read the same last_timestamp, see that it hasn't changed, and then concurrently attempt to increment the sequence. Because the read and the increment are not performed as one atomic step, the threads might not see each other's updates, so the same sequence value ends up being used for multiple IDs.
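The lost update described above can be replayed deterministically on a single thread. The sketch below is only an illustration: the State struct and the read_sequence, write_sequence, and replay_lost_update helpers are hypothetical names, not code from this repository. It splits the update into a separate read and write, then interleaves two callers by hand.

```rust
// Hypothetical, simplified stand-in for the shared generator state.
struct State {
    last_timestamp: u64,
    sequence: u64,
}

// Step 1 of an unsynchronised update: read the sequence for this timestamp.
fn read_sequence(state: &State, now: u64) -> u64 {
    if now == state.last_timestamp {
        state.sequence
    } else {
        0
    }
}

// Step 2: write back a value computed from the earlier (possibly stale) read.
fn write_sequence(state: &mut State, now: u64, seen: u64) -> u64 {
    state.last_timestamp = now;
    state.sequence = seen + 1;
    state.sequence
}

// Replays the problematic interleaving: A reads, B reads, A writes, B writes.
fn replay_lost_update() -> (u64, u64) {
    let mut state = State { last_timestamp: 100, sequence: 0 };
    let now = 100; // both callers observe the same timestamp
    let seen_a = read_sequence(&state, now);
    let seen_b = read_sequence(&state, now); // stale: A has not written yet
    let seq_a = write_sequence(&mut state, now, seen_a);
    let seq_b = write_sequence(&mut state, now, seen_b);
    (seq_a, seq_b)
}

fn main() {
    let (a, b) = replay_lost_update();
    println!("caller A got sequence {a}, caller B got sequence {b}");
}
```

Both callers end up with the same sequence value for the same timestamp, which is exactly the condition that produces a duplicate ID.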

To fix this, we need to introduce thread-safety into the ID generation process to ensure that concurrent accesses to the sequence and last_timestamp fields are correctly synchronised. In Rust, this can be achieved using synchronisation primitives from the standard library: Mutex from std::sync, or the atomic types from std::sync::atomic. Given that ID generation is performance-critical and must be high-throughput, atomic operations are preferable because they typically incur less overhead than a mutex lock.
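As a rough sketch of the atomic approach (not necessarily the code in this PR), the timestamp and sequence can be packed into a single AtomicU64 and advanced with a compare-and-swap loop, so that the read and the increment succeed or fail together. The AtomicSnowflake type, the generate_concurrently helper, the millisecond clock, the Unix epoch, and the 41/10/12 bit layout are all assumptions made for illustration.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{SystemTime, UNIX_EPOCH};

const SEQUENCE_BITS: u64 = 12; // assumed layout: 12 sequence bits
const SEQUENCE_MASK: u64 = (1 << SEQUENCE_BITS) - 1;

// Hypothetical generator: high bits of `state` hold the last timestamp (ms),
// the low 12 bits hold the sequence, so both change in one atomic step.
struct AtomicSnowflake {
    state: AtomicU64,
    machine_id: u64,
}

impl AtomicSnowflake {
    fn new(machine_id: u64) -> Self {
        Self { state: AtomicU64::new(0), machine_id: machine_id & 0x3FF }
    }

    fn now_millis() -> u64 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system clock is before the Unix epoch")
            .as_millis() as u64
    }

    fn get_unique_id(&self) -> u64 {
        loop {
            let current = self.state.load(Ordering::Acquire);
            let last_ts = current >> SEQUENCE_BITS;
            let seq = current & SEQUENCE_MASK;
            // Clamp so the recorded timestamp never moves backwards.
            let now = Self::now_millis().max(last_ts);
            let next = if now > last_ts {
                now << SEQUENCE_BITS // new millisecond: sequence restarts at 0
            } else if seq < SEQUENCE_MASK {
                current + 1 // same millisecond: bump the sequence
            } else {
                std::hint::spin_loop(); // sequence exhausted: wait for the clock
                continue;
            };
            // Publish only if no other thread changed the state in the meantime.
            if self
                .state
                .compare_exchange_weak(current, next, Ordering::AcqRel, Ordering::Acquire)
                .is_ok()
            {
                let ts = next >> SEQUENCE_BITS;
                let seq = next & SEQUENCE_MASK;
                return (ts << 22) | (self.machine_id << SEQUENCE_BITS) | seq;
            }
        }
    }
}

// Generates IDs from several threads sharing one generator.
fn generate_concurrently(threads: usize, per_thread: usize) -> Vec<u64> {
    let generator = Arc::new(AtomicSnowflake::new(1));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let g = Arc::clone(&generator);
            thread::spawn(move || (0..per_thread).map(|_| g.get_unique_id()).collect::<Vec<u64>>())
        })
        .collect();
    handles
        .into_iter()
        .flat_map(|h| h.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    let ids = generate_concurrently(8, 10_000);
    let unique: HashSet<u64> = ids.iter().copied().collect();
    println!("generated {} IDs, {} unique", ids.len(), unique.len());
}
```

Because every successful compare-and-swap moves the packed state to a strictly larger value, no two callers can be handed the same (timestamp, sequence) pair. A Mutex around both fields would give the same guarantee with simpler code, at the cost of blocking under contention.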



anubhav-pandey1 · 2024-02-26T16:41:37Z

@tangledbytes Please take a look at this PR which might fix issue #22. Please feel free to use the code and mould it to suit your coding standards and style.

anubhav-pandey1 added 2 commits February 26, 2024 22:06

Merge pull request #1 from anubhav-pandey1/fix/concurrent-bulk-genera…

9405c3a

…tion Fix concurrent bulk generation issues

anubhav-pandey1 mentioned this pull request Feb 27, 2024

Concurrent bulk generation of Snowflake IDs from same machine is failing #22

Open
