
rebis-dev: Make AtomTable fully safe and prepare for garbage collection #2736

Draft · wants to merge 20 commits into base: rebis-dev
Conversation

@adri326 (Contributor) commented Dec 31, 2024

This is a follow-up to #2727 that gives AtomTable the ability to verify that Atoms are safe to dereference, and to shuffle their layout for garbage collection purposes, by adding another layer of indirection.


Prior to this change, Atoms would contain the offset of their AtomHeader within a buffer. This lets us resize the buffer to add new atoms, but it is prone to issues caused by atoms with invalid offsets and it doesn't let us defragment the AtomTable once garbage collection is introduced.

                 |   buffer   |
                 +------------+
                 |    ...     |
Atom(0x12580) -> | AtomHeader | @12580
                 | "Hello wo" | @12588
                 | "rld!"____ | @12590
                 |    ...     |

With this change, the Atoms now store an index into an array of indices, meaning that verifying the validity of an atom is as simple as checking that it is within this array:

            | offsets |    |   buffer   |
            +---------+    +------------+
            |   ...   |    |    ...     |
Atom(13) -> | 0x12580 | -> | AtomHeader | @12580
            |   ...   |    | "Hello wo" | @12588
            |   ...   |    | "rld!"____ | @12590
            |   ...   |    |    ...     |

This also lets us safely change the order of atoms within the buffer, by changing the offsets, without needing to modify the existing Atoms, which paves the way towards garbage collection within AtomTable.
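To illustrate the new layout, here is a minimal, single-threaded sketch of the indexed scheme. All names are illustrative (the real AtomTable stores a richer AtomHeader in a raw buffer and uses atomics); the point is that an `Atom` is an index into `offsets`, so validating it is a plain bounds check:

```rust
const HDR: usize = std::mem::size_of::<usize>();

struct AtomTable {
    buffer: Vec<u8>,     // headers followed by their string data
    offsets: Vec<usize>, // atom index -> byte offset of its header
}

#[derive(Clone, Copy, Debug)]
struct Atom(usize); // an index into `offsets`, not a raw byte offset

impl AtomTable {
    fn build_with(&mut self, s: &str) -> Atom {
        let offset = self.buffer.len();
        // Here the "header" is just the string length; the real AtomHeader
        // carries more metadata.
        self.buffer.extend_from_slice(&s.len().to_ne_bytes());
        self.buffer.extend_from_slice(s.as_bytes());
        self.offsets.push(offset);
        Atom(self.offsets.len() - 1)
    }

    fn resolve(&self, atom: Atom) -> Option<&str> {
        // Verifying an atom is just a bounds check on the offsets array.
        let offset = *self.offsets.get(atom.0)?;
        let len =
            usize::from_ne_bytes(self.buffer.get(offset..offset + HDR)?.try_into().ok()?);
        std::str::from_utf8(self.buffer.get(offset + HDR..offset + HDR + len)?).ok()
    }
}

fn main() {
    let mut table = AtomTable { buffer: Vec::new(), offsets: Vec::new() };
    let atom = table.build_with("Hello world!");
    assert_eq!(table.resolve(atom), Some("Hello world!"));
    // An out-of-range index is rejected up front.
    assert_eq!(table.resolve(Atom(999)), None);
}
```

Defragmentation then only needs to rewrite `offsets`; every outstanding `Atom` keeps its index.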

One important detail with my implementation is that the offsets array is stored within an Arcu, which means that the read from the buffer in Atom::as_ptr now depends on atom_table.inner.offsets.read(), which is an acquire atomic operation. When a new atom is added to the table with AtomTable::build_with, the write to the buffer is sequenced before atom_table.inner.offsets.replace(new_offsets): Atom::as_ptr is now guaranteed (by my limited experience with atomics) to observe the changes as expected.

Instances where the lack of this property causes actual issues are exceedingly rare: a thread would need to somehow gain access to an atom offset created on another thread, without synchronization, and immediately try to read its data.
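The ordering argument above is the standard release/acquire publication pattern. Here is a self-contained sketch of it using `AtomicPtr` from the standard library (not the Arcu API, whose internals may differ): the writes that build the new buffer happen-before the release store, so a reader that does an acquire load is guaranteed to see a fully initialized buffer:

```rust
use std::sync::atomic::{AtomicPtr, Ordering};
use std::sync::Arc;
use std::thread;

fn publish_and_read() -> Vec<u8> {
    let slot = Arc::new(AtomicPtr::new(Box::into_raw(Box::new(vec![1u8]))));

    let writer = {
        let slot = Arc::clone(&slot);
        thread::spawn(move || {
            // The writes that build the new buffer...
            let new = Box::into_raw(Box::new(vec![1u8, 2, 3]));
            // ...are sequenced before this release store.
            let old = slot.swap(new, Ordering::Release);
            // Freeing `old` is safe here only because no reader still holds it;
            // this is the reclamation problem that Arcu handles in general.
            drop(unsafe { Box::from_raw(old) });
        })
    };
    writer.join().unwrap();

    // The acquire load synchronizes with the release store, so the reader
    // observes the fully initialized new buffer, never a half-written one.
    let ptr = slot.load(Ordering::Acquire);
    let data = unsafe { (*ptr).clone() };
    // Clean up the final buffer (the AtomicPtr itself does not free it).
    drop(unsafe { Box::from_raw(ptr) });
    data
}

fn main() {
    assert_eq!(publish_and_read(), vec![1, 2, 3]);
}
```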

The Sync-ness of AtomTable is now proved, making it (hopefully) safe :)


Please don't be scared by the number of commits and the number of lines changed, they will reduce once #2727 gets merged.

@triska (Contributor) commented Jan 5, 2025

Thank you a lot, very interesting!

One conceptual question I have about this: What exactly is it that makes, for example, defragmentation possible in this representation but not in the other? It seems that if one is able to do it in one of them, then it should also be possible in the other (possibly by temporarily building such an index, though with the benefit of avoiding the indirection every time an atom is referenced).

@adri326 (Contributor, Author) commented Jan 5, 2025

It's a compromise between performance and safety, mainly. Defragmentation was already possible beforehand, but it required going through the heap to modify any affected Atom, and ensuring that any operation that could lead to garbage collection of the AtomTable doesn't accidentally invalidate a copy of an Atom on the stack.

The current issue with AtomTable is that if someone has the ability to craft their own Atom, then they can trigger UB as follows:

  1. First, allocate a regular atom which contains the bitpattern of an AtomHeader
  2. Then, craft an Atom whose offset points at the start of that crafted AtomHeader
  3. Reading from that Atom can now read uninitialized memory or cause a data race if another thread allocates a new atom in the meantime
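To make the exploit concrete, here is a hypothetical safe simulation of the old raw-offset scheme (names illustrative). Because `resolve` trusts the offset, a crafted `RawAtom` pointing into another atom's payload is indistinguishable from a real one; in this safe sketch it merely exposes bytes belonging to other atoms, whereas in the real `unsafe` code the same confusion is undefined behavior:

```rust
const HDR: usize = std::mem::size_of::<usize>();

struct RawAtomTable {
    buffer: Vec<u8>, // headers followed by their string data
}

#[derive(Clone, Copy)]
struct RawAtom(usize); // a raw byte offset: nothing stops crafting one

impl RawAtomTable {
    fn build_with(&mut self, s: &str) -> RawAtom {
        let offset = self.buffer.len();
        self.buffer.extend_from_slice(&s.len().to_ne_bytes()); // the "header"
        self.buffer.extend_from_slice(s.as_bytes());
        RawAtom(offset)
    }

    // Trusts the offset, like the pre-change code: there is no table of
    // valid offsets to check against.
    fn resolve(&self, atom: RawAtom) -> Option<&[u8]> {
        let len =
            usize::from_ne_bytes(self.buffer.get(atom.0..atom.0 + HDR)?.try_into().ok()?);
        self.buffer.get(atom.0 + HDR..atom.0 + HDR + len)
    }
}

fn main() {
    let mut table = RawAtomTable { buffer: Vec::new() };
    // Step 1: a regular atom whose *payload* is the bitpattern of a header.
    let payload = String::from_utf8(3usize.to_ne_bytes().to_vec()).unwrap();
    let real = table.build_with(&payload);
    table.build_with("abc");
    // Step 2: craft an atom whose offset points at the payload, not a header.
    let forged = RawAtom(real.0 + HDR);
    // Step 3: the table cannot tell it apart from a real atom and "resolves"
    // it, reading bytes that straddle other atoms.
    assert_eq!(table.resolve(real).unwrap(), payload.as_bytes());
    assert!(table.resolve(forged).is_some());
}
```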

I have thought about multiple ways to ensure that that doesn't happen, which fall into two categories:

  • Make step 2 impossible, by validating the offset, so that the rest of the code can rely on invariants of AtomHeader. I don't know of any method that is faster than indexing into a contiguous array of offsets, and since we are indexing into an array containing the offset already, then we might as well lean into the added indirection.
  • Make step 3 impossible, by validating the AtomHeader. This would require adding atomics to prevent data races and zero-initializing allocated memory.

I personally believe that choosing to make step 3 impossible will lead to code that is harder to reason about, which, given the fuzzy nature of the human mind, can increase the number of bugs that could sneak through.

There is, of course, a performance penalty to making step 2 impossible. I've measured my method to increase the benchmark times by between 3% and 5%.

@triska (Contributor) commented Jan 6, 2025

Thank you a lot, this seems very well thought through!

In my opinion, we need unbreakable software, even if it is 10 times slower than other systems. A 5% overhead is completely acceptable.

@adri326 adri326 changed the title Make AtomTable fully safe and prepare for garbage collection rebis-dev: Make AtomTable fully safe and prepare for garbage collection Feb 4, 2025