Skip to content

Conversation

@krakow10
Copy link
Contributor

@krakow10 krakow10 commented Dec 3, 2025

This has always bothered me, maybe I am good enough at Rust now to tackle this.

  • Arc::from(&[u8]) allocates a new copy of the slice, which is exactly what we want.
  • SharedString::new takes impl AsRef<[u8]> for easy migration of previous callers.
  • Option was removed. I can see that it was there for the drop impl, but I hope that my reasoning that it can be removed is sound.
  • Prior art for Arc race conditions: Memory leak in SharedString #350
  • There are additional intermediate allocation optimizations that can be made in conjunction with Zero-Copy ReadSlice Trait #530: remove the .to_owned() after .read_binary_string()? when constructing a SharedString or NetAssetRef

I wish I knew how to test for memory leaks!

@krakow10 krakow10 marked this pull request as draft December 3, 2025 06:15
@krakow10
Copy link
Contributor Author

krakow10 commented Dec 3, 2025

c3cc3fb is unsound. The standard library needs something like Arc::into_inner that works for unsized inner types.

@Dekkonot
Copy link
Member

Dekkonot commented Dec 3, 2025

I wish I knew how to test for memory leaks!

The best way from my understanding is Miri.

@krakow10
Copy link
Contributor Author

krakow10 commented Dec 3, 2025

This is unsound as noted in c3cc3fb. The standard library needs something like Arc::into_inner that works for unsized inner types.

rust-lang/libs-team#608
Arc::into_unique(Self) -> Option<UniqueArc> looks like it would work here. It has been proposed and accepted, but not implemented yet. It also depends on the unstable feature unique_rc_arc.

Edit:
Looking at the table in the into_unique thread:

Status &mut Arc<T> -> &mut T Arc<T> -> UniqueArc<T>
fail if other strong or weak get_mut ?
fail if other strong, disassociate weak ? proposed into_unique IIUC
clone if other strong, disassociate weak make_mut ?

What we need is the "fail if other strong, disassociate weak" case, and the left or right columns will both work. I think we should prefer the left column because we have an &mut Arc<T> and not an Arc<T>. That function signature would be something like Arc<T>::to_mut(&mut Self) -> Option<&mut T>. We actually don't care about the "dissociate weak" part, all we want is a special drop function that tells us if it's the last strong reference. So the perfect function would be Arc::drop_arc_and_tell_me_if_it_was_the_last_one(Self) -> bool, but I hardly think that belongs in the standard library.

@krakow10 krakow10 marked this pull request as ready for review December 4, 2025 00:05
@krakow10
Copy link
Contributor Author

krakow10 commented Dec 4, 2025

I've come up with an entirely new strategy. After reading the Arc source code for a while, I've come to understand that once the strong count hits 0, Weak references cannot be upgraded anymore and are effectively locked out. I devised code that drops the arc as early as possible, and then counts on that to plug all the parallelism holes. I hope my reasoning in the comments is correct.

Non-Deduplicated SharedString Bug

I added new logic to prevent removing the wrong Weak reference from the string cache. The bug has to do with what happens when the drop thread tries to acquire the string cache lock while it is held by SharedString::new in the Weak::upgrade's None path. I believe master branch can also suffer from this with a very precise and unlikely order of operations:

if Arc::into_inner(self.data.take().unwrap()).is_some() {
let mut cache = match STRING_CACHE.lock() {

  1. The last SharedString of "asdf" is being dropped. The if statement evaluates to true, but the string cache lock is not yet acquired.
  2. Another thread calls SharedString::new("asdf") and acquires the string cache lock. This thread cannot upgrade the existing weak ref, so it enters the None path, creates a new Arc, and inserts a weak ref.
  3. The Drop impl acquires the string cache lock and deletes the weak reference that was just inserted by the other thread.
  4. The next call to SharedString::new("asdf") creates a duplicate string because there is no string cache entry.

Performance

I tested on my laptop and there is no discernible performance impact.

Miri

I installed miri and tried cargo miri test but it errors out because of some isolation shenanigans. I will need to learn more about miri if I want to use it for memory profiling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants