Refactor linear memory to use Atomics #301

wucke13 · 2026-01-02T15:53:59Z

Pull Request Overview

After reading an insightful set of comments from @hoshinolina (again: thank you!) on Lobste.rs, I was convinced to rewrite the linear memory using `AtomicU8` instead of `UnsafeCell<u8>`.
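For readers unfamiliar with the idea, here is a minimal sketch of a byte-wise atomic linear memory. It is not the code from this PR: the type `LinearMem`, the methods `store_u32`/`load_u32`, and the choice of `Ordering::Relaxed` are all assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Hypothetical, simplified linear memory. Every byte is an `AtomicU8`,
/// so shared access through `&self` is not a data race in Rust terms.
pub struct LinearMem {
    bytes: Vec<AtomicU8>,
}

impl LinearMem {
    /// Creates a zero-initialized memory of `len` bytes.
    pub fn new(len: usize) -> Self {
        Self {
            bytes: (0..len).map(|_| AtomicU8::new(0)).collect(),
        }
    }

    /// Models a non-atomic Wasm `i32.store` as four relaxed byte stores
    /// (assumed ordering). Returns `None` if the access is out of bounds,
    /// which would be a trap in Wasm terms.
    pub fn store_u32(&self, addr: usize, value: u32) -> Option<()> {
        let end = addr.checked_add(4)?;
        let cells = self.bytes.get(addr..end)?;
        for (cell, byte) in cells.iter().zip(value.to_le_bytes()) {
            cell.store(byte, Ordering::Relaxed);
        }
        Some(())
    }

    /// Models a non-atomic Wasm `i32.load` as four relaxed byte loads.
    pub fn load_u32(&self, addr: usize) -> Option<u32> {
        let end = addr.checked_add(4)?;
        let cells = self.bytes.get(addr..end)?;
        let mut buf = [0u8; 4];
        for (dst, cell) in buf.iter_mut().zip(cells) {
            *dst = cell.load(Ordering::Relaxed);
        }
        Some(u32::from_le_bytes(buf))
    }
}
```

Note that a multi-byte access built this way can tear under concurrent writers: each byte is read or written atomically, but the four bytes together are not.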

TODO or Help Wanted

~~It doesn't work yet 😆.~~

There must be a subtle (to me) bug in the rewrite, causing one of our internal tests (memory_init_test_4) and a handful of the memory_copy.wast and memory_init.wast tests to fail. I don't have the patience to debug it today.

Edit 1:

~~TESTSUITE_SAVE=1 ALLOW_TEST_PATTERN=memory_copy.wast cargo test --test wasm_spec_testsuite -- --nocapture | grep ❌ reveals the specific test statements that fail for mem.copy.~~

Edit 2:

All affected functions invoke mem.copy, so the issue is likely there. Run `TESTSUITE_SAVE=1 ALLOW_TEST_PATTERN=memory_(init|copy).wast cargo test --test wasm_spec_testsuite -- --nocapture | grep ❌` to list them all.

Edit 3:

The problem occurred for mem.copy within the same memory when source and destination overlap and the source index is smaller than the destination index. In that case, a forward copy overwrote source bytes before they had been read, corrupting the result. The simple fix: for this specific case, do the copy in reverse order.
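The direction choice can be sketched as follows. This is an illustration under assumptions, not the code merged here: the free function `copy_within`, its signature, and the relaxed ordering are hypothetical.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Hypothetical overlap-aware copy of `len` bytes from `src` to `dst`
/// within one memory. Returns `None` if either range is out of bounds.
pub fn copy_within(mem: &[AtomicU8], dst: usize, src: usize, len: usize) -> Option<()> {
    // Check both ranges up front so that nothing is written on failure.
    let src_end = src.checked_add(len)?;
    let dst_end = dst.checked_add(len)?;
    if src_end > mem.len() || dst_end > mem.len() {
        return None;
    }

    // Copies the byte at offset `i` of the source range to the same
    // offset of the destination range.
    let copy_byte = |i: usize| {
        let byte = mem[src + i].load(Ordering::Relaxed);
        mem[dst + i].store(byte, Ordering::Relaxed);
    };

    if src < dst {
        // The destination starts after the source. A forward copy could
        // overwrite source bytes that were not read yet, so go backwards.
        (0..len).rev().for_each(copy_byte);
    } else {
        // A forward copy is safe: each write lands at or before the
        // position that is read next.
        (0..len).for_each(copy_byte);
    }
    Some(())
}
```

With memory contents `0..8`, copying 4 bytes from index 0 to index 2 should leave bytes 2 through 5 holding the original values `0, 1, 2, 3`. A naive forward loop would instead repeat `0, 1` across the overlap.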

Checks

Using Nix
- Ran nix fmt
- Ran nix flake check '.?submodules=1'
Using Rust tooling
- Ran cargo fmt
- Ran cargo test
- Ran cargo check
- Ran cargo build
- Ran cargo doc

Benchmark Results

This does in fact negatively affect performance. The effect is most visible on the memory load/store-hungry fibonacci_loop benchmark, where we see a moderate increase of ~11% in runtime.

group                          benchmark-current.baseline             benchmark-main.baseline
-----                          --------------------------             -----------------------
fibonacci_loop/our/1           1.07   178.5±14.07ns  5.3 MElem/sec    1.00    167.3±2.08ns  5.7 MElem/sec
fibonacci_loop/our/2           1.06   234.8±17.79ns  8.1 MElem/sec    1.00    221.1±2.60ns  8.6 MElem/sec
fibonacci_loop/our/4           1.06   359.5±16.87ns 10.6 MElem/sec    1.00    338.9±2.11ns 11.3 MElem/sec
fibonacci_loop/our/8           1.08   599.0±31.91ns 12.7 MElem/sec    1.00   554.6±69.96ns 13.8 MElem/sec
fibonacci_loop/our/16          1.13  1079.8±19.48ns 14.1 MElem/sec    1.00   959.6±16.83ns 15.9 MElem/sec
fibonacci_loop/our/32          1.12  1995.7±54.10ns 15.3 MElem/sec    1.00  1774.3±27.57ns 17.2 MElem/sec
fibonacci_loop/our/64          1.12      3.8±0.09µs 15.9 MElem/sec    1.00      3.4±0.06µs 17.8 MElem/sec
fibonacci_loop/our/128         1.10      7.5±0.19µs 16.2 MElem/sec    1.00      6.8±0.03µs 17.9 MElem/sec
fibonacci_loop/our/256         1.06     15.3±0.28µs 16.0 MElem/sec    1.00     14.4±1.12µs 17.0 MElem/sec
fibonacci_loop/our/512         1.12     29.7±0.63µs 16.5 MElem/sec    1.00     26.5±0.44µs 18.4 MElem/sec
fibonacci_loop/our/1024        1.01     60.3±1.48µs 16.2 MElem/sec    1.00     59.5±1.21µs 16.4 MElem/sec
fibonacci_loop/our/2048        1.00    119.1±2.63µs 16.4 MElem/sec    1.02    121.4±2.08µs 16.1 MElem/sec
fibonacci_loop/our/4096        1.10   238.0±13.36µs 16.4 MElem/sec    1.00   215.5±10.87µs 18.1 MElem/sec
fibonacci_loop/our/8192        1.14   480.9±39.01µs 16.2 MElem/sec    1.00    420.4±6.57µs 18.6 MElem/sec
fibonacci_loop/our/16384       1.14   950.9±70.02µs 16.4 MElem/sec    1.00   836.4±11.83µs 18.7 MElem/sec
fibonacci_loop/our/32768       1.13  1903.9±27.69µs 16.4 MElem/sec    1.00  1690.7±29.00µs 18.5 MElem/sec
fibonacci_loop/our/65536       1.12      3.8±0.34ms 16.3 MElem/sec    1.00      3.4±0.01ms 18.3 MElem/sec
fibonacci_loop/our/131072      1.10      7.6±0.43ms 16.5 MElem/sec    1.00      6.9±0.38ms 18.1 MElem/sec
fibonacci_loop/our/262144      1.12     15.2±0.79ms 16.5 MElem/sec    1.00     13.6±0.97ms 18.4 MElem/sec
fibonacci_loop/our/524288      1.14     30.7±1.17ms 16.3 MElem/sec    1.00     27.0±1.03ms 18.5 MElem/sec
fibonacci_loop/our/1048576     1.10     60.1±1.45ms 16.6 MElem/sec    1.00     54.6±1.12ms 18.3 MElem/sec
fibonacci_recursive/our/1      1.01    206.9±3.08ns  4.6 MElem/sec    1.00    203.9±2.42ns  4.7 MElem/sec
fibonacci_recursive/our/2      1.00    288.0±4.35ns  6.6 MElem/sec    1.09   313.2±18.09ns  6.1 MElem/sec
fibonacci_recursive/our/4      1.00    484.6±3.07ns  7.9 MElem/sec    1.00    483.2±5.84ns  7.9 MElem/sec
fibonacci_recursive/our/8      1.00   769.4±14.46ns  9.9 MElem/sec    1.01   779.2±16.32ns  9.8 MElem/sec
fibonacci_recursive/our/16     1.05  1425.9±24.97ns 10.7 MElem/sec    1.00  1352.5±14.86ns 11.3 MElem/sec
fibonacci_recursive/our/32     1.02      2.5±0.04µs 12.0 MElem/sec    1.00      2.5±0.12µs 12.3 MElem/sec
fibonacci_recursive/our/64     1.03      4.6±0.06µs 13.2 MElem/sec    1.00      4.5±0.05µs 13.6 MElem/sec
fibonacci_recursive/our/128    1.04      8.9±0.17µs 13.7 MElem/sec    1.00      8.6±0.08µs 14.3 MElem/sec
fibonacci_recursive/our/256    1.01     16.7±0.25µs 14.6 MElem/sec    1.00     16.5±0.25µs 14.8 MElem/sec
fibonacci_recursive/our/512    1.03     34.0±3.88µs 14.3 MElem/sec    1.00     33.0±1.29µs 14.8 MElem/sec

GitHub Issue

This approach presents a path towards solving #162.

wucke13 · 2026-01-02T18:11:15Z

Copying takes place as if the bytes were copied from src to a temporary array and then copied from the array to dst.

Found the error. For overlapping source and destination, the copy order must ensure that no byte of src is overwritten before it has been read.

florianhartung · 2026-01-02T18:26:38Z

I think we can even take advantage of AtomicU8::get_mut_slice for host accesses, although this requires nightly Rust as of now :/
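As a rough illustration of the host-access idea on stable Rust, the per-element `AtomicU8::get_mut` can stand in for the nightly slice API. The helper `host_fill` below is hypothetical and not part of this PR; it only shows that exclusive (`&mut`) access allows plain, non-atomic writes.

```rust
use std::sync::atomic::AtomicU8;

/// Hypothetical host-side helper. Holding `&mut [AtomicU8]` proves that
/// no other thread can access the memory, so each byte can be written
/// through `get_mut` without any atomic operation.
pub fn host_fill(mem: &mut [AtomicU8], data: &[u8]) {
    // `zip` stops at the shorter of the two slices, so excess input
    // bytes are ignored and untouched memory bytes keep their value.
    for (cell, &byte) in mem.iter_mut().zip(data) {
        *cell.get_mut() = byte;
    }
}
```

The nightly slice variant would hand out a `&mut [u8]` in one call, which is more convenient for bulk operations such as `copy_from_slice`.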

codecov · 2026-01-05T15:33:42Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines	Coverage Δ
src/execution/store/linear_memory.rs	`97.52% <100.00%> (+0.39%)`	⬆️

src/execution/store/linear_memory.rs

After an enlightening discussion with Asahi Lina[1], I was left convinced that using atomic operations to implement non-atomic Wasm instructions might be a good idea after all. This commit is the manifestation of this realization.

[1] https://lobste.rs/s/cwdone/why_are_we_worried_about_memory_access#c_obd8av

Co-authored-by: Florian <[email protected]>
Signed-off-by: Wanja Zaeske <[email protected]>

wucke13 force-pushed the dev/wucke13/refactor-lin-memory branch from 870f4c5 to d374b17 Compare January 5, 2026 15:32

wucke13 force-pushed the dev/wucke13/refactor-lin-memory branch 2 times, most recently from 3e1c1b8 to 7d07198 Compare January 7, 2026 15:02

cemonem previously approved these changes Jan 8, 2026

cemonem reviewed Jan 8, 2026

florianhartung reviewed Jan 9, 2026

florianhartung reviewed Jan 9, 2026

wucke13 dismissed cemonem’s stale review via 5cb9f12 January 9, 2026 10:46

wucke13 force-pushed the dev/wucke13/refactor-lin-memory branch 3 times, most recently from d652567 to 4c25ad1 Compare January 9, 2026 14:13

cemonem approved these changes Jan 9, 2026

wucke13 force-pushed the dev/wucke13/refactor-lin-memory branch from 4c25ad1 to 9b8e96e Compare January 9, 2026 14:21

florianhartung approved these changes Jan 9, 2026

wucke13 added this pull request to the merge queue Jan 9, 2026

Merged via the queue into main with commit 2af195b Jan 9, 2026
13 checks passed

wucke13 deleted the dev/wucke13/refactor-lin-memory branch January 9, 2026 14:51

florianhartung mentioned this pull request Jan 9, 2026

Soundness of LinearMemory: The issue with concurrent write access #162

Closed
