Optimize Cairo 0 execution #2206

JulianGCalderon · 2025-09-23T16:03:40Z

Optimize Cairo 0 execution

Description

This PR includes two minor optimizations, mainly targeting Cairo 0 execution. I benchmarked block 10000 and compared the execution against v2.5.0. All benchmarks were run on my M4 MacBook Pro.

a8daa0a: Use with_capacity to avoid reallocations when inserting into the HashMap in get_ids_data.
b4c8768: Use insert_all to load contiguous memory cells all at once, instead of one at a time.
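
As a rough illustration of the first commit, the sketch below pre-sizes a map before filling it. `build_ids` and its argument are hypothetical names chosen for this example; it is not the body of `get_ids_data`, only the general `HashMap::with_capacity` pattern.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for building an ids map; the real `get_ids_data`
/// works on different types.
fn build_ids(names: &[&str]) -> HashMap<String, usize> {
    // Pre-size the map so that inserting `names.len()` entries does not
    // force the table to grow and rehash along the way.
    let mut ids = HashMap::with_capacity(names.len());
    for (index, name) in names.iter().enumerate() {
        ids.insert(name.to_string(), index);
    }
    ids
}
```

Calling `build_ids(&["a", "b", "c"])` yields a map of three entries whose capacity was fixed up front. The gain depends on how many entries are inserted per call.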

Benchmarks

I replayed multiple block ranges on my M4 MacBook Pro:

Mainnet 10000 - 5% improvement
Mainnet 20000 to 20010 - 5% improvement
Mainnet 2000000 to 2000010 - 2% improvement

By changing the compile_hint parameter type to `Arc` instead of `Rc`, we can reuse the constants that are already included in `Program`. This avoids cloning all the constants. With this commit, there is a 9.8% performance increase when replaying mainnet block 10000, compared to 2.5.0.
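
The general idea of sharing constants instead of cloning them can be sketched as follows. `ProgramSketch` and `shared_constants` are hypothetical names for this example and do not reflect the real `Program` type or the `compile_hint` signature; the point is only that cloning a reference-counted handle does not copy the underlying map.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical holder for program constants; not the real `Program` type.
struct ProgramSketch {
    constants: Arc<HashMap<String, u64>>,
}

/// Hands out a shared handle to the constants. `Arc::clone` bumps a
/// reference count instead of copying every entry of the map.
fn shared_constants(program: &ProgramSketch) -> Arc<HashMap<String, u64>> {
    Arc::clone(&program.constants)
}
```

Both handles point at the same allocation, which can be checked with `Arc::ptr_eq`.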

github-actions · 2025-09-23T16:21:32Z

Benchmark Results for unmodified programs 🚀

Command	Mean [s]	Min [s]	Max [s]	Relative
`base big_factorial`	2.135 ± 0.018	2.113	2.168	1.00 ± 0.01
`head big_factorial`	2.133 ± 0.008	2.124	2.152	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base big_fibonacci`	2.058 ± 0.006	2.047	2.066	1.00
`head big_fibonacci`	2.062 ± 0.016	2.044	2.096	1.00 ± 0.01

Command	Mean [s]	Min [s]	Max [s]	Relative
`base blake2s_integration_benchmark`	7.681 ± 0.063	7.612	7.818	1.00
`head blake2s_integration_benchmark`	7.701 ± 0.153	7.596	8.105	1.00 ± 0.02

Command	Mean [s]	Min [s]	Max [s]	Relative
`base compare_arrays_200000`	2.193 ± 0.010	2.179	2.207	1.01 ± 0.01
`head compare_arrays_200000`	2.173 ± 0.009	2.159	2.183	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base dict_integration_benchmark`	1.426 ± 0.006	1.420	1.437	1.00
`head dict_integration_benchmark`	1.431 ± 0.018	1.417	1.480	1.00 ± 0.01

Command	Mean [s]	Min [s]	Max [s]	Relative
`base field_arithmetic_get_square_benchmark`	1.232 ± 0.008	1.220	1.245	1.01 ± 0.01
`head field_arithmetic_get_square_benchmark`	1.223 ± 0.006	1.210	1.231	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base integration_builtins`	7.764 ± 0.034	7.706	7.806	1.00
`head integration_builtins`	7.775 ± 0.020	7.742	7.807	1.00 ± 0.01

Command	Mean [s]	Min [s]	Max [s]	Relative
`base keccak_integration_benchmark`	8.055 ± 0.145	7.922	8.327	1.00 ± 0.04
`head keccak_integration_benchmark`	8.018 ± 0.240	7.895	8.695	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base linear_search`	2.163 ± 0.009	2.145	2.174	1.00
`head linear_search`	2.167 ± 0.038	2.140	2.268	1.00 ± 0.02

Command	Mean [s]	Min [s]	Max [s]	Relative
`base math_cmp_and_pow_integration_benchmark`	1.516 ± 0.006	1.508	1.528	1.00 ± 0.01
`head math_cmp_and_pow_integration_benchmark`	1.514 ± 0.021	1.500	1.572	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base math_integration_benchmark`	1.469 ± 0.009	1.459	1.486	1.01 ± 0.01
`head math_integration_benchmark`	1.459 ± 0.006	1.447	1.467	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base memory_integration_benchmark`	1.224 ± 0.003	1.219	1.229	1.01 ± 0.01
`head memory_integration_benchmark`	1.211 ± 0.006	1.202	1.219	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base operations_with_data_structures_benchmarks`	1.569 ± 0.008	1.562	1.587	1.00
`head operations_with_data_structures_benchmarks`	1.578 ± 0.012	1.568	1.613	1.01 ± 0.01

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base pedersen`	530.4 ± 3.3	526.9	537.6	1.00
`head pedersen`	531.4 ± 3.1	528.0	536.5	1.00 ± 0.01

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base poseidon_integration_benchmark`	637.7 ± 7.7	625.5	654.8	1.01 ± 0.01
`head poseidon_integration_benchmark`	629.4 ± 3.9	622.5	637.1	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base secp_integration_benchmark`	1.853 ± 0.013	1.843	1.883	1.01 ± 0.01
`head secp_integration_benchmark`	1.832 ± 0.017	1.814	1.869	1.00

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base set_integration_benchmark`	633.2 ± 2.1	629.8	637.1	1.04 ± 0.01
`head set_integration_benchmark`	610.4 ± 2.7	604.6	613.4	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base uint256_integration_benchmark`	4.281 ± 0.067	4.243	4.465	1.01 ± 0.02
`head uint256_integration_benchmark`	4.241 ± 0.014	4.225	4.270	1.00

codecov · 2025-09-23T16:28:25Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.66%. Comparing base (065c8f4) to head (4612ef6).

Additional details and impacted files

@@           Coverage Diff           @@
##            2.x.y    #2206   +/-   ##
=======================================
  Coverage   96.66%   96.66%           
=======================================
  Files         103      103           
  Lines       43646    43683   +37     
=======================================
+ Hits        42191    42228   +37     
  Misses       1455     1455

DiegoCivi · 2025-09-25T13:32:04Z

vm/src/vm/vm_memory/memory.rs

+        if segment.len() < value_offset + vals.len() {
+            segment.reserve(value_offset + vals.len() - segment.len());
+        }


nit: I think we can remove the if since the documentation of reserve() says:

Does nothing if capacity is already sufficient

I believe they refer to different things:

segment.len() < value_offset + vals.len() checks whether the length of the segment is enough to hold the new elements. If not, it reserves capacity for the additional elements.

The inner check in reserve() tests whether the capacity of the segment is enough to hold the new elements. The capacity refers to the vector's allocated memory.

The if is required, as we only need to reserve capacity when the segment's length is not already enough. Without the condition, we would sometimes be calling reserve with a negative argument (which would underflow, as we are using a usize).
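
To make the length/capacity distinction and the underflow concrete, here is a small sketch. `additional_needed` and `len_and_capacity` are hypothetical helpers written for this comment, and `saturating_sub` is shown only as an alternative spelling of the guard, not as what the PR does.

```rust
/// Returns how many extra slots are needed so that `needed_len` elements fit,
/// or 0 when the vector is already long enough.
fn additional_needed(current_len: usize, needed_len: usize) -> usize {
    // A plain `needed_len - current_len` would underflow (and panic in debug
    // builds) whenever `needed_len < current_len`; `saturating_sub` clamps to 0.
    needed_len.saturating_sub(current_len)
}

/// Shows that length and capacity are tracked separately.
fn len_and_capacity() -> (usize, usize) {
    // Room for at least 8 elements is allocated, but none are stored yet.
    let mut segment: Vec<u64> = Vec::with_capacity(8);
    segment.extend([1, 2, 3]);
    // Length counts stored elements; capacity counts allocated slots.
    (segment.len(), segment.capacity())
}
```

With a segment of length 5 and a write that ends at index 3, `additional_needed(5, 3)` is 0, so nothing has to be reserved.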

...we would sometimes be calling reserve with a negative argument.

I thought that value_offset is always higher than segment.len().

Maybe I have the wrong understanding of segments, but if the length of a segment is 5, doesn't that mean it has 5 allocated elements? If that is the case, having an offset lower than the length means it would write to already used memory, which cannot be done, right?

I thought that value_offset is always higher than segment.len().

I think that in load_data that is usually the case, but I'm not sure it always holds. Consider the following segment:

[NONE, NONE, NONE, 10, 20]

We may want to call insert_all to insert 3 elements at the start of the segment. In that case, there is no need to reserve more space. Note that having NONE in a segment is completely valid; those cells are commonly known as "memory gaps".
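
A simplified sketch of this case, pieced together from the diff fragments quoted in this review, might look like the following. It assumes a segment modeled as `Vec<Option<u64>>` (with `None` standing in for `MemoryCell::NONE`) and deliberately leaves out any validation the real `insert_all` performs, so it should be read as an illustration rather than the PR's code.

```rust
/// Simplified stand-in for a memory segment: `None` marks a memory gap.
type Segment = Vec<Option<u64>>;

/// Sketch of an `insert_all`-style write of `vals` starting at `offset`.
fn insert_all(segment: &mut Segment, offset: usize, vals: &[u64]) {
    let end = offset + vals.len();
    // Only grow the allocation when the write extends past the current length.
    if segment.len() < end {
        segment.reserve(end - segment.len());
    }
    // Pad with gaps when the write starts beyond the current end.
    if segment.len() < offset {
        segment.resize(offset, None);
    }
    // Replace the cells that already exist and append whatever is left over.
    let last_to_replace = segment.len().min(end);
    segment.splice(offset..last_to_replace, vals.iter().copied().map(Some));
}
```

Writing `[1, 2, 3]` at offset 0 into `[None, None, None, Some(10), Some(20)]` fills the gaps without reserving anything, while writing two values at offset 3 into a one-element segment pads the gap first and then appends.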

Perfect, I forgot you could have those. Thanks!

Currently, insert_all is generic and supports inserting multiple elements in the middle of a segment.

If we make sure that load_data can only insert elements at the end of a segment, we could have another method (e.g. extend_at), used only when inserting elements at the end of a segment. This could improve performance.

DiegoCivi · 2025-09-25T13:44:38Z

vm/src/vm/vm_memory/memory.rs

+            segment.resize(value_offset, MemoryCell::NONE);
+        }
+        // Insert new elements.
+        let last_element_to_replace = segment.len().min(value_offset + vals.len());


Isn't the last index always value_offset + vals.len()? I don't get in which case the segment's len would be higher than that.

The behavior of splice is a bit tricky.

It receives two arguments:

The range to replace.

The elements to replace it with.

The length of the range and the length of the replacement do not need to match. For example, consider the following array:

[0, 1, 2, 3, 4, 5]

If we want to insert [6,7,8] at index 4, we would be inserting 3 elements, but replacing only 2. The splice call would look like this:

splice(4..6, [6,7,8])

The result would look like this:

[0, 1, 2, 3, 6, 7, 8]

The following, instead, panics with an out-of-bounds error, because the range covers an element that does not exist.

splice(4..7, [6,7,8])
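
The example above can be reproduced with `Vec::splice` from the standard library. `splice_exact` and `splice_clamped` are hypothetical helper names for this sketch; the second one shows why the end of the range is clamped to the vector's length, assuming `start` is not past the end.

```rust
/// Replaces the 2 existing elements at 4..6 with 3 new ones.
fn splice_exact() -> Vec<u32> {
    let mut v = vec![0, 1, 2, 3, 4, 5];
    v.splice(4..6, [6, 7, 8]);
    v
}

/// Writes `vals` at `start`, clamping the replaced range to the vector's
/// length so the range never covers elements that do not exist.
/// Assumes `start <= v.len()`.
fn splice_clamped(v: &mut Vec<u32>, start: usize, vals: &[u32]) {
    let end = v.len().min(start + vals.len());
    v.splice(start..end, vals.iter().copied());
}
```

Both helpers produce `[0, 1, 2, 3, 6, 7, 8]` for the array above, whereas passing `4..7` directly would panic because the range end exceeds the vector's length.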

Ohh I see. Awesome, thanks!

DiegoCivi

Looks good!

JulianGCalderon added 4 commits September 23, 2025 10:26

Using with_capacity to avoid reallocs when inserting to hashmap

a8daa0a

With this commit, there is a 10.5% performance improvement while executing block 1000, compared to 2.5.0

Use insert_all instead of insert

b4c8768

With this commit, there is a 14.4% improvement while executing mainnet block 10000, compared to 2.5.0

Update changelog

fa58734

JulianGCalderon marked this pull request as ready for review September 23, 2025 22:23

DiegoCivi reviewed Sep 25, 2025

Revert "Avoid cloning constants when compiling hints"

dbea70a

This reverts commit 27d314d.

DiegoCivi approved these changes Sep 25, 2025

FrancoGiachetta approved these changes Sep 26, 2025

JulianGCalderon mentioned this pull request Oct 2, 2025

Investigate caching Cairo Runner between runs #2213

Open

FrancoGiachetta requested changes Oct 3, 2025

Validate memory cells

4612ef6

FrancoGiachetta approved these changes Oct 3, 2025
