
Mask generator optimizations #15935

Open: moh-eulith wants to merge 4 commits into develop from mask_generation

Conversation

moh-eulith

Implements #15870.


Thank you for your contribution to the Solidity compiler! A team member will follow up shortly.

If you haven't read our contributing guidelines and our review checklist before, please do so now; it makes reviewing and accepting your contribution smoother.

If you have any questions or need our help, feel free to post them in the PR or talk to us directly on the #solidity-dev channel on Matrix.

@moh-eulith
Author

Looking at the build failures...
I've been running ./test/soltest -p, and that passes.
I'll start with the required ones.

@ekpyron (Member) left a comment

Some of the CI failures are due to missing updates on the gas statistics in some tests.
There is an easy way to do that:
You can run
<build>/test/tools/isoltest -t semanticTests/* --optimize --accept-updates
which should adjust the test expectations (while of course verifying that only gas costs change; you can also skip --accept-updates to manually inspect the changes and adjust case by case).

We're also wondering how best to add a unit test for this and may get back to you with a suggestion for that.

Changelog.md Outdated
@@ -4,7 +4,7 @@ Language Features:


Compiler Features:

* Optimized constant generation for masks. With low compiler runs (optimized for size), reduces the byte code length and gas.
Member

Suggested change
* Optimized constant generation for masks. With low compiler runs (optimized for size), reduces the byte code length and gas.
* Constant Optimizer: Compute masks using shifts when optimizing for size.

Feel free to adjust the bit after the colon, but to keep things consistent, we usually categorize by component (see e.g. 0.8.27 for reference)

@ekpyron
Member

ekpyron commented Mar 12, 2025

Similarly test/cmdlineTests.sh --no-smt --update will update the command line tests (we'll then need to sanity-check the changes).
The t_native_text_ext_* failures seem to be a transient network issue which will hopefully vanish on the next run. (Since you're not modifying the SMT solver, and the SMT tests require Z3 to be installed and are slow to run, running with --no-smt should be enough.)

@ekpyron
Member

ekpyron commented Mar 12, 2025

One option for testing would be to borrow from

BOOST_AUTO_TEST_CASE(jumpdest_removal_subassemblies, *boost::unit_test::precondition(nonEOF()))
while specifically running ConstantOptimisationMethod::optimiseConstants on an Assembly constructed there. A few simple cases for masks in the middle bits, the high bits, and the low bits, and maybe a non-mask, should do it. If there are problems with that, let us know!
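For concreteness, a rough sketch of what such a test could look like (editorial illustration, not code from the PR; the Assembly constructor arguments, namespaces, and the exact optimiseConstants signature are assumed from libevmasm and may differ):

```cpp
// Hypothetical sketch: run the constant optimiser directly on a hand-built
// assembly that pushes a high-bit mask, with a low runs value so that the
// size-oriented representation should win.
BOOST_AUTO_TEST_CASE(constant_optimiser_high_bit_mask, *boost::unit_test::precondition(nonEOF()))
{
	Assembly assembly{langutil::EVMVersion{}, /* _creation */ false, /* _eofVersion */ std::nullopt, /* _name */ ""};
	// Mask with the upper 128 bits set.
	assembly.append(u256("0xffffffffffffffffffffffffffffffff00000000000000000000000000000000"));
	assembly.append(Instruction::POP);

	unsigned replacements = ConstantOptimisationMethod::optimiseConstants(
		/* _isCreation */ false,
		/* _runs */ 1,
		langutil::EVMVersion{},
		assembly
	);
	// At least the mask constant should have been replaced by computing code.
	BOOST_CHECK(replacements > 0);
}
```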

@cameel
Member

cameel commented Mar 12, 2025

We're also wondering how best to add a unit test for this and may get back to you with a suggestion for that.

We talked about it a bit and we could generally use a new test case type to test EVM assembly import and assembly-level optimizations. I'm going to work on this and ping you here when it's ready for you to use.

It will work like --import-asm-json, just in a text format that's not as verbose as JSON.


@cameel
Member

cameel commented Mar 12, 2025

Is it fair to assume auxdata is not part of the gas?

auxdata is basically metadata and is part of the binary. It does affect gas at creation time because it has to be copied to memory along with the rest of the bytecode. We account for the code cost separately, but that's just the cost of storing it in calldata and does not include any operations that may be performed on it by the contract.

@ekpyron
Member

ekpyron commented Mar 12, 2025

The optimized tests run by default with runs set to 200, since that's the default compiler setting (it's arguable whether that's a good default, but it has been so historically). With that, on the case in question, I currently see the masks as constants on develop. Without having looked at it more closely, my guess would be that the increase in runtime gas cost under that runs value pays off with a decrease in code size (which the test in this case doesn't show).
The gas costs in the tests are meant to give a good estimate of the effect of code changes, but since we don't generate them for the boundary runs settings, it can be fine for some of them to increase if the reason is well understood.


@moh-eulith
Author

My pedestrian solution for EOF and constant optimizer. Happy to include in this PR or another, or just ignore.

moh-eulith@5805f6d

@moh-eulith
Author

I decided messing with the default behavior was not appropriate. I changed the thresholds for the default runs of 200 so that the size optimization doesn't kick in for simple constants. The change is now strictly better (at 200) and will favor size reduction below 200.

Comment on lines 260 to 262
//only fully optimize for size if the compiler parameters are not default of 200
unsigned threshold1 = m_params.runs < 200 ? 32 : 128;
unsigned threshold2 = m_params.runs < 200 ? 16 : 128;
Member

Instead of using arbitrary thresholds to choose between the old and the new method, why not try both and choose the better one based on gasNeeded()?

Note that this is how the constant optimizer generally operates. For example below we run the decomposition in a loop multiple times and choose the best result based on gas. And we select between using a literal, data or code based on gas as well.

This may solve some of the cost regressions you're seeing, because in cases where your method is not better, we'll simply fall back to the old one.

@moh-eulith (Author) Mar 15, 2025

I'm glad you asked that. The original code did exactly as you've suggested: it was using gasNeeded to pick between using a constant or code to compute the constant. The real problem is, gasNeeded is literally a heuristic. This is the formula:

m_params.runs * _runGas + m_params.multiplicity * _repeatedDataGas

both m_params.runs and m_params.multiplicity are arbitrary. What is 200? where did it come from? (I personally never run with that default -- I either go with 2, for size, or 2M for speed; but this PR is not about my preferences).

Given the above heuristic, the new code path was so much better that, with the default of 200, it was getting picked over the constant representation. The new code will always produce shorter code, no "regressions" on that, unless you want minimum gas at runtime. The effect is simply to change the switch over point from constant to code based on the runs value. If this was my compiler, I'd go with it, but this compiler is used by a lot of people who may depend on that default behavior based on the arbitrary 200 value and an arbitrary heuristic, so I didn't want to disturb the switch-over point.

Just to emphasize, both values of runGas and dataGas are lower with the new algorithm for the code path. The competition is not between the old and new code representations, but between code representation and a constant.
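For readers following the thread, the heuristic under discussion boils down to roughly the following (a paraphrased sketch, not the actual libevmasm code; the struct and function names here are invented and the unique-data term is omitted):

```cpp
// Paraphrased sketch of the cost model quoted above (illustrative only).
// The representation (literal vs. computed vs. copied) with the lowest
// estimated lifetime gas is the one that gets chosen.
struct Params
{
	size_t runs;         // assumed number of executions over the contract's lifetime (default 200)
	size_t multiplicity; // how many times the constant occurs in the bytecode
};

bigint estimatedLifetimeGas(Params const& _params, bigint const& _runGas, bigint const& _repeatedDataGas)
{
	return _params.runs * _runGas + _params.multiplicity * _repeatedDataGas;
}
```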

@cameel (Member) Mar 15, 2025

Well, arbitrariness aside, my main concern is that it introduces another knob to tweak which potentially makes things more complex than making it depend just on the current formula.

both m_params.runs and m_params.multiplicity are arbitrary. What is 200? where did it come from?

It's actually not as arbitrary as it may seem. runs has a pretty straightforward interpretation. It's the number of times you expect the contract to be executed over its lifetime, with a simplifying assumption that each time you hit every opcode exactly once. The optimizer is supposed to give you the bytecode that will minimize gas usage over the contract's whole lifetime, accounting for initial deployment and all executions.

200 is the value that could be considered the middle ground between the cost of deployment and execution. The cost of deploying each byte of the runtime code is 200 gas and is included as a factor in repeatedDataGas. If you set runs to 200 as well, you can take it out:

200 * (runGas + multiplicity * bytesRequired)

runGas and bytesRequired are in the same ballpark, so with this value they will have a similar impact on the result.
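To make that weighting concrete, here is an illustrative back-of-the-envelope comparison (editorial example, not compiler output; multiplicity is taken as 1 and deployment is costed at 200 gas per byte as above):

```cpp
// Illustrative arithmetic only: lifetime cost ~= runs * runGas + 200 * codeBytes.
constexpr unsigned runs = 200;

// Literal representation of a 256-bit mask: PUSH32 <mask>
// -> 33 bytes of code, 3 gas per execution.
constexpr unsigned literalCost  = runs * 3  + 200 * 33;  // 600 + 6600 = 7200

// Computed representation of a high-128-bit mask: PUSH1 0, NOT, PUSH1 128, SHL
// -> 6 bytes of code, 12 gas per execution.
constexpr unsigned computedCost = runs * 12 + 200 * 6;   // 2400 + 1200 = 3600
```

Under these simplified numbers the shift-based computed form already beats the literal at the default of 200 runs, which is why the switch-over point moves.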

As for multiplicity, it's the number of times the constant appears in the bytecode. I'm actually not sure it makes sense to have it in there. There's a comment saying that it's not applied to runGas because "the runs are per opcode", but cost is being calculated for all the AssemblyItems pushing the same value at once, not for a single one. Maybe we should remove it.

I personally never run with that default -- I either go with 2, for size, or 2M for speed; but this PR is not about my preferences.

Whether it's a good default is of course a different question. Perhaps we should make it higher. I often hear that deployment cost does not matter to people at all. But that could also be due to the fact that it comes from bigger projects where that cost is a drop in the bucket and what they worry about is minimizing the cost for the users. We would be open to changing it if someone showed some convincing arguments either way.

The effect is simply to change the switch over point from constant to code based on the runs value. If this was my compiler, I'd go with it, but this compiler is used by a lot of people who may depend on that default behavior based on the arbitrary 200 value and an arbitrary heuristic, so I didn't want to disturb the switch-over point.

I don't think this is a concern. With 200 runs you get constants because they provide lower lifetime cost than the code equivalent. If the new code is better in that regard, it will naturally move the switch-over point, but this will benefit users who want a balanced ratio. Those who don't are supposed to choose a higher or lower runs value anyway.

The new code will always produce shorter code, no "regressions" on that, unless you want minimum gas at runtime.

Like @ekpyron said above, the fact that in some cases costs increased with 200 runs does not necessarily disqualify the initial version as long as it can be explained. A better benchmark would probably be whether the change universally improves deployment cost with low runs and execution cost with high runs. At 200 it's understandable to see some improve and some degrade.

Though there's of course also the question of how well the current formula reflects the actual costs. I do have some concerns here:

  • The assumption that every opcode is hit once per run basically means that your contract is one big function with no branching, early returns or loops. That's not true for most real contracts. Any of these elements will skew the evaluation. For example it will overestimate the execution cost in contracts with many small functions and underestimate it in contracts with many loops.
  • Like I said above, I have doubts whether multiplicity should even be in the formula. Repeating the same value many times will artificially lower the estimated execution cost.
  • Data cost is taken as 200 per byte of deployed code, but should really be 216 (268 for older EVMs), i.e. 200 gas of code-deposit cost plus 16 gas per non-zero calldata byte (68 before EIP-2028), because that code is also a part of the calldata of the initial transaction. It's even a bit more because each byte needs to be copied to memory before it is returned from the constructor.

Author

It's the number of times you expect the contract to be executed over its lifetime

Have you ever met an engineer who could predict that? Given how contracts and financial incentives are tightly bound, wouldn't it be peculiar if someone knew or wished that a contract would have such a temporary life as only 200 executions?

I often hear that deployment cost does not matter to people at all.

Indeed that is true, but the overall size limit is a huge pain. That is the motivation for me doing this (if you're curious, I have an issue tracking that). The patterns that avoid that (diamond, proxy, etc), come with a 20K gas per call overhead.

because they provide lower lifetime cost than the code equivalent

I think that's where we disagree (not that it really matters, see below*). From my perspective the lifetime computation is so unrealistic that it shouldn't be considered anything other than a heuristic. As you point out, running through every opcode once is not a measure of anything resembling an actual contract call. I love me some loops 😄 (which is probably why I'm hesitant to increase runtime gas costs at the default settings).

whether the change universally improves deployment cost with low runs and execution cost with high runs

With low runs, the new code is always better, because both the new and old code choose code-to-compute-constant, and the new computation takes fewer bytes and consumes less gas. With high runs, the constants are just constants, so they're exactly equivalent.

  • Since I don't run the compiler with 200, I don't really care how it behaves with default settings. The decision I leave up to you. Happy to provide more data.

Member

The runs parameter is a rather complex topic, even putting aside the fact that its naming alone has caused quite a bit of confusion (there has been a plan to rename it to `expectedExecutionsPerDeployment`). While it is technically (rather) precisely defined, in practice it indeed serves more as a heuristic than a precise metric. It would be nice if we could just replace it with "optimize for size" and "optimize for runtime gas" instead; unfortunately, things are not as simple as that due to, as you say, the overall size limit. As far as I'm aware, the usual use of the parameter is indeed to set it either very low (optimizing for size) or very high (optimizing for runtime gas). However, it is not uncommon to have to reduce it down from the maximum value to account for the overall code size limit (which then is more of a binary search for fitting the bytecode into the code size limit than actual reasoning about the number of executions). The default value is historic and rarely makes sense itself (and as such is also something we should definitely consider changing, but also nothing we need to particularly tweak here). Ideally, we'd have more targeted and direct settings for accommodating code size limits, and ideally we'd even take the control flow structure into account (e.g. emphasizing code size reductions in reverting paths or de-emphasizing them for loops). However, that's of course not a simple matter and obviously goes beyond the scope of this PR.
For this PR, I'd myself tend towards just letting it use the existing heuristics as they are (so without the additional thresholds) and would consider generally refining the heuristic and compiler settings in this regard a longer-term goal that needs further consideration.

Author

I'd myself tend towards just letting it use the existing heuristics

ok, great. Sounds like we have consensus from the core team on that. I've removed the runs parameter dependency.

@cameel (Member) Mar 15, 2025

@ekpyron Not strictly related to the PR, but while looking at it I noticed that ConstantOptimisationMethod::simpleRunGas() does not account for the new AssemblyItem types we introduced for EOF.

Especially DUPN and SWAPN. Though fortunately it should not have any real consequences, because we are not using them in the constant optimizer yet (heads up @rodiazet).

@cameel
Member

cameel commented Mar 15, 2025

Not sure what the right way to handle the test cases are. Can I supply a different outcome? Can I exclude the test for EOF only?

In many test types you can use the bytecodeFormat setting to limit the test to non-EOF:

// ====
// bytecodeFormat: legacy

Normally you'd create a separate version with different expectations for EOF, but since the constant optimizer is disabled there, it's fine to just leave a TODO comment as a reminder to do that later.

My pedestrian solution for EOF and constant optimizer. Happy to include in this PR or another, or just ignore.

moh-eulith@5805f6d

If that works then perhaps the constant optimizer does not even need much changes for EOF, but I'd still rather not enable it just as an afterthought in this PR. Even if it compiles and passes current tests, it does not necessarily mean it's not broken in some subtle way. At this point we do have working semantic tests so there's a high chance it's not, but our coverage is not always perfect, especially for things that depend on the runs parameter (most tests just run with 200). #15935 (comment) is a good example. This particular problem won't affect it yet because we're not using any EOF-only instructions there, but there could be similar ones lurking in there so it needs to be given proper attention.

@cameel
Member

cameel commented Mar 15, 2025

Regarding costs, the ones from test cases may not always be representative, because they are rather artificial. We should also check how it affects real projects (i.e. our external tests). You can do that by getting the summarized-benchmarks.json artifact from the c_ext_benchmarks job from your PR and from the base branch and comparing them using benchmark_diff.py. Check --help. It can produce a markdown table that you can just paste into a comment.

@moh-eulith
Author

I'd still rather not enable it just as an afterthought in this PR.

Agreed. The constant optimizer is relatively straightforward to reason about, as it's completely self-contained. For the record, here are the instructions used in the three different paths:

constant:
PUSHX

compute:
PUSHX, EXP, NOT, MUL, ADD, SUB, SHL, SHR

copy:
CODECOPY, MLOAD, SWAP1, MSTORE, DUP1, DUP2, DUP4, SWAP2

My code essentially disables the third branch. Just leaving this here as a note for the next person who needs to deal with it.
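As an editorial illustration of the "compute" path (hypothetical snippet written in libevmasm style, not taken from the PR), the low-128-bit mask (1 << 128) - 1 could be emitted as:

```cpp
// Hypothetical illustration: a "compute" representation of the mask (1 << 128) - 1.
// Executed left to right this leaves 0x00..00ffff..ff (low 128 bits set) on the stack.
AssemblyItems computedLowMask{
	u256(1),            // PUSH1 0x01  (subtrahend for the final SUB)
	u256(1),            // PUSH1 0x01  (value to shift)
	u256(128),          // PUSH1 0x80  (shift amount)
	Instruction::SHL,   // 1 << 128
	Instruction::SUB    // (1 << 128) - 1
};
// Roughly 8 bytes of code and 15 gas at runtime, versus 33 bytes and 3 gas for a PUSH32 literal.
```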

@moh-eulith
Author

moh-eulith commented Mar 15, 2025

summarized-benchmarks.json artifact from the c_ext_benchmarks job from your PR and from the base branch

Thanks for the pointer.
This PR as originally written (no 200 heuristic) vs. develop:

ir-optimize-evm+yul

| project | bytecode_size | deployment_gas | method_gas |
|---|---|---|---|
| brink | -2.09% ✅ | | |
| colony | -2.32% ✅ | | |
| elementfi | -1.47% ✅ | | |
| ens | -2.11% ✅ | -2.5% ✅ | -0.06% ✅ |
| euler | -2.9% ✅ | -2.93% ✅ | -0.34% ✅ |
| gnosis | | | |
| gp2 | -1.89% ✅ | | |
| pool-together | -2.29% ✅ | | |
| uniswap | -2.77% ✅ | | |
| yield_liquidator | -3.42% ✅ | -3.53% ✅ | -0.05% ✅ |
| zeppelin | -2.54% ✅ | | |

This PR as is now (with 200 heuristic) vs develop:

ir-optimize-evm+yul

| project | bytecode_size | deployment_gas | method_gas |
|---|---|---|---|
| brink | -2.09% ✅ | | |
| colony | -1.91% ✅ | | |
| elementfi | -1% ✅ | | |
| ens | -1.21% ✅ | -1.75% ✅ | -0.05% ✅ |
| euler | -1.84% ✅ | -1.82% ✅ | -0.25% ✅ |
| gnosis | | | |
| gp2 | -1.25% ✅ | | |
| pool-together | -1.62% ✅ | | |
| uniswap | -1.17% ✅ | | |
| yield_liquidator | -2.13% ✅ | -2.07% ✅ | -0.07% ✅ |
| zeppelin | -1.56% ✅ | | |

I'm not really sure what method_gas is being measured, so it's hard to opine on what that means.
(Edit: now with fancy tables)

@cameel
Member

cameel commented Mar 15, 2025

I'm not really sure what method_gas is being measured, so it's hard to opine on what that means.

This is the total execution cost measured by running the project's test suite. It's hard to say how representative it is, but at least it comes from actually executing most of the project's code.

We should actually be getting method_gas for more projects. I've seen some failures in parsing the coverage report in CI (which is where the values come from). We should look into that. It does not make CI fail, because we intentionally ignore errors in getting benchmarks.

Also, just FYI, what you pasted are the tables in a format that's better suited for the terminal (which is why it's the default), but there's a flag that will give you fancier markdown as well :)

@moh-eulith
Author

We should actually be getting method_gas for more projects.

I double-checked and it's null in the JSON for the develop branch. Do you know which ones? I managed to run zeppelin and it looks like it never reports any gas metrics. Some are hard to run locally (they need an older version of foundry, etc.).

@moh-eulith
Author

The build is green (I tried to fix the spurious failures).

Still to do: test cases using the new test framework (no rush, just keeping track of what's left).

Anything else?

@moh-eulith force-pushed the mask_generation branch 2 times, most recently from 914ed2c to e128f30 on March 19, 2025 at 20:34
Mohammad Rezaei added 4 commits on March 25, 2025:
- auto-fix some of the tests -- these have lower gas in the test
- only optimize fully if runs param is less than 200
@moh-eulith
Author

I added a semantic test for masks. Most masks are generated by the compiler internally, so it made sense to me to test those. (Of the categories under /test, semantic felt like the best fit; let me know if it fits better elsewhere.)
