test: Handle full state tests in `evmone-bench` #1043

rodiazet · 2024-10-07T09:36:07Z

This PR implements proper support for EVM benchmarking in the form of state test files.

- Remove the old-style benchmarking support that used raw bytecode in a file.
- Load a state test JSON file and run benchmarks on the tests defined in it.
- Run the state tests before benchmarking to make sure they pass.

codecov · 2024-10-09T09:37:30Z

Codecov Report

Attention: Patch coverage is 0% with 84 lines in your changes missing coverage. Please review.

Project coverage is 94.69%. Comparing base (a9d5bfe) to head (a815b5f).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| test/bench/bench.cpp | 0.00% | 73 Missing ⚠️ |
| test/bench/helpers.hpp | 0.00% | 11 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1043      +/-   ##
==========================================
+ Coverage   94.54%   94.69%   +0.14%     
==========================================
  Files         175      175              
  Lines       19702    19672      -30     
==========================================
  Hits        18628    18628              
+ Misses       1074     1044      -30

| Flag | Coverage Δ | |
|---|---|---|
| eest_gmp | `15.29% <0.00%> (+0.02%)` | ⬆️ |
| eof_execution_spec_tests | `19.88% <0.00%> (+0.03%)` | ⬆️ |
| ethereum_tests | `21.49% <0.00%> (+0.03%)` | ⬆️ |
| ethereum_tests_silkpre | `18.30% <0.00%> (+0.02%)` | ⬆️ |
| execution_spec_tests | `18.57% <0.00%> (+0.02%)` | ⬆️ |
| unittests | `91.99% <0.00%> (+0.14%)` | ⬆️ |

| Files with missing lines | Coverage Δ |
|---|---|
| test/bench/helpers.hpp | `0.00% <0.00%> (ø)` |
| test/bench/bench.cpp | `0.00% <0.00%> (ø)` |

chfast · 2025-05-19T13:01:40Z

test/bench/bench.cpp

    if (const auto it = registered_vms.find("advanced"); it != registered_vms.end())
        advanced_vm = &it->second;
    if (const auto it = registered_vms.find("baseline"); it != registered_vms.end())
        baseline_vm = &it->second;
-    if (const auto it = registered_vms.find("bnocgoto"); it != registered_vms.end())


Why was this VM removed?

Because all VMs are now benchmarked in the loop below. The first two VMs (`advanced` and `baseline`) are still looked up here because they are used to perform analysis. `bnocgoto` never had an analysis run, so after all VMs were moved into the loop, its lookup became unused.
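
To illustrate the reply above, here is a minimal sketch of registering one benchmark per VM by iterating over a name-keyed map, which makes per-VM lookups such as the removed `bnocgoto` one unnecessary. `FakeVM`, `benchmark_names`, and the `"/execute/"` name segment are assumptions made for illustration; the actual loop in `bench.cpp` is not shown in this thread.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for evmc::VM; the real map is
// std::map<std::string_view, evmc::VM> registered_vms.
struct FakeVM
{
};

// Builds one benchmark name per registered VM for a given test name.
// No VM needs to be looked up individually for this step.
inline std::vector<std::string> benchmark_names(
    const std::map<std::string_view, FakeVM>& vms, const std::string& test_name)
{
    assert(!test_name.empty());

    std::vector<std::string> names;
    for (const auto& [vm_name, vm] : vms)
    {
        (void)vm;  // A real registration would pass the VM to the benchmark callback.
        names.push_back(std::string{vm_name} + "/execute/" + test_name);
    }
    return names;
}
```

Because `std::map` iterates in key order, the generated names come out sorted by VM name regardless of registration order.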

chfast · 2025-05-19T13:02:35Z

test/bench/helpers.hpp

 #include <evmone/vm.hpp>

 namespace evmone::test
 {
 extern std::map<std::string_view, evmc::VM> registered_vms;

-constexpr auto default_revision = EVMC_ISTANBUL;
+constexpr auto default_revision = EVMC_PRAGUE;


Maybe don't change this value if not needed. Later we will need to update synthetic tests or remove them in favor of EEST.

Legacy change. Reverting.

chfast · 2025-05-19T13:03:45Z

test/bench/helpers.hpp

+    auto iteration_gas_used = int64_t{0};
+    for (auto _ : state)
+    {
+        const auto tx_props_or_error = state::validate_transaction(pre_state, block_info, tx, rev,


I think we should remove transaction validation from the benchmark. You should also add a TODO to later register "validation" subcase.

It only validates the test, but the time needed to execute it is not added to the total benchmark time.

OK, sorry, I misunderstood. I was referring to `run_state_test`. The validation can be removed.

- Load benchmarks as proper state test.
- Support single file path.
- Remove support for benchmarking raw bytecode.

Copilot

Pull Request Overview

This PR refactors the benchmarking infrastructure within evmone to support state tests by loading state test JSON files, removing old raw bytecode support, and ensuring that state tests pass before running benchmarks.

- Updated CMake builds to include new test runner sources and link necessary GTest components.
- Removed legacy benchmarking functions and added a new `bench_transition` helper to facilitate state transitions.
- Refactored benchmark registration and argument parsing to support JSON-based state tests.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| test/statetest/CMakeLists.txt | Removed raw bytecode support; added statetest_runner.cpp to the build. |
| test/bench/helpers.hpp | Removed legacy execution functions and added bench_transition. |
| test/bench/bench.cpp | Refactored benchmark registration, argument parsing, and state tests. |

Copilot · 2025-05-20T14:18:27Z

test/bench/bench.cpp

-                const auto name = "advanced/execute/" + case_name;
-                RegisterBenchmark(name, [&vm = *advanced_vm, &b, &input](State& state) {
-                    bench_advanced_execute(state, vm, b.code, input.input, input.expected_output);
+                RegisterBenchmark("advanced/analyse/" + b.name, [code, &rev](State& state) {


Consider capturing 'rev' by value instead of by reference in the lambda to ensure that its value is preserved correctly in the benchmark callback.

Suggested change

RegisterBenchmark("advanced/analyse/" + b.name, [code, &rev](State& state) {

RegisterBenchmark("advanced/analyse/" + b.name, [code, rev](State& state) {
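
The difference between the two capture modes can be shown with a small standalone sketch. It does not use the benchmark library: `Revision`, `Callback`, and `capture_demo` are hypothetical names, and `std::function` stands in for a callback that a framework stores at registration time and invokes later.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for the revision type used by the benchmarks.
enum class Revision
{
    Istanbul,
    Prague
};

// A stored callback, standing in for a benchmark that runs after registration.
using Callback = std::function<Revision()>;

// Registers two callbacks over the same variable, then changes the variable
// before either callback runs.
inline std::vector<Revision> capture_demo()
{
    std::vector<Callback> registered;
    auto rev = Revision::Istanbul;

    registered.emplace_back([&rev] { return rev; });  // by reference: sees later changes
    registered.emplace_back([rev] { return rev; });   // by value: keeps the registration-time value
    assert(registered.size() == 2);

    rev = Revision::Prague;  // e.g. the registration loop moves on to another test

    std::vector<Revision> observed;
    for (const auto& callback : registered)
        observed.push_back(callback());  // safe here only because rev is still in scope
    return observed;
}
```

In this sketch the by-reference callback reports `Prague` and the by-value callback reports `Istanbul`. If the referenced variable had gone out of scope before the callbacks ran, the by-reference capture would dangle, which is the risk the suggestion guards against.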

Copilot · 2025-05-20T14:18:27Z

test/bench/helpers.hpp

-constexpr auto bench_baseline_execute =
-    bench_execute<ExecutionState, baseline::CodeAnalysis, baseline_execute, baseline_analyse>;
+    using benchmark::Counter;
+    state.counters["gas_used"] = Counter(static_cast<double>(iteration_gas_used));


[nitpick] Review whether the 'gas_used' counter should accumulate the total gas from all iterations rather than only reflecting the gas from the final iteration. Clarify the intention to prevent any misinterpretation of benchmark results.

Suggested change

state.counters["gas_used"] = Counter(static_cast<double>(iteration_gas_used));

state.counters["gas_used"] = Counter(static_cast<double>(total_gas_used));
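
To make the two interpretations of the counter concrete, here is a small sketch that tracks both the last iteration's gas and the running total over a simulated benchmark loop. `GasCounters`, `run_iterations`, and the per-iteration gas values are assumptions for illustration; how `iteration_gas_used` is actually updated inside the PR's loop is not visible in this excerpt.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// The two candidate values for a "gas_used" counter.
struct GasCounters
{
    int64_t last_iteration = 0;  // gas used by the most recent iteration only
    int64_t total = 0;           // gas accumulated over all iterations
};

// Simulates a benchmark loop in which each iteration reports its gas usage.
inline GasCounters run_iterations(const std::vector<int64_t>& gas_per_iteration)
{
    GasCounters counters;
    for (const auto gas : gas_per_iteration)
    {
        assert(gas >= 0);
        counters.last_iteration = gas;  // overwritten on every iteration
        counters.total += gas;          // grows with the iteration count
    }
    return counters;
}
```

For a deterministic test every iteration uses the same gas, so the last-iteration value is independent of how many iterations the framework chooses to run, while the total scales with that count. If a total is recorded, Google Benchmark's `benchmark::Counter::kAvgIterations` flag is one way to report it per iteration.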

rodiazet force-pushed the bench-fix branch 4 times, most recently from 0f7876d to ca95f71 Compare October 9, 2024 09:33

rodiazet changed the title ~~Bench fix~~ test: Fix state test format benchmarking Oct 9, 2024

chfast added the tests Testing infrastructure label May 14, 2025

chfast mentioned this pull request May 14, 2025

Improve evmone-bench execution model #1039

Open

rodiazet force-pushed the bench-fix branch 5 times, most recently from bd6083a to 9588b72 Compare May 19, 2025 08:59

rodiazet marked this pull request as ready for review May 19, 2025 09:05

rodiazet requested a review from chfast May 19, 2025 09:05

chfast force-pushed the bench-fix branch from 9588b72 to bf6cccc Compare May 19, 2025 12:49

chfast changed the title ~~test: Fix state test format benchmarking~~ test: Handle full state tests in evmone-bench May 19, 2025

chfast requested changes May 19, 2025

rodiazet force-pushed the bench-fix branch 5 times, most recently from 6132839 to 63b9809 Compare May 20, 2025 14:00

rodiazet added 2 commits May 20, 2025 16:12

test: Move statetest_runner.cpp to statetestutils

74a4607

test: Handle full state tests in evmone-becnh

a815b5f

- Load benchmarks as proper state test.
- Support single file path.
- Remove support for benchmarking raw bytecode.

rodiazet force-pushed the bench-fix branch from 63b9809 to a815b5f Compare May 20, 2025 14:12

chfast requested a review from Copilot May 20, 2025 14:17

Copilot AI reviewed May 20, 2025
