-
Notifications
You must be signed in to change notification settings - Fork 327
test: Handle full state tests in evmone-bench
#1043
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
0f7876d
to
ca95f71
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1043 +/- ##
==========================================
+ Coverage 94.54% 94.69% +0.14%
==========================================
Files 175 175
Lines 19702 19672 -30
==========================================
Hits 18628 18628
+ Misses 1074 1044 -30
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
bd6083a
to
9588b72
Compare
evmone-bench
if (const auto it = registered_vms.find("advanced"); it != registered_vms.end()) | ||
advanced_vm = &it->second; | ||
if (const auto it = registered_vms.find("baseline"); it != registered_vms.end()) | ||
baseline_vm = &it->second; | ||
if (const auto it = registered_vms.find("bnocgoto"); it != registered_vms.end()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this VM removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because now all VMs are tested in the loop below. First two VMs are used to perform analysis. The bnocgoto
did not have analysis run before. After moving all VMs to the loop this became unused.
test/bench/helpers.hpp
Outdated
#include <evmone/vm.hpp> | ||
|
||
namespace evmone::test | ||
{ | ||
extern std::map<std::string_view, evmc::VM> registered_vms; | ||
|
||
constexpr auto default_revision = EVMC_ISTANBUL; | ||
constexpr auto default_revision = EVMC_PRAGUE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe don't change this value if not needed. Later we will need to update synthetic tests or remove them in favor of EEST.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Legcy change. Reverting
test/bench/helpers.hpp
Outdated
auto iteration_gas_used = int64_t{0}; | ||
for (auto _ : state) | ||
{ | ||
const auto tx_props_or_error = state::validate_transaction(pre_state, block_info, tx, rev, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should remove transaction validation from the benchmark. You should also add a TODO to later register "validation" subcase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It only validates the test but the time needed to execute this is not added to total benchmark time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Sorry misunderstood. I was referring the run_state_test
. Validation can be removed.
6132839
to
63b9809
Compare
- Load benchmarks as proper state test. - Support single file path. - Remove support for benchmarking raw bytecode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the benchmarking infrastructure within evmone to support state tests by loading state test JSON files, removing old raw bytecode support, and ensuring that state tests pass before running benchmarks.
- Updated CMake builds to include new test runner sources and link necessary GTest components.
- Removed legacy benchmarking functions and added a new bench_transition helper to facilitate state transitions.
- Refactored benchmark registration and argument parsing to support JSON-based state tests.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
test/statetest/CMakeLists.txt | Removed raw bytecode support; added statetest_runner.cpp to the build. |
test/bench/helpers.hpp | Removed legacy execution functions and added bench_transition. |
test/bench/bench.cpp | Refactored benchmark registration, argument parsing, and state tests. |
const auto name = "advanced/execute/" + case_name; | ||
RegisterBenchmark(name, [&vm = *advanced_vm, &b, &input](State& state) { | ||
bench_advanced_execute(state, vm, b.code, input.input, input.expected_output); | ||
RegisterBenchmark("advanced/analyse/" + b.name, [code, &rev](State& state) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider capturing 'rev' by value instead of by reference in the lambda to ensure that its value is preserved correctly in the benchmark callback.
RegisterBenchmark("advanced/analyse/" + b.name, [code, &rev](State& state) { | |
RegisterBenchmark("advanced/analyse/" + b.name, [code, rev](State& state) { |
Copilot uses AI. Check for mistakes.
constexpr auto bench_baseline_execute = | ||
bench_execute<ExecutionState, baseline::CodeAnalysis, baseline_execute, baseline_analyse>; | ||
using benchmark::Counter; | ||
state.counters["gas_used"] = Counter(static_cast<double>(iteration_gas_used)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Review whether the 'gas_used' counter should accumulate the total gas from all iterations rather than only reflecting the gas from the final iteration. Clarify the intention to prevent any misinterpretation of benchmark results.
state.counters["gas_used"] = Counter(static_cast<double>(iteration_gas_used)); | |
state.counters["gas_used"] = Counter(static_cast<double>(total_gas_used)); |
Copilot uses AI. Check for mistakes.
This PR implements proper support for evm benchmarking in a form of state test file.
json
file and run benchmarks on tests defined in file.