-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[ntuple] Check type in RValue::GetRef<T>
and GetPtr<T>
(version 2)
#19544
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ntuple] Check type in RValue::GetRef<T>
and GetPtr<T>
(version 2)
#19544
Conversation
For future reference, here are the microbenchmarks cited in the commit message: #include <Math/Vector4D.h>
#include <ROOT/REntry.hxx>
#include <ROOT/RFieldBase.hxx>
#include <ROOT/RNTupleModel.hxx>
#include <benchmark/benchmark.h>
#include <cstdint>
#include <utility>
static void REntry_GetPtr_int32(benchmark::State& state) {
auto model = ROOT::RNTupleModel::CreateBare();
auto field = ROOT::RFieldBase::Create("i32", "std::int32_t").Unwrap();
model->AddField(std::move(field));
model->Freeze();
auto token = model->GetToken("i32");
auto entry = model->CreateEntry();
for (auto _ : state) {
benchmark::DoNotOptimize(entry->GetPtr<std::int32_t>(token));
}
}
BENCHMARK(REntry_GetPtr_int32);
static void REntry_GetPtr_LorentzVector(benchmark::State& state) {
auto model = ROOT::RNTupleModel::CreateBare();
auto field = ROOT::RFieldBase::Create("v", "ROOT::Math::PtEtaPhiMVector").Unwrap();
model->AddField(std::move(field));
model->Freeze();
auto token = model->GetToken("v");
auto entry = model->CreateEntry();
for (auto _ : state) {
benchmark::DoNotOptimize(entry->GetPtr<ROOT::Math::PtEtaPhiMVector>(token));
}
}
BENCHMARK(REntry_GetPtr_LorentzVector);
static void RValue_GetPtr_int32(benchmark::State& state) {
auto field = ROOT::RFieldBase::Create("i32", "std::int32_t").Unwrap();
auto value = field->CreateValue();
for (auto _ : state) {
benchmark::DoNotOptimize(value.GetPtr<std::int32_t>());
}
}
BENCHMARK(RValue_GetPtr_int32);
static void RValue_GetPtr_LorentzVector(benchmark::State& state) {
auto field = ROOT::RFieldBase::Create("v", "ROOT::Math::PtEtaPhiMVector").Unwrap();
auto value = field->CreateValue();
for (auto _ : state) {
benchmark::DoNotOptimize(value.GetPtr<ROOT::Math::PtEtaPhiMVector>());
}
}
BENCHMARK(RValue_GetPtr_LorentzVector);
BENCHMARK_MAIN(); |
Test Results 21 files 21 suites 3d 11h 23m 40s ⏱️ Results for commit 5464821. ♻️ This comment has been updated with latest results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
This avoids extra type checks.
This avoids extra type checks.
This is a bit tricky because RFieldBase.hxx cannot depend on RField specializations in RField.hxx. We considered factoring TypeName() into a separate RFieldTypeName trait, but that would have still required including a bunch of STL headers to specialize for all supported containers. In the end, we settled on taking the typeid and passing the std::type_info to Internal::GetRenormalizedTypeName. However, this comes with a run time cost, especially for fundamental types. In a microbenchmark, I measure 24ns per (successful) call to REntry::GetPtr<std::int32_t> while a naive implementation of always calling GetRenormalizedTypeName takes ~190ns per RValue::GetPtr. For more complex types, for example ROOT::Math::PtEtaPhiMVector with a number of templates, the difference is less pronounced if at all measurable with around 2000ns both. To avoid this cost on repeated calls, we cache the std::type_info of matching types and try if it's equal to the typeid, which is basically a pointer comparison. This brings the (amortized) cost down to 8ns for any type! In the future, we will want to reuse the caching approach of RValue::EnsureMatchingType also for REntry. Closes root-project#18316 Co-authored-by: Jakob Blomer <[email protected]>
This allows concurrent calls to const methods that modifify this mutable member. Unfortunately, std::atomic is neither copyable nor movable and requires custom copy/move constructors and operators for RValue.
3566340
to
5464821
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool thanks!
This is a bit tricky because
RFieldBase.hxx
cannot depend onRField
specializations inRField.hxx
. We considered factoringTypeName()
into a separateRFieldTypeName
trait, but that would have still required including a bunch of STL headers to specialize for all supported containers. In the end, we settled on taking thetypeid
and passing thestd::type_info
toInternal::GetRenormalizedTypeName
.However, this comes with a run time cost, especially for fundamental types. In a microbenchmark, I measure 24ns per (successful) call to
REntry::GetPtr<std::int32_t>
while a naive implementation of always callingGetRenormalizedTypeName
takes ~190ns perRValue::GetPtr
. For more complex types, for exampleROOT::Math::PtEtaPhiMVector
with a number of templates, the difference is less pronounced if at all measurable with around 2000ns both.To avoid this cost on repeated calls, we cache the
std::type_info
of matching types and try if it's equal to thetypeid
, which is basically a pointer comparison. This brings the (amortized) cost down to 8ns for any type! In the future, we will want to reuse the caching approach ofRValue::EnsureMatchingType
also forREntry
.Closes #18316