[ntuple] Add some internal utilities to the storage layer #19904
base: master
Test Results: 22 files, 22 suites, 3d 20h 34m 47s ⏱️
For more details on these failures, see this check.
Results for commit 8cc71a8. ♻️ This comment has been updated with latest results.
Do we really need GetUnderlyingDirectory()? Since the attribute anchor is a hidden key, we can just store it in the ROOT file's root directory.
Not sure... the reason I need it in the full PR is to create the AttrSetWriter, which then uses it to create a PageSinkFile. Even if we wanted to store the key in the root directory, we'd still need the underlying TFile for it...
3675b97 to 8cc71a8
Some comments inline. I'm not actually sure it helps to review internal methods and additions to the RPageStorage abstraction layer without seeing the immediate need. Maybe one has to look at the final PR all the time, but that is quite some mental overhead. It feels like we are now vertically building the implementation, instead of in logical chunks...
std::unique_ptr<RPageSink> CloneWithNewRNTuple(std::string_view newName) const final
{
   return fInnerSink->CloneWithNewRNTuple(newName);
}
I feel the semantics are not well defined: As it stands, it will return a page sink that is not buffered anymore.
I'm not entirely sure what should happen here: it's not necessarily true that if the parent sink is buffered then we also want to buffer the clone. Arguably, returning a non-buffered sink gives the most flexibility as the caller can decide whether to wrap it in a buffered sink or not.
I would argue it's not a clone in this case... Maybe it's a matter of finding a more suitable name?
I can't come up with anything better than CreateNewWithNewRNTuple... being a private internal method it might be ok even though it's ugly.
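To make the semantics under discussion concrete, here is a minimal sketch using hypothetical mock classes (MockSink, MockFileSink, MockBufferedSink and CloneWithNewName are invented for illustration and are not the ROOT classes or their API). It assumes, as in the diff above, that the buffered wrapper simply forwards the call to its inner sink, so the returned object is unbuffered and the caller decides whether to wrap it again.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-ins for illustration only; NOT the real RPageSink hierarchy.
struct MockSink {
   virtual ~MockSink() = default;
   virtual bool IsBuffered() const = 0;
   virtual std::string Name() const = 0;
   virtual std::unique_ptr<MockSink> CloneWithNewName(std::string_view newName) const = 0;
};

// Plays the role of a plain, unbuffered file sink.
struct MockFileSink final : MockSink {
   std::string fName;
   explicit MockFileSink(std::string_view name) : fName(name) {}
   bool IsBuffered() const final { return false; }
   std::string Name() const final { return fName; }
   std::unique_ptr<MockSink> CloneWithNewName(std::string_view newName) const final
   {
      return std::make_unique<MockFileSink>(newName);
   }
};

// Plays the role of a buffering wrapper around another sink.
struct MockBufferedSink final : MockSink {
   std::unique_ptr<MockSink> fInner;
   explicit MockBufferedSink(std::unique_ptr<MockSink> inner) : fInner(std::move(inner)) {}
   bool IsBuffered() const final { return true; }
   std::string Name() const final { return fInner->Name(); }
   // Pure delegation, as in the diff: the result is whatever the inner sink
   // produces, so it is NOT wrapped in a buffer.
   std::unique_ptr<MockSink> CloneWithNewName(std::string_view newName) const final
   {
      return fInner->CloneWithNewName(newName);
   }
};

// Usage: the caller opts back into buffering explicitly if it wants it.
inline std::unique_ptr<MockSink> CloneAndRebuffer(const MockSink &parent, std::string_view newName)
{
   return std::make_unique<MockBufferedSink>(parent.CloneWithNewName(newName));
}
```

In this sketch the "clone" of a buffered sink reports IsBuffered() == false, which is the asymmetry that prompted the naming question; CloneAndRebuffer shows the flexibility argument from the reply.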
{
   auto pageSource = std::make_unique<RPageSourceFile>("", fFile->Clone(), options);
   pageSource->fAnchor = anchor;
   pageSource->fNTupleName = pageSource->fDescriptorBuilder.GetDescriptor().GetName();
How is this initialized? Above you are passing "" for ntupleName, is the header already loaded somewhere?
By "initialized" you mean attached? At the moment it needs to be done manually, like when you create a new PageSource. This is done in 2 places in the full PR: in ReadAttributeSet, by the created Reader, and in RNTupleMerger::OpenAttributeSource.
Somewhat: Unless I'm missing something, I think the effect of this line is always that pageSource->fNTupleName = ""; because the descriptor builder doesn't have more information (yet).
I see. It's indeed possible that that line is useless (I still need to verify), but isn't this the case also for CreateFromAnchor then?
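The reviewer's hypothesis can be illustrated with a small sketch built on hypothetical mock types (MockDescriptor, MockDescriptorBuilder, MockPageSource and MakeFromAnchorSketch are invented here and are not ROOT code). It assumes that, before the source is attached, the builder holds a default-constructed descriptor with an empty name; whether the real descriptor builder behaves this way is exactly what still needs to be verified.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-ins for illustration only; NOT the real ROOT classes.
struct MockDescriptor {
   std::string fName; // empty until a header has been loaded (assumption)
   const std::string &GetName() const { return fName; }
};

struct MockDescriptorBuilder {
   MockDescriptor fDescriptor;
   const MockDescriptor &GetDescriptor() const { return fDescriptor; }
   // Stands in for whatever happens when the source is attached and the header is read.
   void LoadHeader(std::string_view name) { fDescriptor.fName = std::string(name); }
};

struct MockPageSource {
   std::string fNTupleName;
   MockDescriptorBuilder fDescriptorBuilder;
   explicit MockPageSource(std::string_view ntupleName) : fNTupleName(ntupleName) {}
};

// Mirrors the shape of the snippet under review: construct with "" and then
// copy the name from a builder that has not seen any header yet.
inline MockPageSource MakeFromAnchorSketch()
{
   MockPageSource source("");
   source.fNTupleName = source.fDescriptorBuilder.GetDescriptor().GetName();
   return source;
}
```

Under that assumption the assignment is a no-op: the name is empty before and after, and only a later attach step (LoadHeader in the mock) could supply a real name.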
Will be used by the RNTupleAttributes.
Also update the outdated comment about GetNEntries()
8cc71a8 to 60b27af
Add a bunch of functions to poke through the abstraction of the storage classes (specifically, to get the underlying TDirectory and RMiniFileReader). These functionalities (all Internal) will be needed by the RNTuple Attributes (see the full PR for details)
Checklist: