Skip to content

Conversation

@vepadulano
Copy link
Member

@vepadulano vepadulano commented Feb 21, 2025

Backport of #16173, #17544 and #17822 as needed by ATLAS, see 177th PPP meeting

Sometimes the `getxattr` call returns a valid string, but the string is
incomplete. Namely, part of the full URL to the file on EOS is missing (usually
the filename itself). Workaround by checking that the xurl string ends with the
filename and in case it is not present, avoid calling TFile::Open.

Co-authored-by: Jonas Hahnfeld <[email protected]>
@vepadulano vepadulano requested a review from dpiparo February 21, 2025 15:52
@vepadulano vepadulano self-assigned this Feb 21, 2025
@vepadulano vepadulano requested a review from pcanal as a code owner February 21, 2025 15:52
@vepadulano vepadulano changed the title [6.34] Fix interaction between RSampleInfo and redirected EOS paths [6.34] Silently fallback if full xroot path is not found on EOS Feb 21, 2025
@github-actions
Copy link

github-actions bot commented Feb 21, 2025

Test Results

    19 files      19 suites   4d 6h 38m 47s ⏱️
 2 685 tests  2 685 ✅ 0 💤 0 ❌
49 089 runs  49 089 ✅ 0 💤 0 ❌

Results for commit 8c05b8c.

♻️ This comment has been updated with latest results.

@vepadulano vepadulano requested a review from martamaja10 as a code owner March 3, 2025 12:38
When the input files are paths to FUSE-mounted EOS files, during the event loop
these paths will be redirected to the corresponding xroot EOS URL, in
TFile::Open.

This was causing a bad interaction with the sample metadata and the subsequent
usage in the event loop, e.g. through DefinePerSample. Specifically, we fill a
map with the metadata at construction time, which includes the input file paths
(without redirection). These will then be irretrievable during the event loop
since the map key will not correspond to the redirected path. Fix this by also
adding the redirected map during the filling of the map in ChangeSpec.
@vepadulano vepadulano force-pushed the tfile-open-eos-fallback-v634 branch from 24f0c29 to 8158b40 Compare March 3, 2025 12:51
vepadulano and others added 3 commits March 5, 2025 09:55
Fixes root-project#17820

Full analysis of the issue follows:

1. TChain::GetEntries calls TChain::LoadTree(TTree::kMaxEntries - 1)

2. Internally, this will set fReadEntry == -2, due to how LoadTree works and the
   fact it's trying to read beyond the last entry in the chain

3. This has an effect here https://github.com/root-project/root/blob/f33985dca2d6bc505e83498f7b4561ddd969be17/tree/tree/src/TChain.cxx#L1675-L1679
   where `t` is the current friend being traversed in the list of friends of the *chain*.
   We call `t->LoadTree(TTree::kMaxEntries - 1)`.

4. Inside the call to LoadTree on the friend, we reach this line
   https://github.com/root-project/root/blob/f33985dca2d6bc505e83498f7b4561ddd969be17/tree/tree/src/TChain.cxx#L1484
   where the fTree data member of the friend is invalidated.

5. Back to the LoadTree of the main chain, these lines
   https://github.com/root-project/root/blob/f33985dca2d6bc505e83498f7b4561ddd969be17/tree/tree/src/TChain.cxx#L1681-L1684
   would normally connect the fTree data member of the main chain to the current friend, so the list of friends
   is properly populated. But due to 4., `friend->GetTree()` will return nullptr so the connection won't happen.

This commit changes the implementation of TChain::GetEntries, so that
after computing the total number of entries the chain is brought back
to an invalid but recoverable state, and if the previous read entry was
at least zero, the cursor is moved to it.

Co-authored-by: Philippe Canal <[email protected]>
Avoid calling LoadTree when it will be a no-op and especially don't undo work we wont be able to
redo
Calling GetEntriesFast is enough to know if you reach past the end (if we are in the last
file then it will return the precise number or something too large if we are far).
Calling GetEntries can move the cursor of the chain and thus change the state of the parent.
@vepadulano vepadulano requested a review from bellenot as a code owner March 5, 2025 08:57
@vepadulano vepadulano changed the title [6.34] Silently fallback if full xroot path is not found on EOS [6.34] Backports for ATLAS FastFrames framework Mar 5, 2025
@vepadulano vepadulano merged commit 00afa9b into root-project:v6-34-00-patches Mar 5, 2025
41 of 42 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants