While this is not the only way this can happen, we think the most likely cause for a database entry for a file with zero replicas was the following sequence of events:
A number of files were lost on disk.
The file catalogue was tidied up using dirac-dms-remove-catalog-replicas, specifying the misbehaving storage element.
If a lost file was the last replica, dirac-dms-remove-catalog-replicas leaves an entry behind in the database.
This entry can be removed with dirac-dms-remove-catalog-files. However, in the (all too common) scenario of files being lost at a specific storage element, dirac-dms-remove-catalog-replicas seems the obvious tool for tidying up the catalogue, as other replicas are unaffected. The user would then have to check separately whether the removed replica was the last one, and they will not be aware of this. The differences in output are also rather subtle (see below) and do not indicate a problem, even to an experienced admin.
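To make the failure mode concrete, here is a minimal sketch of how an entry with zero replicas can come about. It is a toy in-memory model, not DIRAC code: the `ToyCatalog` class, its methods, and the storage element name `SE-A` are all hypothetical and only mimic the behaviour described above.

```python
# Toy in-memory model of a file catalog. This is NOT the DIRAC implementation;
# the class, its methods and the SE name "SE-A" are hypothetical.
class ToyCatalog:
    def __init__(self):
        # Maps each LFN to the set of storage elements holding a replica.
        self.entries = {}

    def add_replica(self, lfn, se):
        self.entries.setdefault(lfn, set()).add(se)

    def remove_replica(self, lfn, se):
        # Mimics the reported behaviour: only the replica is dropped,
        # the LFN entry stays even when no replicas are left.
        self.entries[lfn].discard(se)

    def remove_file(self, lfn):
        # Mimics removing the whole catalog entry.
        self.entries.pop(lfn, None)

    def zero_replica_lfns(self):
        # Entries that exist in the catalog but have no replicas at all.
        return sorted(lfn for lfn, ses in self.entries.items() if not ses)


catalog = ToyCatalog()
catalog.add_replica("/t2k.org/user/d/reptest.txt", "SE-A")
catalog.remove_replica("/t2k.org/user/d/reptest.txt", "SE-A")
orphans = catalog.zero_replica_lfns()
print(orphans)
```

In this model the extra check the user would need amounts to calling `zero_replica_lfns()` after every replica cleanup and removing whatever it returns.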
This scenario can be reproduced with the following sequence:
(base) gridpp_py3 > dirac-dms-lfn-replicas /t2k.org/user/d/reptest.txt
No output
That's rather subtle and you only notice that something is amiss once you realize that the output for a truly non-existent LFN is different:
(base) gridpp_py3 > dirac-dms-lfn-replicas /t2k.org/user/d/reptest.txt0
LFN StorageElement URL
===============================================
/t2k.org/user/d/reptest.txt0 Unknown No such file or directory
Apply the nuclear option:
(base) gridpp_py3 > dirac-dms-remove-catalog-files /t2k.org/user/d/reptest.txt
Successfully removed 1 catalog files.
(base) lx04:2023_May_16_1234_gridpp_py3 > dirac-dms-lfn-replicas /t2k.org/user/d/reptest.txt
LFN StorageElement URL
==============================================
/t2k.org/user/d/reptest.txt Unknown No such file or directory
There are probably other ways database entries can get into this state, but this is one of the more likely scenarios.
Can you please fix the following issues:
dirac-dms-remove-catalog-replicas should delete the file catalog entry if the replica it removes is the last of its kind (1 bonus point)
Instead of returning "No output" dirac-dms-lfn-replicas should return an error if there are zero replicas of a file as this is an error state (i.e. not foreseen in the DIRAC code) (1 bonus point)
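As a rough illustration of the second request, the sketch below treats zero replicas as an error rather than printing nothing. It works on a plain dictionary whose layout (`OK`, `Value`, `Successful`, `Failed`, `Message`) is an assumption modelled on DIRAC-style return values, and it further assumes that an entry with zero replicas shows up under `Successful` with an empty mapping. The function name `check_replicas` is hypothetical and nothing here is taken from the DIRAC source.

```python
def check_replicas(result, lfn):
    """Return (ok, message) for one LFN in a replica query result.

    `result` is assumed to look like
    {"OK": True, "Value": {"Successful": {lfn: {se: url}}, "Failed": {lfn: error}}}.
    This layout is an assumption made for the sketch.
    """
    if not result.get("OK"):
        return False, f"catalog query failed: {result.get('Message', 'unknown error')}"
    value = result["Value"]
    failed = value.get("Failed", {})
    if lfn in failed:
        # The catalog does not know this LFN (or the lookup failed for it).
        return False, f"{lfn}: {failed[lfn]}"
    successful = value.get("Successful", {})
    if lfn not in successful:
        return False, f"{lfn}: not present in the query result"
    if not successful[lfn]:
        # The LFN is registered but has no replicas: report it as an error.
        return False, f"{lfn}: catalog entry exists but has zero replicas"
    return True, f"{lfn}: {len(successful[lfn])} replica(s)"


lfn = "/t2k.org/user/d/reptest.txt"
empty = {"OK": True, "Value": {"Successful": {lfn: {}}, "Failed": {}}}
ok, message = check_replicas(empty, lfn)
print(ok, message)
```

With a check along these lines, the zero-replica case and the truly non-existent LFN would both produce a visible error, just with different messages.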
Once your bonus stamp card is full, you can claim your free beer.
Sorry, my google filter ate the reply.
dirac-dms-remove-replicas was ruled out as it would not be able to remove the replica as the file is already gone on disk.
Related to #7075
In the reproduction sequence, the file is first deleted on disk to simulate a storage meltdown. Cleaning up the catalog afterwards removes the replica, but the catalog still knows the LFN.
Tagging @sfayer so he knows it's all filed.