Describe the bug
As seen in WMAgent 2.3.4, the WorkflowUpdater component crashes when it cannot find a Rucio DID on the Rucio server. A traceback from the component is included below [1].
I have only seen this in agents connected to Rucio Integration, but it will eventually happen in the production agents too.
How to reproduce it
Run a workflow requesting a pileup that either no longer exists, or exists only in a different Rucio instance.
Expected behavior
We should log a clear ERROR message and move on to the other pileup containers that might have to be updated in that cycle. In other words, we should not:
crash the component
or skip the whole cycle
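The expected behavior above could be sketched roughly as follows. This is only an illustration, not the actual WorkflowUpdater code: `DIDNotFoundError` is a hypothetical stand-in for the `WMRucioDIDNotFoundException` seen in the traceback, and `resolve_pileup_blocks` and its `fetch_blocks` callable are assumed names for whatever the poller uses to resolve blocks for a container.

```python
import logging


class DIDNotFoundError(Exception):
    """Hypothetical stand-in for WMCore's WMRucioDIDNotFoundException."""


def resolve_pileup_blocks(containers, fetch_blocks, logger=None):
    """Resolve blocks for each pileup container, one container at a time.

    A container that Rucio cannot find is logged at ERROR level and skipped,
    so the remaining containers in the same cycle are still processed.
    Returns a tuple (resolved, missing).
    """
    logger = logger or logging.getLogger(__name__)
    resolved = {}
    missing = []
    for container in containers:
        try:
            resolved[container] = fetch_blocks(container)
        except DIDNotFoundError as exc:
            # Do not re-raise: a missing DID must not abort the whole cycle.
            logger.error(
                "Pileup container not found in Rucio, skipping it: %s. Details: %s",
                container,
                exc,
            )
            missing.append(container)
    return resolved, missing
```

Catching only the "DID not found" exception keeps other unexpected errors visible, while the returned `missing` list lets the caller decide what to do with workflows that reference those containers.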
Additional context and error message
[1]
2024-06-28 03:20:39,561:139769515988736:INFO:WorkflowUpdaterPoller:Fetching blocks for pileup container: /Neutrino_E-10_gun/Run3Summer21PrePremix-Summer22_124X_mcRun3_2022_realistic_v11-v2/PREMIX
2024-06-28 03:20:39,743:139769515988736:INFO:Timers:Rucio block resolution took 0.343 seconds to complete
2024-06-28 03:20:39,743:139769515988736:ERROR:WorkflowUpdaterPoller:Caught unexpected exception in WorkflowUpdater. Details:
<@========== WMException Start ==========@>
Exception Class: WMRucioDIDNotFoundException
Message: Data identifier not found in Rucio: /Neutrino_E-10_gun/Run3Summer21PrePremix-Summer22_124X_mcRun3_2022_realistic_v11-v2/PREMIX. Error: Data identifier not found.
Details: Data identifier 'cms:/Neutrino_E-10_gun/Run3Summer21PrePremix-Summer22_124X_mcRun3_2022_realistic_v11-v2/PREMIX' not found
ClassName : None
ModuleName : WMCore.Services.Rucio.Rucio
MethodName : isContainer
ClassInstance : None
FileName : /usr/local/lib/python3.8/site-packages/WMCore/Services/Rucio/Rucio.py
LineNumber : 801
ErrorNr : 0
Traceback:
File "/usr/local/lib/python3.8/site-packages/WMCore/Services/Rucio/Rucio.py", line 798, in isContainer
response = self.cli.get_did(scope=scope, name=didName)
File "/usr/local/lib/python3.8/site-packages/rucio/client/didclient.py", line 424, in get_did
raise exc_cls(exc_msg)
<@---------- WMException End ----------@>
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/WMCore/Services/Rucio/Rucio.py", line 798, in isContainer
response = self.cli.get_did(scope=scope, name=didName)
File "/usr/local/lib/python3.8/site-packages/rucio/client/didclient.py", line 424, in get_did
raise exc_cls(exc_msg)
rucio.common.exception.DataIdentifierNotFound: Data identifier not found.
Details: Data identifier 'cms:/Neutrino_E-10_gun/Run3Summer21PrePremix-Summer22_124X_mcRun3_2022_realistic_v11-v2/PREMIX' not found
Impact of the bug
WMAgent