MJF in TimeLeft #7073
Comments
Before investigating the code, I am surprised that:
LHCb was also ticketed: https://ggus.eu/index.php?mode=ticket_info&ticket_id=162429
As far as TimeLeft is concerned, MJF is used only as a fallback when nothing else is found, so in practice it is not used when the batch system is known. When this was initially coded, we planned to switch the priority once MJF had been deployed more or less everywhere, but that never happened and the MJF project died a slow death. I will reply in LHCb's ticket.
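To make the fallback order concrete, here is a minimal sketch of that kind of selection. It is not the actual TimeLeft code: the function name `pick_time_source` and the `BATCH_MARKERS` table are hypothetical, and the environment variables listed are only assumed markers for each batch system.

```python
from typing import Mapping, Optional

# Assumed, illustrative markers: one environment variable per batch system.
# The real detection logic in DIRAC may differ.
BATCH_MARKERS = {
    "LSF": "LSB_JOBID",
    "PBS": "PBS_JOBID",
    "SLURM": "SLURM_JOB_ID",
    "HTCondor": "_CONDOR_JOB_AD",
}


def pick_time_source(env: Mapping[str, str]) -> Optional[str]:
    """Return the name of the source used to compute the time left.

    A known batch system always wins; MJF is considered only when
    no batch system marker is present in the environment.
    """
    for name, variable in BATCH_MARKERS.items():
        if variable in env:
            return name
    # Fallback: MJF is used only if both feature locations are advertised.
    if "MACHINEFEATURES" in env and "JOBFEATURES" in env:
        return "MJF"
    return None
```

With this ordering, a pilot that sees both an HTCondor marker and the MJF variables ends up on the HTCondor branch and never reads the MJF values.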
Just to say that in the UK it's not used, apparently not even by Manchester who invented it.
I have read https://ggus.eu/index.php?mode=ticket_info&ticket_id=162429 and https://twiki.cern.ch/twiki/bin/view/LCG/WLCGOpsMinutes230608#HEPScore_status_update. I understand MJF was abandoned because the "numbers being published on the WNs are too unreliable in practice". Maybe it is still worthwhile to respect the "downtime" value, since it is usually not filled in?
See https://ggus.eu/index.php?mode=ticket_info&ticket_id=162431
DESY claims that our DIRAC pilots do not respect the MACHINEFEATURES/shutdowntime value they set,
referring to these documents from early 2016:
https://hepsoftwarefoundation.org/notes/HSF-TN-2016-02.pdf
https://twiki.cern.ch/twiki/bin/view/LCG/WMTEGEnvironmentVariables
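For reference, a minimal sketch of how a pilot could read that value, assuming `$MACHINEFEATURES` points to a local directory and the `shutdowntime` file holds a UNIX timestamp in seconds, as I understand the technical note. The function name `seconds_until_shutdown` is hypothetical, and the case where `$MACHINEFEATURES` is a URL is not handled here.

```python
import os
import time
from pathlib import Path
from typing import Optional


def seconds_until_shutdown(
    features_dir: Optional[str] = None, now: Optional[float] = None
) -> Optional[int]:
    """Return the seconds left before the advertised shutdown.

    Returns None when no usable shutdowntime is published
    (variable unset, file missing, or content not an integer).
    """
    features_dir = features_dir or os.environ.get("MACHINEFEATURES")
    if not features_dir:
        return None
    try:
        # Assumption: the file contains a single UNIX timestamp in seconds.
        shutdown = int(Path(features_dir, "shutdowntime").read_text().strip())
    except (OSError, ValueError):
        return None
    now = time.time() if now is None else now
    # A shutdown time already in the past means no time is left.
    return max(0, int(shutdown - now))
```

A pilot could then take the minimum of this value and whatever the batch system reports, so that a shutdown set by the site is respected even when the batch system is known.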
Looking into the code:
DIRAC/src/DIRAC/Resources/Computing/BatchSystems/TimeLeft/TimeLeft.py
Line 146 in 1b36402
MJFResourceUsage seems to be used only when the batch system is unknown, is that correct?
The pilots running at DESY find that the batch system is HTCondor, and the log reads
There have been some discussions in the past:
#4544 JobAgent TimeLeft computation: definitions, multi-core environments, batch system based on wallclock time
#4788 HTCondor TimeLeft module
Is MJF intentionally not used by the pilot jobs on HTCondor?
Not only for getting the wallclock time limit, but even for downtime?