HTCondor site adapter fails on terminating an already terminated resource #82
Comments
The easiest way to fix this is to check if
tardis/tardis/adapters/sites/htcondor.py Lines 165 to 178 in 78b24be
tardis/tardis/adapters/sites/htcondor.py Lines 176 to 177 in 78b24be
CommandExecutionFailure is always thrown if the exit code is different from 0.
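To illustrate the point about non-zero exit codes, here is a minimal sketch of a terminate call that tolerates a failure when the resource is already gone. Everything in it is an assumption made for illustration: the `CommandExecutionFailure` class is a stand-in with hypothetical fields, and `fake_executor`, `terminate` and the `still_exists` callback are hypothetical helpers, not the TARDIS API.

```python
class CommandExecutionFailure(Exception):
    """Stand-in for the exception named in this thread (fields are hypothetical)."""

    def __init__(self, message, exit_code):
        super().__init__(message)
        self.exit_code = exit_code


def fake_executor(command):
    """Hypothetical executor: raises on any non-zero exit code."""
    if command == "condor_rm 42":  # pretend cluster 42 was already removed
        raise CommandExecutionFailure("condor_rm failed", exit_code=1)
    return f"removed via: {command}"


def terminate(cluster_id, still_exists):
    """Terminate a resource, tolerating a failure if it is already gone.

    still_exists is a hypothetical callback that reports whether the
    batch system still knows about the given cluster id.
    """
    command = f"condor_rm {cluster_id}"
    try:
        return fake_executor(command)
    except CommandExecutionFailure:
        if still_exists(cluster_id):
            raise  # a real failure: the resource is still there
        return "already terminated"


print(terminate(7, lambda cid: True))
print(terminate(42, lambda cid: False))
```

The design choice here is to re-raise unless there is positive evidence that the resource no longer exists, so that genuine removal failures are not silently swallowed.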
@olifre, @wiene: Could you try to call Thanks,
@giffels, here is the requested (surprising) test result:
Would it be possible to patch your installation in the following way, please?
tardis/tardis/adapters/sites/htcondor.py Lines 137 to 139 in cbec081
=>

    pattern = re.compile(r"^.*?(?P<ClusterId>\d+).*$", flags=re.MULTILINE)
    try:
        response = AttributeDict(pattern.search(response.stdout).groupdict())
    except AttributeError:
        logging.error(f"Pattern search failed. Output of {terminate_command} is: {response}")
        raise
    return self.handle_response(response)

and send us the output of the log entry? Thanks and best regards,
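For readers following along, a small standalone sketch shows why the patch catches `AttributeError`: `pattern.search` returns `None` when the output contains no digits, and calling `.groupdict()` on `None` raises that exception. The regular expression is the one from the patch. The helper `extract_cluster_id` and the sample output strings are assumptions made up for illustration, not captured output from a real installation.

```python
import re

# Same pattern as in the proposed patch.
pattern = re.compile(r"^.*?(?P<ClusterId>\d+).*$", flags=re.MULTILINE)


def extract_cluster_id(stdout):
    """Hypothetical helper: return the first run of digits, or None."""
    match = pattern.search(stdout)
    if match is None:
        # This is the case in which the patched code would log and re-raise.
        return None
    return match.group("ClusterId")


# Illustrative strings only (assumed, not real log output).
with_id = "All jobs in cluster 1234 have been marked for removal\n"
without_id = ""

print(extract_cluster_id(with_id))
print(extract_cluster_id(without_id))
```

An explicit `None` check like this is an alternative to catching `AttributeError`. Either way the failing output can be logged before the error propagates.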
@giffels, I do not know whether this is good or bad news, but we are not able to reproduce the problem anymore.
In case a resource has been released already by an operator, TARDIS seems to fail releasing it again. See stack trace:
Thanks to Peter for reporting.