MGM/FST: Adding retry mechanism for failed CTA Frontend DNS resolution
Joao Afonso authored and Elvin Alin Sindrilaru committed Nov 26, 2024
1 parent c6953f2 commit 64839cf
Showing 3 changed files with 13 additions and 3 deletions.
2 changes: 1 addition & 1 deletion common/xrootd-ssi-protobuf-interface
Submodule xrootd-ssi-protobuf-interface updated from ccadd7 to b2b8ac
7 changes: 6 additions & 1 deletion fst/XrdFstOfsFile.cc
@@ -3893,7 +3893,12 @@ XrdFstOfsFile::NotifyProtoWfEndPointClosew(uint64_t file_id,
   // If static initialization throws an exception, it will be retried next time
   static XrdSsiPbServiceType service(endPoint, resource, config);
   auto sentAt = std::chrono::steady_clock::now();
-  service.Send(request, response);
+  try {
+    service.Send(request, response, false);
+  } catch (std::runtime_error& err) {
+    eos_static_err("Could not send request to outside service. Retrying with DNS cache refresh.");
+    service.Send(request, response, true);
+  }
   auto receivedAt = std::chrono::steady_clock::now();
   auto timeSpent = std::chrono::duration_cast<std::chrono::milliseconds>
                    (receivedAt - sentAt);
7 changes: 6 additions & 1 deletion mgm/WFE.cc
@@ -2800,7 +2800,12 @@ WFE::Job::SendProtoWFRequest(Job* jobPtr, const std::string& fullPath,
   // Send the request
   try {
     const auto sentAt = std::chrono::steady_clock::now();
-    service.Send(request, response);
+    try {
+      service.Send(request, response, false);
+    } catch (std::runtime_error& err) {
+      eos_static_err("msg=\"Could not send SSI protocol buffer request to outside service. Retrying with DNS cache refresh.\"");
+      service.Send(request, response, true);
+    }
     const auto receivedAt = std::chrono::steady_clock::now();
     const auto timeSpentMilliseconds =
       std::chrono::duration_cast<std::chrono::milliseconds> (receivedAt - sentAt);
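
Both hunks apply the same pattern: send once without a refresh and, if a `std::runtime_error` is thrown, log the failure and send once more with the third argument set to `true`. The standalone sketch below illustrates that control flow only. `FakeService` and `SendWithDnsRetry` are hypothetical names that do not exist in EOS or in xrootd-ssi-protobuf-interface, and reading the third `Send` argument as "refresh the DNS cache" is an assumption taken from the commit title and log messages, not from the submodule's source.

```cpp
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the SSI service object; NOT the real
// XrdSsiPbServiceType. It simulates a stale DNS entry that is only
// cleared when the caller asks for a refresh.
class FakeService {
public:
  explicit FakeService(bool staleDns) : mStaleDns(staleDns) {}

  // Assumed meaning of the third argument: refresh the (simulated)
  // DNS cache before sending.
  void Send(const std::string& request, std::string& response,
            bool refreshDns) {
    ++mAttempts;
    if (refreshDns) {
      mStaleDns = false;
    }
    if (mStaleDns) {
      throw std::runtime_error("DNS resolution failed");
    }
    response = "ack:" + request;
  }

  int Attempts() const { return mAttempts; }

private:
  bool mStaleDns;
  int mAttempts = 0;
};

// First attempt without a refresh; on std::runtime_error, exactly one
// retry with a refresh. An exception thrown by the retry is not caught
// here and propagates to the caller, as in the diff.
template <typename Service>
void SendWithDnsRetry(Service& service, const std::string& request,
                      std::string& response) {
  try {
    service.Send(request, response, false);
  } catch (const std::runtime_error& err) {
    std::cerr << "Send failed (" << err.what()
              << "), retrying with DNS cache refresh\n";
    service.Send(request, response, true);
  }
}
```

In this sketch the retry is bounded to a single extra attempt, so a persistent failure still surfaces through whatever handler surrounds the call, such as the outer `try` visible in the `mgm/WFE.cc` hunk.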
