[NSFS | NC | Glacier] Fix migrate hang under race condition #9244
+2
−0
Describe the Problem
The Tape team discovered that `noobaa-cli glacier migrate` and `noobaa-cli glacier restore` would sometimes hang indefinitely. Investigation showed that only the migrate process was actually hanging, but because it had acquired a cluster-wide lock, it was blocking the restore command as well.

The migrate process was hanging because it was trying to acquire an `EXCLUSIVE` lock on the migration lock, but one of the NooBaa server processes was not releasing that lock. We have a detection mechanism in persistent_logger which must release the lock eventually; however, this detection was failing. The root cause of the failure was that in an earlier PR of mine (#9183), I missed updating the `fh_stat` information, which is used by a part of this detection mechanism.
Explain the Changes
This PR updates the code to ensure that the `fh_stat` value is updated correctly.
Issues: Fixed #xxx / Gap #xxx
Testing Instructions:
1. Set `NOOBAA_LOG_LEVEL="all"` in config.
2. Run `rm -rf <LOG_DIR>/migrate.log && touch <LOG_DIR>/migrate.log`.
3. Check the logs for `active file changed, closing for namespace: migrate`.
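
For reviewers who want to see the idea in isolation, here is a minimal sketch of how a "did the active file change under me" check can work, and why the cached `fh_stat` has to be kept up to date. This is not the persistent_logger code: the class `SketchLogger`, its methods, and the `demo` helper are hypothetical names made up for illustration, and comparing inode numbers is an assumption about how such a check is typically done. The sketch mirrors the test above: the file is removed and recreated while a handle is still open, and the check is expected to notice and close the handle.

```javascript
'use strict';
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical stand-in for a logger that keeps its active file open.
class SketchLogger {
    constructor(namespace, file_path) {
        this.namespace = namespace;
        this.file_path = file_path;
        this.fd = null;
        this.fh_stat = null;
    }

    // Open the active file and record the stat of the handle we now hold.
    // Skipping the fh_stat update here would leave the check below with
    // nothing (or something stale) to compare against.
    open() {
        this.fd = fs.openSync(this.file_path, 'a');
        this.fh_stat = fs.fstatSync(this.fd);
    }

    close() {
        if (this.fd !== null) fs.closeSync(this.fd);
        this.fd = null;
        this.fh_stat = null;
    }

    // Returns true (and closes the handle) if the path no longer points
    // at the file we hold open, judged by inode number.
    check_active_file_changed() {
        if (this.fd === null || this.fh_stat === null) return false; // nothing to compare
        let path_stat = null;
        try {
            path_stat = fs.statSync(this.file_path);
        } catch (err) {
            if (err.code !== 'ENOENT') throw err; // a missing file counts as changed
        }
        const changed = path_stat === null || path_stat.ino !== this.fh_stat.ino;
        if (changed) {
            console.log(`active file changed, closing for namespace: ${this.namespace}`);
            this.close();
        }
        return changed;
    }
}

// Reproduces the rm + touch scenario against a temporary directory.
function demo() {
    const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'sketch-log-'));
    const file_path = path.join(dir, 'migrate.log');
    fs.writeFileSync(file_path, '');

    const logger = new SketchLogger('migrate', file_path);
    logger.open();
    const before = logger.check_active_file_changed();

    fs.rmSync(file_path);            // rm -rf migrate.log
    fs.writeFileSync(file_path, ''); // touch migrate.log
    const after = logger.check_active_file_changed();

    const result = { before, after, fd_closed: logger.fd === null };
    fs.rmSync(dir, { recursive: true, force: true });
    return result;
}

console.log(demo());
```

In this sketch, a handle whose `fh_stat` was never recorded falls into the "nothing to compare" branch, so the check silently reports no change and the handle stays open. That is one plausible shape for the failure described above, not a statement of what the real code did.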