-
Notifications
You must be signed in to change notification settings - Fork 106
make delete volume idempotent when node is removed from cluster #413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
1671fcc to
9b11bec
Compare
Abhinandan-Purkait
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
pkg/lvm/volume.go
Outdated
| err = volbuilder.NewKubeclient().WithNamespace(LvmNamespace).Delete(volumeID) | ||
| if err == nil { | ||
| klog.Infof("deprovisioned volume %s", volumeID) | ||
| klog.Infof("deprovisioning volume %s, marking deletion timestamp", volumeID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do we mean by marking deletion timestamp, here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it means the deletion timestamp is being set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we call delete which just marks deletion timestamp as we have finaliser.
Shall i remove marking deletion timestamp ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep that bit out since it's implied, but fine either way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove it.
tiagolobocastro
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about existing volumes
pkg/driver/controller.go
Outdated
| } | ||
| present := isNodePresent(vol.Spec.OwnerNodeID) | ||
| if !present { | ||
| klog.Infof("Removing finalizer as node %s is not present in cluster", vol.Spec.OwnerNodeID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a warning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, will make it warning
|
|
||
| _, err := volbuilder.NewKubeclient().WithNamespace(LvmNamespace).Update(vol) | ||
| if err != nil { | ||
| klog.Infof("Finalizer removed successfully for %s", vol.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this as info or is debug fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will not reach this code if node is present. Would be better to have it in log by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we already have the message from above where we call it if the node is not present?
Oh also this check seems reversed, if err != nil then we didn't remove the finalizer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, err != nil means we fail to remove finalizer, we return err. If not, we move on to check if lvmvolume CR is deleted.
pkg/lvm/volume.go
Outdated
| err = volbuilder.NewKubeclient().WithNamespace(LvmNamespace).Delete(volumeID) | ||
| if err == nil { | ||
| klog.Infof("deprovisioned volume %s", volumeID) | ||
| klog.Infof("deprovisioning volume %s, marking deletion timestamp", volumeID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it means the deletion timestamp is being set?
| "failed to handle delete volume request for {%s}", volumeID) | ||
| } | ||
| } | ||
| present := isNodePresent(vol.Spec.OwnerNodeID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if the finalizer is already removed? Then we don't need to run this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
finalizer is removed from the node-agent when lvremove succeeds. Here we will validate if node is not present. If not, finalizer will not be removed. so will remove finalizer from here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the lvm volume has additional finalisers (eg: added by user) but we've already removed ours, on a previous csi call, then we don't need to do this again
We only need to do this stage if our finaliser is present
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now this removes all Finalizers. I am using the function which is called by node-agent once lvremove is done.
https://github.com/openebs/lvm-localpv/blob/develop/pkg/lvm/volume.go#L217C1-L223C2
lvmvolume CR is managed by localpv-lvm. Users may not add any finalizers onto it.
|
|
||
| _, err := volbuilder.NewKubeclient().WithNamespace(LvmNamespace).Update(vol) | ||
| if err != nil { | ||
| klog.Infof("Finalizer removed successfully for %s", vol.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we already have the message from above where we call it if the node is not present?
Oh also this check seems reversed, if err != nil then we didn't remove the finalizer?
pkg/lvm/volume.go
Outdated
| err = volbuilder.NewKubeclient().WithNamespace(LvmNamespace).Delete(volumeID) | ||
| if err == nil { | ||
| klog.Infof("deprovisioned volume %s", volumeID) | ||
| klog.Infof("deprovisioning volume %s, marking deletion timestamp", volumeID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep that bit out since it's implied, but fine either way
| "failed to handle delete volume request for {%s}", volumeID) | ||
| } | ||
| } | ||
| present := isNodePresent(vol.Spec.OwnerNodeID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the lvm volume has additional finalisers (eg: added by user) but we've already removed ours, on a previous csi call, then we don't need to do this again
We only need to do this stage if our finaliser is present
pkg/driver/controller.go
Outdated
| if !present { | ||
| klog.Infof("Removing finalizer as node %s is not present in cluster", vol.Spec.OwnerNodeID) | ||
| if err = lvm.RemoveVolFinalizer(vol); err != nil { | ||
| return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we return err here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fail the DeleteVolume call? I guess that will be better.
da04e2d to
d766fd7
Compare
…e volume is not present Signed-off-by: Abhilash Shetty <[email protected]>
d766fd7 to
43a5f2c
Compare
When a Kubernetes node is removed from the cluster while it still has existing PVCs/LVMVolumes, those volumes become orphaned. If a PVC is deleted in this state, the LocalPV-LVM CSI controller marks the corresponding LVMVolume CR for deletion by adding a deletion timestamp.
Normally, the node-agent running on the worker node is responsible for cleaning up the logical volume (LV) and removing the finalizer from the CR. However, in this scenario, the node no longer exists, so there is no agent running to perform the cleanup.
This fix leverages the condition of the Kubernetes Node being absent: in such cases, the CSI controller itself removes the finalizer and cleans up the LVMVolume CR.
Fixes: #407