Delay pod deletion to handle deregistration delay #1775
Conversation
Welcome @foriequal0!
Hi @foriequal0. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: foriequal0. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
Codecov Report

|           | main   | #1775  | +/- |
|-----------|--------|--------|-----|
| Coverage  | 46.76% | 46.76% |     |
| Files     | 110    | 110    |     |
| Lines     | 5988   | 5988   |     |
| Hits      | 2800   | 2800   |     |
| Misses    | 2925   | 2925   |     |
| Partials  | 263    | 263    |     |

Continue to review full report at Codecov.
Pods are deleted during a Deployment rollout and then removed from Endpoints. When a Pod is deleted, the controller deregisters it from the TargetGroups and the ELB then drains its traffic. The delay between Pod deletion and the propagation of draining is a known cause of ELB 5xx errors: the ELB keeps sending traffic to terminated Pods, not knowing they are gone until they are completely deregistered.

We solve this problem by keeping Pods alive for a delay period. A well-known workaround is to attach a "sleep" command to the preStop lifecycle hook so the Pod stays alive for the delay. However, this is an ugly, duct-tape mitigation, and it can be difficult to hook a "sleep 9999" into some Deployments depending on their configuration.

In this commit, we use a ValidatingAdmissionWebhook to be notified about Pods that are about to be deleted, and we block their immediate deletion. This lets us do whatever we want with them while keeping them alive. We remove all of their labels, so they are removed from the Endpoints and deregistered from the TargetGroups. We also remove their ownerReferences so garbage collection doesn't kick in. Most importantly, they are still alive, so they can keep serving new traffic that arrives during the delay. After keeping them alive for the delay, we can safely delete them, since no new traffic should be arriving by then.
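For illustration, here is a minimal sketch of the preStop "sleep" workaround mentioned above. The Deployment name, image, and the 90-second figure are placeholders, not taken from this PR; the sleep should roughly match the target group's deregistration delay, and `terminationGracePeriodSeconds` has to be long enough to cover it:

```yaml
# Sketch of the well-known preStop workaround (illustrative values only).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                              # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      terminationGracePeriodSeconds: 120    # must exceed the sleep below
      containers:
        - name: app
          image: my-app:latest              # hypothetical image
          lifecycle:
            preStop:
              exec:
                # Keep the Pod alive roughly as long as the ELB
                # deregistration delay so in-flight and late-arriving
                # requests still succeed.
                command: ["sleep", "90"]
```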
I think we shouldn't remove all labels. Should I remove only the labels that are referenced by the Service's selector? Also, should I add an option to delay deletion until draining has fully finished (deregistration_delay.timeout_seconds?) for apps that don't support connection draining? Anyway, it works perfectly for me at this point. |
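For reference, a minimal sketch of how a validating webhook could be registered to intercept Pod DELETE requests, which is the mechanism the description above relies on. The configuration name, webhook name, service namespace/name, and path are hypothetical and not taken from this PR:

```yaml
# Sketch only: register a validating webhook for Pod DELETE requests so the
# controller can strip labels/ownerReferences and delay the actual deletion.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: pod-graceful-drain                        # hypothetical name
webhooks:
  - name: drain.pod.example.com                   # hypothetical
    admissionReviewVersions: ["v1"]
    sideEffects: NoneOnDryRun
    failurePolicy: Ignore                         # don't block deletions if the webhook is down
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
        operations: ["DELETE"]
    clientConfig:
      service:
        namespace: kube-system                    # hypothetical
        name: aws-load-balancer-webhook-service   # hypothetical
        path: /validate-pod-deletion              # hypothetical
```

With a registration like this in place, the webhook handler can deny the immediate deletion, remove the labels and ownerReferences as described above, and delete the Pod itself once the delay has elapsed.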
On reflection, this is hackier than I thought. I should make it a separate project. |
Came here after several days of trying to configure this thing JUST SO and still getting handfuls of 500s on every deploy. Thank you for this work; I really hope a clean way to make this work can be figured out. |
@mihasha I've made this a separate package here: https://github.com/foriequal0/pod-graceful-drain
You can test with the prebuilt Docker image: https://github.com/users/foriequal0/packages/container/aws-load-balancer-controller/901501

I've also forked eks-charts/aws-load-balancer-controller: https://github.com/foriequal0/eks-charts/tree/drain/stable/aws-load-balancer-controller

You can use these helm chart values:

```yaml
podGracefulDrainDelay: 90s
image:
  repository: ghcr.io/foriequal0/aws-load-balancer-controller
  tag: v2.1.1-2-geb716265
```

Please see https://github.com/foriequal0/pod-graceful-drain
closes: #1719
closes: #1065
related?: #1403