Integrate Karpenter with AWS Health Checks (EC2 instances, EBS volumes, etc.) #7634
Comments
Would node repair address your issue here? https://docs.aws.amazon.com/eks/latest/userguide/node-health.html That said, I don't believe it was backported or that there are plans to do so. Karpenter responds to these events starting in v1.1. @engedaam to confirm.
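For anyone landing here: node repair ships behind a feature gate starting in v1.1. A hedged sketch of enabling it through the Karpenter Helm chart — the `settings.featureGates.nodeRepair` value name is assumed from the chart's feature-gate conventions, so verify it against the values.yaml of your chart version:

```shell
# Assumed Helm value; check your chart version's values.yaml before relying on it.
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version 1.1.0 \
  --namespace kube-system \
  --set settings.featureGates.nodeRepair=true
```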
This issue has been inactive for 14 days. StaleBot will close this stale issue after 14 more days of inactivity.
Outside of NodeRepair, I think we should check whether there are gaps in NodeRepair that would require us to periodically poll instances for health checks. It shouldn't be particularly difficult to hook into these checks; it's mostly a question of whether this is necessary given the NodeRepair feature. One advantage I can see is that a health check failure on the EC2 instance could be acted on much more quickly than the optimistic NotReady check, which requires waiting 30m before terminating the node. Also, marking this as a feature because (to me) this is about doing a health check integration for EC2 instances in the AWS provider.
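To make the polling idea above concrete, here is a minimal sketch of the decision logic such a hook might apply. The dict shape mirrors what `ec2:DescribeInstanceStatus` returns (`SystemStatus` for host-level checks, `InstanceStatus` for guest-level checks); the sample data and the `unhealthy_instances` helper are hypothetical, and a real integration would fetch the statuses via an AWS SDK call rather than a literal:

```python
def unhealthy_instances(statuses):
    """Return instance IDs whose system or instance status check is not 'ok'.

    `statuses` is a list of dicts shaped like the InstanceStatuses entries
    of a DescribeInstanceStatus response.
    """
    bad = []
    for s in statuses:
        system_ok = s["SystemStatus"]["Status"] == "ok"
        instance_ok = s["InstanceStatus"]["Status"] == "ok"
        if not (system_ok and instance_ok):
            bad.append(s["InstanceId"])
    return bad


# Hypothetical payload shaped like a DescribeInstanceStatus response:
statuses = [
    {"InstanceId": "i-0abc", "SystemStatus": {"Status": "ok"},
     "InstanceStatus": {"Status": "ok"}},
    {"InstanceId": "i-0def", "SystemStatus": {"Status": "impaired"},
     "InstanceStatus": {"Status": "ok"}},
]
print(unhealthy_instances(statuses))  # ['i-0def']
```

Instances flagged this way could be terminated immediately rather than waiting out the 30m NotReady window, which is the potential gain over NodeRepair alone.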
Description
Observed Behavior: We noticed that a node in EC2 is unreachable and failing health checks, but Karpenter is not terminating it.
Expected Behavior: Karpenter should terminate the node if it is not reachable.
Reproduction Steps (Please include YAML): Not sure how, since the node failed health checks.
Terminating state

Versions:
- Kubernetes Version (`kubectl version`): 1.29.8