Fairer allocation of targets to target groups when quota exceeded #4025
Labels
good first issue
Denotes an issue ready for a new contributor, according to the "help wanted" guidelines.
kind/feature
Categorizes issue or PR as related to a new feature.
Describe the bug
Note: This may not always be considered a bug, but the behaviour in this abnormal condition could be improved.
The condition is hit when the AWS quota limit for the number of targets is reached. As we slowly scale up a k8s cluster that is the target for an NLB with multiple ports, each node that is added is registered (+1) in every target group/port. Because we are targeting the Instance, each target group (TG) holds one target per node. If we then exceed the target quota, but the nodes were added one-by-one, each target group ends up with roughly the same number of targets and 'things would look ok'.
Now suppose we have exceeded the quota, say by a factor of 4 (easily done when you have a target group multiplier), and we then start to replace nodes (an ASG image replacement, for example). The aws-loadbalancer-controller will fill up the first target group, then the second, then the third, and so on until the quota is exceeded. If the quota is exceeded while filling the second target group, for example, the third and fourth will be starved of targets and no traffic will be served on those ports.
For example, with a quota of 500 targets, a cluster with 250 nodes and an NLB with 4 ports will end up with 250 targets in TG1, 250 in TG2, 0 in TG3 and 0 in TG4.
A better/fairer way would be to add a new target across all target groups before adding the next target. This would result in a fairer allocation of the available target quota and prevent potential outages. The above example would then look like:
125 in TG1, 125 in TG2, 125 in TG3 and 125 in TG4.
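To make the arithmetic concrete, here is a minimal Go simulation of the two fill orders. It is only a sketch of the counting involved, not the controller's actual code: the function names `sequentialFill` and `roundRobinFill` are hypothetical, and the model assumes a single shared quota that is consumed one target registration at a time.

```go
package main

import "fmt"

// sequentialFill models the behaviour described above: every node is
// registered in group 0, then in group 1, and so on, until the shared
// quota is spent. (Hypothetical helper, not part of the controller.)
func sequentialFill(nodes, groups, quota int) []int {
	counts := make([]int, groups)
	remaining := quota
	for g := 0; g < groups; g++ {
		for n := 0; n < nodes && remaining > 0; n++ {
			counts[g]++
			remaining--
		}
	}
	return counts
}

// roundRobinFill models the proposed behaviour: each node is registered
// in every group before the next node is considered.
// (Hypothetical helper, not part of the controller.)
func roundRobinFill(nodes, groups, quota int) []int {
	counts := make([]int, groups)
	remaining := quota
	for n := 0; n < nodes; n++ {
		for g := 0; g < groups && remaining > 0; g++ {
			counts[g]++
			remaining--
		}
	}
	return counts
}

func main() {
	// 250 nodes, 4 target groups, quota of 500 targets.
	fmt.Println(sequentialFill(250, 4, 500)) // [250 250 0 0]
	fmt.Println(roundRobinFill(250, 4, 500)) // [125 125 125 125]
}
```

The only difference between the two functions is which loop is the outer one. Iterating over nodes first spreads the quota shortfall across all ports instead of concentrating it on the last ones.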
Steps to reproduce
Deploy an NLB with multiple ports/target groups, for example 5, targeting Instances from the LoadBalancer. Scale the cluster nodes up, then replace each node in the ASG. The LB controller will fill the first target groups until the quota is exceeded and then starve the rest.
Expected outcome
Targets should be added evenly across target groups so that, even when the quota is exceeded, each target group has an equal number of nodes.
Environment
Additional Context: