Routes are not cleaned after scale down/node removal via cluster-autoscaler #734

pat-s opened this issue Sep 2, 2024 · 10 comments

pat-s commented Sep 2, 2024

TL;DR

Routes created for nodes are not cleaned up after the nodes are removed by cluster-autoscaler during scale down (see title).

Expected behavior

Routes are removed once the corresponding node is deleted.

Observed behavior

Routes are not removed and accumulate in the account, leading to node startup failures once the route limit (100?) is reached.

Minimal working example

No response

Log output

No response

Additional information

Initially posted in kubernetes/autoscaler#7227

pat-s added the bug label Sep 2, 2024

apricote commented Sep 3, 2024

Hey @pat-s,

could you post the logs of hcloud-cloud-controller-manager and its configuration? The networking part of the configuration is of particular interest.


pat-s commented Sep 3, 2024

v1.20.0 running with

    Command:
      /bin/hcloud-cloud-controller-manager
      --cloud-provider=hcloud
      --leader-elect=false
      --allow-untagged-cloud
      --allocate-node-cidrs=true
      --cluster-cidr=10.42.0.0/16
      --webhook-secure-port=0
      --secure-port=10288
    Args:
      --allow-untagged-cloud
      --cloud-provider=hcloud
      --route-reconciliation-period=30s
      --webhook-secure-port=0
      --allocate-node-cidrs=true
      --cluster-cidr=10.244.0.0/16
      --leader-elect=false

Running on a k3s cluster deployed with terraform-hcloud-kube-hetzner.

k8s version: 1.29.8


fatelgit commented Sep 10, 2024

+1, we are hitting the 100-route limit as well, which seems to be related to a large number of node scaling events.
Some log output:

I0910 07:31:35.984663       1 route_controller.go:216] action for Node "k3s-autoscaled-cx32-nbg1-43ef964b133bcc7a" with CIDR "10.42.98.0/24": "keep"
I0910 07:31:35.984710       1 route_controller.go:216] action for Node "k3s-autoscaled-cx32-nbg1-4cdf23e2d9958bc0" with CIDR "10.42.8.0/24": "keep"
I0910 07:31:35.984724       1 route_controller.go:216] action for Node "k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7" with CIDR "10.42.173.0/24": "add"
I0910 07:31:35.984732       1 route_controller.go:216] action for Node "k3s-autoscaled-cx32-nbg1-f976ab63fe71c24" with CIDR "10.42.96.0/24": "keep"
I0910 07:31:35.984740       1 route_controller.go:216] action for Node "k3s-agent-cx32-nbg1-xiq" with CIDR "10.42.4.0/24": "keep"
I0910 07:31:35.984747       1 route_controller.go:216] action for Node "k3s-autoscaled-cx32-nbg1-639592727b666e7a" with CIDR "10.42.1.0/24": "keep"
I0910 07:31:35.984753       1 route_controller.go:216] action for Node "k3s-control-plane-fsn1-cax21-aui" with CIDR "10.42.2.0/24": "keep"
I0910 07:31:35.984760       1 route_controller.go:216] action for Node "k3s-control-plane-fsn1-cax21-dec" with CIDR "10.42.3.0/24": "keep"
I0910 07:31:35.984766       1 route_controller.go:216] action for Node "k3s-control-plane-fsn1-cax21-glo" with CIDR "10.42.0.0/24": "keep"
I0910 07:31:35.984773       1 route_controller.go:216] action for Node "k3s-autoscaled-cx32-nbg1-27c56399330a466b" with CIDR "10.42.175.0/24": "add"
I0910 07:31:35.984841       1 route_controller.go:290] route spec to be created: &{ k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7 false [{InternalIP 10.255.0.3} {Hostname k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7} {ExternalIP 5.75.159.156}] 10.42.173.0/24 false}
I0910 07:31:35.984882       1 route_controller.go:290] route spec to be created: &{ k3s-autoscaled-cx32-nbg1-27c56399330a466b false [{InternalIP 10.255.0.4} {Hostname k3s-autoscaled-cx32-nbg1-27c56399330a466b} {ExternalIP 116.203.22.3}] 10.42.175.0/24 false}
I0910 07:31:35.984920       1 route_controller.go:304] Creating route for node k3s-autoscaled-cx32-nbg1-27c56399330a466b 10.42.175.0/24 with hint 37d3ec76-e77e-4059-bced-213b30b18df8, throttled 16.92µs
I0910 07:31:35.985003       1 route_controller.go:304] Creating route for node k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7 10.42.173.0/24 with hint fff69d03-79ca-4233-ac69-2494b227cb68, throttled 19.96µs
E0910 07:31:36.071533       1 route_controller.go:329] Could not create route fff69d03-79ca-4233-ac69-2494b227cb68 10.42.173.0/24 for node k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7: hcloud/CreateRoute: route limit reached (forbidden)
I0910 07:31:36.071640       1 event.go:389] "Event occurred" object="k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7" fieldPath="" kind="Node" apiVersion="" type="Warning" reason="FailedToCreateRoute" message="Could not create route fff69d03-79ca-4233-ac69-2494b227cb68 10.42.173.0/24 for node k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7 after 86.536869ms: hcloud/CreateRoute: route limit reached (forbidden)"
E0910 07:31:36.091213       1 route_controller.go:329] Could not create route 37d3ec76-e77e-4059-bced-213b30b18df8 10.42.175.0/24 for node k3s-autoscaled-cx32-nbg1-27c56399330a466b: hcloud/CreateRoute: route limit reached (forbidden)
I0910 07:31:36.091298       1 route_controller.go:387] Patching node status k3s-autoscaled-cx32-nbg1-27c56399330a466b with false previous condition was:&NodeCondition{Type:NetworkUnavailable,Status:False,LastHeartbeatTime:2024-09-10 07:31:06 +0000 UTC,LastTransitionTime:2024-09-10 07:31:06 +0000 UTC,Reason:CiliumIsUp,Message:Cilium is running on this node,}
I0910 07:31:36.091576       1 event.go:389] "Event occurred" object="k3s-autoscaled-cx32-nbg1-27c56399330a466b" fieldPath="" kind="Node" apiVersion="" type="Warning" reason="FailedToCreateRoute" message="Could not create route 37d3ec76-e77e-4059-bced-213b30b18df8 10.42.175.0/24 for node k3s-autoscaled-cx32-nbg1-27c56399330a466b after 106.265644ms: hcloud/CreateRoute: route limit reached (forbidden)"
I0910 07:31:36.091586       1 route_controller.go:387] Patching node status k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7 with false previous condition was:&NodeCondition{Type:NetworkUnavailable,Status:False,LastHeartbeatTime:2024-09-10 07:31:06 +0000 UTC,LastTransitionTime:2024-09-10 07:31:06 +0000 UTC,Reason:CiliumIsUp,Message:Cilium is running on this node,}

Is there a way to reset routes manually? Or a way to figure out which routes are really in use?
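
In the meantime, a rough way to find the stale ones is to diff the routes on the Hetzner network against the Pod CIDRs of the nodes that still exist, and remove the leftovers by hand. Untested sketch, assuming the hcloud CLI, kubectl and jq are available; "my-network" is a placeholder and the exact remove-route flags should be double-checked with hcloud network remove-route --help:

    # Routes currently configured on the Hetzner network
    hcloud network describe my-network -o json | jq -r '.routes[] | "\(.destination)\t\(.gateway)"'

    # Pod CIDRs of the nodes that still exist in the cluster
    kubectl get nodes -o jsonpath='{range .items[*]}{.spec.podCIDR}{"\n"}{end}'

    # Any destination from the first list that is missing from the second should be stale, e.g.:
    hcloud network remove-route my-network --destination 10.42.205.0/24 --gateway 10.255.0.4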

@fatelgit

I just deleted about 30 routes for a node with internal IP 10.255.0.4 that was removed hours ago. So I checked the logs for this node:

I0911 05:29:19.036441       1 route_controller.go:290] route spec to be created: &{ k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 false [{InternalIP 10.255.0.4} {Hostname k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8} {ExternalIP 116.203.22.3}] 10.42.205.0/24 false}
I0911 05:29:19.036504       1 route_controller.go:304] Creating route for node k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 10.42.205.0/24 with hint 8a63737f-26ce-4f88-9e75-1598c89f8c68, throttled 14.64µs
E0911 05:29:19.036554       1 route_controller.go:329] Could not create route 8a63737f-26ce-4f88-9e75-1598c89f8c68 10.42.205.0/24 for node k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8: hcloud/CreateRoute: hcops/AllServersCache.ByName: k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 hcops/AllServersCache.getCache: not found
I0911 05:29:19.036591       1 route_controller.go:387] Patching node status k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 with false previous condition was:&NodeCondition{Type:NetworkUnavailable,Status:False,LastHeartbeatTime:2024-09-11 04:07:14 +0000 UTC,LastTransitionTime:2024-09-11 04:07:14 +0000 UTC,Reason:CiliumIsUp,Message:Cilium is running on this node,}
I0911 05:29:19.036655       1 event.go:389] "Event occurred" object="k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8" fieldPath="" kind="Node" apiVersion="" type="Warning" reason="FailedToCreateRoute" message="Could not create route 8a63737f-26ce-4f88-9e75-1598c89f8c68 10.42.205.0/24 for node k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 after 44.32µs: hcloud/CreateRoute: hcops/AllServersCache.ByName: k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 hcops/AllServersCache.getCache: not found"
I0911 05:29:27.867206       1 event.go:389] "Event occurred" object="k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="DeletingNode" message="Deleting node k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 because it does not exist in the cloud provider"
I0911 05:29:27.924289       1 load_balancers.go:281] "update Load Balancer" op="hcloud/loadBalancers.UpdateLoadBalancer" service="traefik" nodes=["k3s-autoscaled-cx32-nbg1-f976ab63fe71c24","k3s-agent-cx32-nbg1-xiq","k3s-autoscaled-cx32-nbg1-6e1b0889ee0036f2","k3s-autoscaled-cx32-nbg1-43ef964b133bcc7a","k3s-autoscaled-cx32-nbg1-639592727b666e7a","k3s-autoscaled-cx32-nbg1-4cdf23e2d9958bc0"]
I0911 05:29:29.037883       1 load_balancer.go:850] "update service" op="hcops/LoadBalancerOps.ReconcileHCLBServices" port=80 loadBalancerID=1422790
I0911 05:29:30.793144       1 load_balancer.go:850] "update service" op="hcops/LoadBalancerOps.ReconcileHCLBServices" port=443 loadBalancerID=1422790
I0911 05:29:32.514155       1 event.go:389] "Event occurred" object="traefik/traefik" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="UpdatedLoadBalancer" message="Updated load balancer with new hosts"

This shows that there is a deletion event when the node is removed, but the routes are still present afterwards.


jooola commented Sep 11, 2024

@fatelgit @pat-s Please open a support ticket on the cloud console https://console.hetzner.cloud/support so we can fix this issue.

@apricote

As @jooola wrote, if you open an actual support ticket we can use our internal support panels to gain more insights into your projects and see what is happening on our side.


I do think I found the bug without any additional info though.

In the configuration @pat-s posted, you can see that the flag --cluster-cidr= is specified twice: once in the command (10.42.0.0/16) and once in the args (10.244.0.0/16). From a quick local test, it seems that the last flag wins, so the one in args takes effect.

Based on the logs @fatelgit posted, it seems like the cluster is configured to assign Node Pod CIDRs in the 10.42.0.0/16 range, which we then use to create routes.

But HCCM only removes routes from the range specified in the --cluster-cidr flag, which is 10.244.0.0/16.

This mismatch means the previously created routes are never cleaned up. You should change your hcloud-cloud-controller-manager configuration so that --cluster-cidr is set only once, with the value that matches your cluster setup.
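
For reference, with a cluster that assigns Pod CIDRs from 10.42.0.0/16 the container should end up with --cluster-cidr exactly once and matching that range, roughly like this (a sketch based on the flags posted above, not a complete manifest):

    command:
      - /bin/hcloud-cloud-controller-manager
    args:
      - --allow-untagged-cloud
      - --cloud-provider=hcloud
      - --leader-elect=false
      - --allocate-node-cidrs=true
      - --route-reconciliation-period=30s
      - --webhook-secure-port=0
      - --cluster-cidr=10.42.0.0/16   # must match the range the cluster actually assigns Pod CIDRs from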

If I find the time today, I will open an issue with kube-hetzner to explain the problem. But feel free to open one yourself if you are quicker than me or I don't get to it today.

apricote self-assigned this Sep 11, 2024


pat-s commented Sep 11, 2024

@apricote Opened a support ticket.

I found 10.244.0.0/16 being hardcoded in:

    - "--cluster-cidr=10.244.0.0/16"

My cluster CIDR is in fact 10.42.0.0/16, and if it gets overwritten by 10.244.0.0/16, then I understand why the removal is not working.

It looks like the config sent by kube-hetzner ends up being passed entirely as Command:, whereas it should likely be split into Command and Args so that it overrides the defaults of HCCM?
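
For context (my understanding of the Kubernetes side, not verified against the template): command replaces the image entrypoint and args is appended after it, so the binary currently sees --cluster-cidr twice and, per the local test above, the value from args wins. Roughly:

    containers:
      - name: hcloud-cloud-controller-manager
        image: hetznercloud/hcloud-cloud-controller-manager:v1.20.0   # illustrative tag
        # command replaces the image entrypoint; args is appended after it,
        # so the process sees both --cluster-cidr values and the last one (from args) wins
        command:
          - /bin/hcloud-cloud-controller-manager
          - --cluster-cidr=10.42.0.0/16
        args:
          - --cluster-cidr=10.244.0.0/16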


pat-s commented Sep 11, 2024

It seems like this change here might be responsible: 2ba4058

Maybe updating https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/templates/ccm.yaml.tpl to align with the recent changes would already fix it?

@apricote

I opened kube-hetzner/terraform-hcloud-kube-hetzner#1477

In general, this is not an issue with hcloud-cloud-controller-manager but rather a misconfiguration by the user (through kube-hetzner). Hetzner does not provide official support for this.

apricote removed the bug label Sep 11, 2024