Routes are not cleaned after scale down/node removal via cluster-autoscaler #734

pat-s opened this issue Sep 2, 2024 · 10 comments

pat-s commented Sep 2, 2024

TL;DR

Routes created for nodes are not cleaned up after the nodes are removed by cluster-autoscaler during scale down (see title).

Expected behavior

Routes are removed once the corresponding node is deleted.

Observed behavior

Routes are not removed and accumulate in the account, leading to node startup failures once the route limit (100?) is reached.

Minimal working example

No response

Log output

No response

Additional information

Initially posted in kubernetes/autoscaler#7227

pat-s added the bug label Sep 2, 2024

apricote commented Sep 3, 2024

Hey @pat-s,

could you post the logs of hcloud-cloud-controller-manager and its configuration? The networking part of the configuration is of particular interest.


pat-s commented Sep 3, 2024

v1.20.0 running with

    Command:
      /bin/hcloud-cloud-controller-manager
      --cloud-provider=hcloud
      --leader-elect=false
      --allow-untagged-cloud
      --allocate-node-cidrs=true
      --cluster-cidr=10.42.0.0/16
      --webhook-secure-port=0
      --secure-port=10288
    Args:
      --allow-untagged-cloud
      --cloud-provider=hcloud
      --route-reconciliation-period=30s
      --webhook-secure-port=0
      --allocate-node-cidrs=true
      --cluster-cidr=10.244.0.0/16
      --leader-elect=false

Running on a k3s cluster deployed with terraform-hcloud-kube-hetzner.

k8s version: 1.29.8


fatelgit commented Sep 10, 2024

+1, we are hitting the 100-route limit as well, which seems to be related to a large number of node scaling events.
Some log output:

I0910 07:31:35.984663       1 route_controller.go:216] action for Node "k3s-autoscaled-cx32-nbg1-43ef964b133bcc7a" with CIDR "10.42.98.0/24": "keep"
I0910 07:31:35.984710       1 route_controller.go:216] action for Node "k3s-autoscaled-cx32-nbg1-4cdf23e2d9958bc0" with CIDR "10.42.8.0/24": "keep"
I0910 07:31:35.984724       1 route_controller.go:216] action for Node "k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7" with CIDR "10.42.173.0/24": "add"
I0910 07:31:35.984732       1 route_controller.go:216] action for Node "k3s-autoscaled-cx32-nbg1-f976ab63fe71c24" with CIDR "10.42.96.0/24": "keep"
I0910 07:31:35.984740       1 route_controller.go:216] action for Node "k3s-agent-cx32-nbg1-xiq" with CIDR "10.42.4.0/24": "keep"
I0910 07:31:35.984747       1 route_controller.go:216] action for Node "k3s-autoscaled-cx32-nbg1-639592727b666e7a" with CIDR "10.42.1.0/24": "keep"
I0910 07:31:35.984753       1 route_controller.go:216] action for Node "k3s-control-plane-fsn1-cax21-aui" with CIDR "10.42.2.0/24": "keep"
I0910 07:31:35.984760       1 route_controller.go:216] action for Node "k3s-control-plane-fsn1-cax21-dec" with CIDR "10.42.3.0/24": "keep"
I0910 07:31:35.984766       1 route_controller.go:216] action for Node "k3s-control-plane-fsn1-cax21-glo" with CIDR "10.42.0.0/24": "keep"
I0910 07:31:35.984773       1 route_controller.go:216] action for Node "k3s-autoscaled-cx32-nbg1-27c56399330a466b" with CIDR "10.42.175.0/24": "add"
I0910 07:31:35.984841       1 route_controller.go:290] route spec to be created: &{ k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7 false [{InternalIP 10.255.0.3} {Hostname k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7} {ExternalIP 5.75.159.156}] 10.42.173.0/24 false}
I0910 07:31:35.984882       1 route_controller.go:290] route spec to be created: &{ k3s-autoscaled-cx32-nbg1-27c56399330a466b false [{InternalIP 10.255.0.4} {Hostname k3s-autoscaled-cx32-nbg1-27c56399330a466b} {ExternalIP 116.203.22.3}] 10.42.175.0/24 false}
I0910 07:31:35.984920       1 route_controller.go:304] Creating route for node k3s-autoscaled-cx32-nbg1-27c56399330a466b 10.42.175.0/24 with hint 37d3ec76-e77e-4059-bced-213b30b18df8, throttled 16.92µs
I0910 07:31:35.985003       1 route_controller.go:304] Creating route for node k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7 10.42.173.0/24 with hint fff69d03-79ca-4233-ac69-2494b227cb68, throttled 19.96µs
E0910 07:31:36.071533       1 route_controller.go:329] Could not create route fff69d03-79ca-4233-ac69-2494b227cb68 10.42.173.0/24 for node k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7: hcloud/CreateRoute: route limit reached (forbidden)
I0910 07:31:36.071640       1 event.go:389] "Event occurred" object="k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7" fieldPath="" kind="Node" apiVersion="" type="Warning" reason="FailedToCreateRoute" message="Could not create route fff69d03-79ca-4233-ac69-2494b227cb68 10.42.173.0/24 for node k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7 after 86.536869ms: hcloud/CreateRoute: route limit reached (forbidden)"
E0910 07:31:36.091213       1 route_controller.go:329] Could not create route 37d3ec76-e77e-4059-bced-213b30b18df8 10.42.175.0/24 for node k3s-autoscaled-cx32-nbg1-27c56399330a466b: hcloud/CreateRoute: route limit reached (forbidden)
I0910 07:31:36.091298       1 route_controller.go:387] Patching node status k3s-autoscaled-cx32-nbg1-27c56399330a466b with false previous condition was:&NodeCondition{Type:NetworkUnavailable,Status:False,LastHeartbeatTime:2024-09-10 07:31:06 +0000 UTC,LastTransitionTime:2024-09-10 07:31:06 +0000 UTC,Reason:CiliumIsUp,Message:Cilium is running on this node,}
I0910 07:31:36.091576       1 event.go:389] "Event occurred" object="k3s-autoscaled-cx32-nbg1-27c56399330a466b" fieldPath="" kind="Node" apiVersion="" type="Warning" reason="FailedToCreateRoute" message="Could not create route 37d3ec76-e77e-4059-bced-213b30b18df8 10.42.175.0/24 for node k3s-autoscaled-cx32-nbg1-27c56399330a466b after 106.265644ms: hcloud/CreateRoute: route limit reached (forbidden)"
I0910 07:31:36.091586       1 route_controller.go:387] Patching node status k3s-autoscaled-cx32-nbg1-5797432f74cfeeb7 with false previous condition was:&NodeCondition{Type:NetworkUnavailable,Status:False,LastHeartbeatTime:2024-09-10 07:31:06 +0000 UTC,LastTransitionTime:2024-09-10 07:31:06 +0000 UTC,Reason:CiliumIsUp,Message:Cilium is running on this node,}

Is there a way to reset routes manually? Or a way to figure out which routes are really in use?
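
In the meantime, a rough way to find the stale ones is to diff the routes on the Hetzner network against the Pod CIDRs of the nodes that still exist, and remove the leftovers by hand. Untested sketch, assuming the hcloud CLI, kubectl and jq are available; "my-network" is a placeholder and the exact remove-route flags should be double-checked with hcloud network remove-route --help:

    # Routes currently configured on the Hetzner network
    hcloud network describe my-network -o json | jq -r '.routes[] | "\(.destination)\t\(.gateway)"'

    # Pod CIDRs of the nodes that still exist in the cluster
    kubectl get nodes -o jsonpath='{range .items[*]}{.spec.podCIDR}{"\n"}{end}'

    # Any destination from the first list that is missing from the second should be stale, e.g.:
    hcloud network remove-route my-network --destination 10.42.205.0/24 --gateway 10.255.0.4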

@fatelgit

I just deleted about 30 routes for a node with internal IP 10.255.0.4 that was removed hours ago. So I checked the logs for this node:

I0911 05:29:19.036441       1 route_controller.go:290] route spec to be created: &{ k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 false [{InternalIP 10.255.0.4} {Hostname k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8} {ExternalIP 116.203.22.3}] 10.42.205.0/24 false}
I0911 05:29:19.036504       1 route_controller.go:304] Creating route for node k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 10.42.205.0/24 with hint 8a63737f-26ce-4f88-9e75-1598c89f8c68, throttled 14.64µs
E0911 05:29:19.036554       1 route_controller.go:329] Could not create route 8a63737f-26ce-4f88-9e75-1598c89f8c68 10.42.205.0/24 for node k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8: hcloud/CreateRoute: hcops/AllServersCache.ByName: k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 hcops/AllServersCache.getCache: not found
I0911 05:29:19.036591       1 route_controller.go:387] Patching node status k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 with false previous condition was:&NodeCondition{Type:NetworkUnavailable,Status:False,LastHeartbeatTime:2024-09-11 04:07:14 +0000 UTC,LastTransitionTime:2024-09-11 04:07:14 +0000 UTC,Reason:CiliumIsUp,Message:Cilium is running on this node,}
I0911 05:29:19.036655       1 event.go:389] "Event occurred" object="k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8" fieldPath="" kind="Node" apiVersion="" type="Warning" reason="FailedToCreateRoute" message="Could not create route 8a63737f-26ce-4f88-9e75-1598c89f8c68 10.42.205.0/24 for node k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 after 44.32µs: hcloud/CreateRoute: hcops/AllServersCache.ByName: k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 hcops/AllServersCache.getCache: not found"
I0911 05:29:27.867206       1 event.go:389] "Event occurred" object="k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="DeletingNode" message="Deleting node k3s-autoscaled-cx32-nbg1-26bc4cf733a14b8 because it does not exist in the cloud provider"
I0911 05:29:27.924289       1 load_balancers.go:281] "update Load Balancer" op="hcloud/loadBalancers.UpdateLoadBalancer" service="traefik" nodes=["k3s-autoscaled-cx32-nbg1-f976ab63fe71c24","k3s-agent-cx32-nbg1-xiq","k3s-autoscaled-cx32-nbg1-6e1b0889ee0036f2","k3s-autoscaled-cx32-nbg1-43ef964b133bcc7a","k3s-autoscaled-cx32-nbg1-639592727b666e7a","k3s-autoscaled-cx32-nbg1-4cdf23e2d9958bc0"]
I0911 05:29:29.037883       1 load_balancer.go:850] "update service" op="hcops/LoadBalancerOps.ReconcileHCLBServices" port=80 loadBalancerID=1422790
I0911 05:29:30.793144       1 load_balancer.go:850] "update service" op="hcops/LoadBalancerOps.ReconcileHCLBServices" port=443 loadBalancerID=1422790
I0911 05:29:32.514155       1 event.go:389] "Event occurred" object="traefik/traefik" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="UpdatedLoadBalancer" message="Updated load balancer with new hosts"

This shows that there is a deletion event when the node is removed, but the routes are still present afterwards.


jooola commented Sep 11, 2024

@fatelgit @pat-s Please open a support ticket on the cloud console https://console.hetzner.cloud/support so we can fix this issue.

@apricote

As @jooola wrote, if you open an actual support ticket we can use our internal support panels to gain more insights into your projects and see what is happening on our side.


I do think I found the bug without any additional info though.

In the configuration @pat-s posted, you can see that the flag --cluster-cidr= is specified twice: once in the command (10.42.0.0/16) and once in the args (10.244.0.0/16). From a quick local test, it seems that the last flag wins, so the one in args takes effect.

Based on the logs @fatelgit posted, it seems like the cluster is configured to assign Node Pod CIDRs in the 10.42.0.0/16 range, which we then use to create routes.

But HCCM only removes routes from the range specified in the --cluster-cidr flag, which is 10.244.0.0/16.

This mismatch means the previously created routes are never cleaned up. You should change your hcloud-cloud-controller-manager configuration so that --cluster-cidr is set only once, with the value that matches your cluster setup.
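
For reference, with a cluster that assigns Pod CIDRs from 10.42.0.0/16 the container should end up with --cluster-cidr exactly once and matching that range, roughly like this (a sketch based on the flags posted above, not a complete manifest):

    command:
      - /bin/hcloud-cloud-controller-manager
    args:
      - --allow-untagged-cloud
      - --cloud-provider=hcloud
      - --leader-elect=false
      - --allocate-node-cidrs=true
      - --route-reconciliation-period=30s
      - --webhook-secure-port=0
      - --cluster-cidr=10.42.0.0/16   # must match the range the cluster actually assigns Pod CIDRs from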

If I find the time today, I will open an issue with kube-hetzner to explain the problem. But feel free to open one yourself if you are quicker than me or I don't get to it today.

apricote self-assigned this Sep 11, 2024


pat-s commented Sep 11, 2024

@apricote Opened a support ticket.

I found 10.244.0.0/16 being hardcoded in:

    - "--cluster-cidr=10.244.0.0/16"

My cluster CIDR is in fact 10.42.0.0/16, and if it gets overwritten by 10.244.0.0/16, then I understand why the removal is not working.

It looks like the config sent by kube-hetzner ends up being passed entirely as Command:, whereas it should likely be split into Command and Args so that it overrides the defaults of HCCM?
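
For context (my understanding of the Kubernetes side, not verified against the template): command replaces the image entrypoint and args is appended after it, so the binary currently sees --cluster-cidr twice and, per the local test above, the value from args wins. Roughly:

    containers:
      - name: hcloud-cloud-controller-manager
        image: hetznercloud/hcloud-cloud-controller-manager:v1.20.0   # illustrative tag
        # command replaces the image entrypoint; args is appended after it,
        # so the process sees both --cluster-cidr values and the last one (from args) wins
        command:
          - /bin/hcloud-cloud-controller-manager
          - --cluster-cidr=10.42.0.0/16
        args:
          - --cluster-cidr=10.244.0.0/16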


pat-s commented Sep 11, 2024

It seems like this change here might be responsible: 2ba4058

Maybe updating https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/templates/ccm.yaml.tpl to align with the recent changes would already fix it?

@apricote

I opened kube-hetzner/terraform-hcloud-kube-hetzner#1477

In general, this is not an issue with hcloud-cloud-controller-manager but rather a misconfiguration by the user (through kube-hetzner). Hetzner does not provide official support for this.

apricote removed the bug label Sep 11, 2024