Merge pull request #527 from vitobotta/masters-different-locations
Masters in different locations
Showing 20 changed files with 276 additions and 72 deletions.
@@ -0,0 +1,63 @@
# Masters in Different Locations

You can set up a regional cluster for maximum availability by placing each master in a different European location: the first master in Falkenstein, the second in Helsinki, and the third in Nuremberg (listed in alphabetical order by location code). This setup is only possible in network zones with multiple locations; currently the only such zone is `eu-central`, which includes these three European locations. For other regions, only zonal clusters are supported. Regional clusters are also limited to 3 masters, because only these three locations are available.

To create a regional cluster, simply set the `instance_count` for the masters pool to 3 and specify the `locations` setting as `fsn1`, `hel1`, and `nbg1`.
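
For reference, a minimal masters pool definition might look like the sketch below. The `masters_pool` key and the instance type are assumptions based on the v2 config format, so adapt them to your own config file:

```yaml
masters_pool:
  instance_type: cpx31   # example type; use whatever your cluster uses
  instance_count: 3      # one master per location
  locations:
    - fsn1               # Falkenstein
    - hel1               # Helsinki
    - nbg1               # Nuremberg
```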

## Converting a Single Master or Zonal Cluster to a Regional One

If you already have a cluster with a single master or three masters in the same European location, converting it to a regional cluster is straightforward. Just follow these steps carefully and be patient. Note that this requires hetzner-k3s version 2.2.3 or higher.

Before you begin, make sure to back up all your applications and data! This is crucial. While the migration process is relatively simple, there is always some level of risk involved.

- [ ] Set the `instance_count` for the masters pool to 3 if your cluster currently has only one master.
- [ ] Update the `locations` setting for the masters pool to include `fsn1`, `hel1`, and `nbg1` like this:

```yaml
locations:
  - fsn1
  - hel1
  - nbg1
```
The locations are always processed in alphabetical order, regardless of how you list them in the `locations` property. This ensures consistency, especially when replacing a master due to node failure or other issues.

- [ ] If your cluster currently has a single master, run the `create` command with the updated configuration. This will create `master2` in Helsinki and `master3` in Nuremberg. Wait for the operation to complete and confirm that all three masters are in a ready state.
- [ ] If `master1` is not in Falkenstein (fsn1):
  - Drain `master1`.
  - Delete `master1` using the command `kubectl delete node {cluster-name}-master1`.
  - Remove the `master1` instance via the Hetzner Console or the `hcloud` utility (see: https://github.com/hetznercloud/cli), e.g. with `hcloud server delete {cluster-name}-master1`.
  - Run the `create` command again. This will recreate `master1` in Falkenstein.
  - SSH into each master and run the following commands to ensure `master1` has joined the cluster correctly:

```bash
# Install the etcd CLI (Ubuntu/Debian)
sudo apt-get update
sudo apt-get install etcd-client

# Point etcdctl at the embedded etcd run by k3s on this node
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt
export ETCDCTL_CERT=/var/lib/rancher/k3s/server/tls/etcd/server-client.crt
export ETCDCTL_KEY=/var/lib/rancher/k3s/server/tls/etcd/server-client.key

# List cluster members; all three masters should show as "started"
etcdctl member list
```

The last command should display something like this if everything is working properly:

```
285ab4b980c2c8c, started, test-master2-d25722af, https://10.0.0.3:2380, https://10.0.0.3:2379, false
aad3fac89b68bfb7, started, test-master1-5e550de0, https://10.0.0.4:2380, https://10.0.0.4:2379, false
c11852e25aef34e8, started, test-master3-0ed051a3, https://10.0.0.2:2380, https://10.0.0.2:2379, false
```

- [ ] If `master2` is not in Helsinki, follow the same steps as with `master1` but for `master2`. This will recreate `master2` in Helsinki.
- [ ] If `master3` is not in Nuremberg, repeat the process for `master3`. This will recreate `master3` in Nuremberg.

That’s it! You now have a regional cluster, which ensures continued operation even if one of the Hetzner locations experiences a temporary failure. I also recommend setting `create_load_balancer_for_the_kubernetes_api` to `true` if you don’t already have a load balancer for the Kubernetes API.
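
For reference, this is a single top-level setting in the config file (a sketch; exact placement depends on the rest of your config):

```yaml
create_load_balancer_for_the_kubernetes_api: true
```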

## Performance Considerations

This feature has been frequently requested, but I delayed implementing it until I could thoroughly test the configuration. I was concerned about latency issues, as etcd is sensitive to delays, and I wanted to ensure that the latency between the German locations and Helsinki wouldn’t cause problems.

It turns out that the default heartbeat interval for etcd is 100ms, and the latency between Helsinki and Falkenstein/Nuremberg is only 25-27ms. This means the total round-trip time (RTT) for the Raft consensus is around 60-70ms, which is well within etcd’s acceptable limits. After running benchmarks, everything works smoothly, so there’s no need to adjust the etcd configuration for this setup.
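
If you want to sanity-check this on your own cluster, `etcdctl` ships with a small built-in benchmark. A minimal sketch, run on one of the masters and reusing the `ETCDCTL_*` variables exported in the verification step above; the private IPs are examples, so use your masters' actual addresses:

```bash
# Short load test against the embedded etcd; reports whether throughput
# and latency pass etcd's own thresholds
etcdctl check perf

# Raw round-trip latency to the other masters
ping -c 5 10.0.0.2
ping -c 5 10.0.0.3
```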
@@ -0,0 +1,88 @@
# Upgrading a cluster created with hetzner-k3s v1.x to v2.x

The v1 version of hetzner-k3s is quite old and hasn't been supported for a while, but I know that some people haven't upgraded to v2 because, until now, there wasn't a straightforward process for doing so.

This migration is now possible and straightforward, provided you follow these instructions carefully and are patient. The migration also allows you to replace deprecated instance types (series `CX`) with new instance types. This migration requires hetzner-k3s v2.2.3 or higher.

## Prerequisites

- [ ] I recommend installing the [hcloud utility](https://github.com/hetznercloud/cli) to delete old masters more easily and quickly

## Upgrading configuration and first steps

- [ ] ==Backup apps and data== - like with all migrations, there is some risk involved, so be prepared in case something doesn't go according to plan
- [ ] ==Backup kubeconfig and old config file==
- [ ] Uninstall the System Upgrade Controller
- [ ] Create a resolv file on the existing nodes, either manually or automated with the `hcloud` CLI:
```bash
# Write a nameserver entry on each node, then print it back to verify
hcloud server list | awk '{print $4}' | tail -n +2 | while read ip; do
  echo "Setting DNS for ${ip}"
  ssh -n root@${ip} "echo nameserver 8.8.8.8 | tee /etc/k8s-resolv.conf"
  ssh -n root@${ip} "cat /etc/k8s-resolv.conf"
done
```
- [ ] Convert the config file to the new format: https://github.com/vitobotta/hetzner-k3s/releases/tag/v2.0.0
- [ ] Comment out or remove empty node pools from the config file
- [ ] Set `embedded_registry_mirror: enabled: false` if needed, depending on the current version of k3s (https://docs.k3s.io/installation/registry-mirror)
- [ ] Add `legacy_instance_type` to ==ALL== node pools, both masters and workers, set to the current instance type (regardless of whether it's deprecated or not). ==This is crucial for the migration== - see the sketch after this list
- [ ] Run the `create` command ==with the latest hetzner-k3s using the new config file==
- [ ] Wait for all CSI pods in `kube-system` to restart, ==ensure everything is running==
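
As a sketch, a worker pool during the migration might look like this; the pool name and instance types are hypothetical examples, and the surrounding structure is assumed from the v2 config format:

```yaml
worker_node_pools:
  - name: small
    legacy_instance_type: cx21   # type the existing nodes were created with
    instance_type: cx22          # new type to rotate the nodes onto
    instance_count: 3
    location: fsn1
```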

## Rotating control plane instances with the new instance type

One master at a time (==switch your kubeconfig context to another master before rotating master1==, unless your cluster has a load balancer for the Kubernetes API):

- [ ] Drain and delete the master, both with kubectl and from the Hetzner console (or using the `hcloud` CLI), so that the actual instance is deleted as well
- [ ] Rerun the `create` command to recreate the master with the new instance type; wait for it to join the control plane and reach "ready" status
- [ ] SSH into each master and verify that the etcd members have been updated correctly and are in sync:
```bash
sudo apt-get update
sudo apt-get install etcd-client

export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt
export ETCDCTL_CERT=/var/lib/rancher/k3s/server/tls/etcd/server-client.crt
export ETCDCTL_KEY=/var/lib/rancher/k3s/server/tls/etcd/server-client.key

etcdctl member list
```

Repeat the process for each master carefully. After the three masters have been replaced:

- [ ] Rerun the `create` command once or twice to ensure the config is stable and the masters no longer get restarted
- [ ] [Debug DNS resolution](https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/). If there are issues with it, restart the agents with the command below, then restart CoreDNS:
```bash
hcloud server list | grep worker | awk '{print $4}' | while read ip; do
  echo "${ip}"
  ssh -n root@${ip} "systemctl restart k3s-agent"
  sleep 10
done
```
- [ ] Address any issues with your workloads before proceeding with the rotation of the worker nodes

## Rotating a worker node pool

- [ ] Increase the `instance_count` for the pool by 1
- [ ] Run the `create` command to create the extra node required during the pool rotation

One worker node at a time (apart from the last one you just added):

- [ ] Drain a node
- [ ] Delete the drained node both with kubectl and from the Hetzner console (or using the `hcloud` CLI) - see the sketch after this list
- [ ] Rerun the `create` command to recreate the deleted node
- [ ] Verify that everything works as expected before proceeding with the next node in the pool
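
For a single node, the drain/delete steps might look like this sketch; the node name is hypothetical, so adjust it to your cluster and pool naming:

```bash
NODE=mycluster-pool-small-worker1   # hypothetical node name

# Evict workloads, then remove both the Kubernetes node object
# and the actual Hetzner instance
kubectl drain "${NODE}" --ignore-daemonsets --delete-emptydir-data
kubectl delete node "${NODE}"
hcloud server delete "${NODE}"
```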

Once all the existing nodes have been rotated:

- [ ] Drain the very last node in the pool, the one added earlier
- [ ] Verify that all looks good
- [ ] Delete that node both with kubectl and from the Hetzner console (or using the `hcloud` CLI)
- [ ] Decrease the `instance_count` for the node pool by 1
- [ ] Proceed with the next pool

## Finalizing

- [ ] Remove the `legacy_instance_type` setting from both master and worker node pools
- [ ] Rerun the `create` command once again to double check
- [ ] Optionally, convert the now zonal cluster to a regional one with masters in different locations (see the "Masters in Different Locations" guide above)
@@ -0,0 +1,5 @@
require "./node_pool"

# Master pools default to a single location (Falkenstein); regional
# clusters set multiple locations, e.g. fsn1, hel1 and nbg1.
class Configuration::MasterNodePool < Configuration::NodePool
  property locations : Array(String) = ["fsn1"] of String
end