Commit 745e2d7

Merge pull request #2 from janeczku/dev

janeczku authored Oct 14, 2020
2 parents 648cd4b + 909f296
Showing 8 changed files with 356 additions and 103 deletions.
6 changes: 3 additions & 3 deletions chart/Chart.yaml
@@ -1,9 +1,9 @@
apiVersion: v1
description: Manage a VRRP-based VIP for Kubernetes Ingress Controller
description: VRRP-based failover VIP for Kubernetes Ingress Controllers and API servers
icon: https://raw.githubusercontent.com/janeczku/keepalived-ingress-vip/master/chart/icon.png
name: keepalived-ingress-vip
version: v0.1.3
appVersion: v0.1.3
version: v0.1.4
appVersion: v0.1.4
home: https://www.github.com/janeczku/keepalived-ingress-vip
sources:
- https://www.github.com/janeczku/keepalived-ingress-vip
165 changes: 152 additions & 13 deletions chart/README.md
@@ -2,15 +2,17 @@

![Hero Banner](https://raw.githubusercontent.com/janeczku/keepalived-ingress-vip/master/img/banner-top.png)

This is a HA/IP failover solution for Kubernetes Ingress Controllers, such as [NGINX Ingress](https://kubernetes.github.io/ingress-nginx/) and mainly useful for on-premises environments that lack infrastructure load balancing capabilities.
This is a lightweight HA/IP failover solution that provides floating IP addresses (VIPs) for external access to public-facing services running on Kubernetes nodes, such as Ingress controllers or the Kubernetes API server.

It provides a virtual IP for terminating ingress traffic on the cluster and provides sub-second failover in scenarios such as node crashing, node being drained during cluster upgrades or node becoming network partitioned from Kubernetes control plane.
It's especially suited for situations where Kubernetes clusters are deployed on infrastructure that lacks (managed) load balancers, such as on-premises data centers or the edge.

The Virtual IP is managed using the [Virtual Router Redundancy Protocol (VRRP) implementation of Keepalived](https://keepalived.readthedocs.io/en/latest/case_study_failover.html) and - instead of relying on Kubernetes API state (which would be to slow for this purpose) - the readiness of the ingress nodes is determined by running probes against local HTTP health check endpoints (Ingress, Kubelet).
The solution is deployed as a Helm application and provides sub-second L2 failover for typical failure scenarios, such as nodes crashing or worker nodes becoming network partitioned from the Kubernetes control plane.

The Virtual IP is managed using the [Virtual Router Redundancy Protocol (VRRP) implementation of Keepalived](https://keepalived.readthedocs.io/en/latest/case_study_failover.html) and - instead of relying on Kubernetes API state (which would be too slow for this purpose) - the eligibility of a node to host the VIP is determined by running probes against local and remote HTTP health check endpoints (e.g. NGINX Ingress Controller, Kubelet, K8s API server).
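
As an illustration, the endpoints that the check scripts poll can be probed manually from a node (assuming the chart's default NGINX Ingress and Kubelet health ports; adjust to your environment):

```bash
# A 200 response keeps the node eligible to hold the VIP
curl -s -o /dev/null -w 'ingress: %{http_code}\n' http://127.0.0.1:10254/healthz
curl -s -o /dev/null -w 'kubelet: %{http_code}\n' http://127.0.0.1:10248/healthz
```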

## Prerequisites

- Network infrastructure must permit the use of VRRP protocol and multicast traffic (which excludes most public clouds)
- Network infrastructure must permit the use of the VRRP protocol and multicast traffic (which rules out most public clouds); a quick way to verify this is shown below the list
- Tested on a vSphere 6.7u3 environment with stock vSwitch and port group setup. Other non-cloud infrastructure providers should work.
- The Chart included in the repo requires Helm >= v3.1
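
To check whether the underlying network actually passes VRRP traffic, watch for advertisements on the VRRP interface. A sketch, assuming the `ens160` interface used elsewhere in this README and at least one Keepalived instance already running:

```bash
# VRRP advertisements are IP protocol 112 sent to the multicast group
# 224.0.0.18; with a healthy MASTER they should appear roughly every second
sudo tcpdump -ni ens160 'ip proto 112 and host 224.0.0.18'
```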

@@ -26,10 +28,10 @@ Afterwards, you can launch the chart from the System project's __Apps__ page pro

### Installing the Chart using Helm CLI

To run the application using the virtual IP `172.16.135.2/21` on the host interface named `ens160`:
When installing with the Helm CLI, specify the required configuration options using `--set <variable>=<value>`.

```bash
$ helm install -n vip --name keepalived-ingress-vip ./chart \
$ helm install keepalived-ingress-vip ./chart -n vip \
--set keepalived.vrrpInterfaceName=ens160 \
--set keepalived.vipInterfaceName=ens160 \
--set keepalived.vipAddressCidr="172.16.135.2/21"
@@ -43,36 +45,173 @@ To uninstall the `keepalived-ingress-vip` deployment:
$ helm delete -n vip keepalived-ingress-vip
```

### Configuration
### Example Configurations

#### Provision a VIP as a highly available endpoint for cluster ingress (e.g. NGINX Ingress Controller)

Example Helm values.yaml:

```yaml
keepalived:
  # Interface used for the VRRP protocol
  vrrpInterfaceName: eth9
  # Interface to attach the VIP to
  vipInterfaceName: eth0
  # The floating IP address in CIDR format
  vipAddressCidr: "172.16.135.2/21"

  # NGINX Ingress Controller health check endpoint
  checkServiceUrl: http://127.0.0.1:10254/healthz
  # If the Kubelet is down, the node will be marked as failed and the VIP
  # moved to a healthy node immediately
  checkKubelet: true
  # If the Kubernetes API can't be reached from the node, the node's
  # priority for hosting the VIP will be reduced
  checkKubeApi: true
  # Optional: tolerate an unhealthy Kubelet for up to 30 seconds
  # (e.g. to prevent VIP flapping during a planned K8s upgrade)
  checkKubeletInterval: 3
  checkKubeletFailAfter: 10

# Optional: if the ingress controller is running on designated nodes only,
# make sure the VIP is scheduled to the same set of nodes
pod:
  nodeSelector:
    nodeRole: ingress
```
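
After installing with values like these, you can check which node currently holds the VIP. A minimal sketch, assuming the interface and address from the example above:

```bash
# Run on each candidate node: the current VRRP MASTER has the VIP attached
ip -4 addr show dev eth0 | grep 172.16.135.2

# From anywhere on the subnet, the VIP should answer and serve ingress traffic
ping -c 3 172.16.135.2
curl -sI http://172.16.135.2/ | head -n 1
```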
#### Provision a VIP as a highly available K8s API endpoint for a multi-master cluster
Example Helm values.yaml:
```yaml
keepalived:
  # Interface used for the VRRP protocol
  vrrpInterfaceName: eth0
  # Interface to attach the VIP to
  vipInterfaceName: eth0
  # The floating IP address in CIDR format
  vipAddressCidr: "172.16.135.2/21"

  # Health check the local K8s API service (URL might vary depending on the k8s distro)
  checkServiceUrl: http://127.0.0.1:6443/healthz
  checkKubelet: false
  checkKubeApi: false

# A Daemonset is used because we always want a Keepalived instance on every master node
kind: Daemonset

pod:
  # Ensure that the VIP is only scheduled on master nodes
  nodeSelector:
    node-role.kubernetes.io/controlplane: "true"
  # Tolerate master taints
  tolerateMasterTaints: true
```
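
To verify failover for this setup, check the API endpoint through the VIP and watch it move when the current holder goes down. A hedged sketch, reusing the VIP from the values above (`/healthz` may require authentication on some distributions):

```bash
# The VIP should answer for whichever master currently holds it
curl -ks https://172.16.135.2:6443/healthz

# Keep a ping running while shutting down the VIP holder; Keepalived
# should re-home the address to another master within about a second
ping 172.16.135.2
```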
#### Provision a VIP as a highly available API endpoint for k3s clusters
You can package a Helm resource file with k3s that will automatically attach a floating IP to a healthy master during cluster bootstrapping.
Create the file `/var/lib/rancher/k3s/server/manifests/keepalived-api-vip.yaml` on the k3s server host:

```yaml
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: keepalived-ingress-vip
  namespace: kube-system
spec:
  chart: keepalived-ingress-vip
  version: 0.1.4
  repo: https://janeczku.github.io/helm-charts/
  targetNamespace: keepalived
  valuesContent: |-
    keepalived:
      # Interface used for the VRRP protocol
      vrrpInterfaceName: ens160
      # Interface to attach the VIP to
      vipInterfaceName: ens160
      # The floating IP address in CIDR format
      vipAddressCidr: "172.16.135.2/21"
      # Health check the local K3s API endpoint
      checkServiceUrl: http://127.0.0.1:6443/healthz
      checkKubelet: false
      checkKubeApi: false
    # A Daemonset is used because we always want a Keepalived instance on every master node
    kind: Daemonset
    pod:
      # Schedule the VIP only to master nodes
      nodeSelector:
        node-role.kubernetes.io/controlplane: "true"
      # Tolerate master taints
      tolerateMasterTaints: true
```
Once the k3s cluster is bootstrapped, you can point your Kubernetes client to `https://VIP:6443`.
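
For example, on one of the k3s servers you can derive a client kubeconfig that targets the VIP instead of the local loopback. A minimal sketch, assuming the VIP from the manifest above:

```bash
# k3s writes its admin kubeconfig to /etc/rancher/k3s/k3s.yaml with
# "server: https://127.0.0.1:6443"; rewrite the loopback address to the VIP
sudo sed 's/127.0.0.1/172.16.135.2/' /etc/rancher/k3s/k3s.yaml > ~/.kube/config
kubectl get nodes
```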


### Configuration Reference

The following table lists the configurable parameters of this chart and their default values.

| Parameter | Description | Default |
| ----------------------------------- | -----------------------------------------------------------------| --------------------------------------------------- |
| `keepalived.enableDebug` | Enable verbose logging | `false` |
| `keepalived.debug` | Enable verbose logging | `false` |
| `keepalived.authPassword` | Shared VRRP authentication key (1-8 chars) | _autogenerated_ |
| `keepalived.vrrpInterfaceName` | The host network interface name to use for VRRP traffic. | `eth0` |
| `keepalived.vipInterfaceName` | The host network interface name to attach the VIP to. | `eth0` |
| `keepalived.vipAddressCidr` | The Virtual IP address to use (in CIDR notation, e.g. `192.168.11.2/24`) | `` |
| `keepalived.virtualRouterId` | A unique numeric Keepalived Router ID. | `10` |
| `keepalived.vrrpNoPreempt` | Enable the Keepalived "nopreempt" option | `false` |
| `keepalived.ingressHealthzUrl` | The URL to poll to determine Ingress health (expect HTTP status code 200) | `http://127.0.0.1:10254/healthz` |
| `keepalived.kubeletHealthzUrl` | The URL to poll to determine Kubelet health (expect HTTP status code 200) | `http://127.0.0.1:10248/healthz` |
| `keepalived.checkServiceUrl` | URL checked to determine availability of the service endpoint provided on the local node (expects HTTP status code 200) | `http://127.0.0.1:10254/healthz` (NGINX Ingress Controller) |
| `keepalived.checkServiceInterval` | Interval for the service health check in seconds | `2` |
| `keepalived.checkServiceFailAfter` | Number of failed service checks to allow before marking this Keepalived instance failed | `2` |
| `keepalived.checkKubelet` | Remove VIP from Keepalived instance running on node with unhealthy Kubelet | `true` |
| `keepalived.checkKubeletInterval` | Interval for Kubelet health checks in seconds | `5` |
| `keepalived.checkKubeletFailAfter` | Number of failed Kubelet health checks before marking this Keepalived instance failed | `5` |
| `keepalived.checkKubeletUrl` | The URL checked to determine health of the local node Kubelet | `http://127.0.0.1:10248/healthz` |
| `keepalived.checkKubeApi` | Reduce priority of a Keepalived instance running on a node that fails to communicate with the K8s API server | `true` |
| `keepalived.checkKubeApiInterval` | Interval for K8s API health checks in seconds | `5` |
| `keepalived.checkKubeApiFailAfter` | Number of failed K8s API health checks before reducing priority of the keepalived instance (VIP may then be moved to a higher priority instance) | `5` |
| `kind` | The deployment resource to create for the Keepalived pods (one of 'Deployment' or 'Daemonset') | `Deployment` |
| `image.repository` | Image repository to pull from | `janeczku/keepalived-ingress-vip` |
| `image.tag` | Image tag to pull | `v0.1.3` |
| `image.tag` | Image tag to pull | `v0.1.4` |
| `image.pullPolicy` | Image pull policy | `IfNotPresent` |
| `rbac.create` | Whether to create the required RBAC resources | `true` |
| `rbac.pspEnabled` | Whether to create the required PodSecurityPolicy | `false` |
| `serviceAccount.name` | Use an existing service account instead of creating a new one. | `` |
| `pod.replicas` | The number of Keepalived instances to run in the cluster | `2` |
| `pod.priorityClassName` | The priority class to assign the pods to | `system-cluster-critical` |
| `pod.extraEnv` | Additional pod environment variables | `[]` |
| `pod.resources.requests.cpu` | CPU resource requests | `80m` |
| `pod.resources.limits.cpu` | CPU resource limits | |
| `pod.resources.requests.memory` | Memory resource requests | `6Mi` |
| `pod.resources.limits.memory` | Memory resource limits | `12Mi` |
| `pod.nodeSelector` | Node selector | `{}` |
| `pod.tolerations` | Additional pod taint tolerations | `[]` |
| `pod.tolerations` | Custom pod taint tolerations | see below for the default |
| `pod.tolerateMasterTaints` | Configure taint tolerations that allows pods to run on master nodes | `false` |
| `pod.affinity` | Additional pod affinity configuration | `{}` |
| `pod.imagePullSecrets` | Array of image Pull Secrets | `[]` |

Specify each parameter using the `--set key=value[,key=value]` argument to `helm install`.
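
Alternatively, collect the overrides in a YAML file and pass it to Helm with `-f`; for example (`my-values.yaml` is a hypothetical file carrying the same keys shown in the examples above):

```bash
helm install keepalived-ingress-vip ./chart -n vip -f my-values.yaml
```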

#### Default Pod Taint Tolerations

By default, the following tolerations are set by the Helm chart. You may use the `pod.tolerations` variable to override the default.

```yaml
tolerations:
  # If the node becomes tainted as unreachable or not-ready, one would typically
  # want the Keepalived instance to be migrated to a healthy node without much
  # delay. Setting the tolerationSeconds value too low might cause the VIP to be
  # evicted during a scheduled upgrade of the Kubelet (which might be the desired
  # behaviour in most cases anyway).
  - key: "node.kubernetes.io/unreachable"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 15
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 15
```
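
For instance, to fail over faster than the 15-second default, override the tolerations from your values. A sketch only: the 5-second value is an arbitrary illustration, and very low values risk evicting the VIP on transient node issues:

```bash
# Write an override lowering the eviction delay (illustrative values)
cat > tolerations-values.yaml <<'EOF'
pod:
  tolerations:
    - key: "node.kubernetes.io/unreachable"
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 5
    - key: "node.kubernetes.io/not-ready"
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 5
EOF
helm upgrade keepalived-ingress-vip ./chart -n vip --reuse-values -f tolerations-values.yaml
```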
70 changes: 70 additions & 0 deletions chart/templates/_helpers.tpl
@@ -36,3 +36,73 @@ Name of the service account to use
{{- define "app.serviceAccountName" -}}
{{ default (include "app.fullname" .) .Values.serviceAccountName }}
{{- end -}}



{{/*
Define a set of required configuration environment variables to be
shared across daemonset and deployment pod specs
*/}}
{{- define "environmentvars" -}}
- name: NODE_NAME
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName
- name: AUTH_PASSWORD
  valueFrom:
    secretKeyRef:
      name: {{ include "app.fullname" . }}
      key: password
- name: VRRP_IFACE
  value: {{ .Values.keepalived.vrrpInterfaceName | quote }}
- name: VIP_IFACE
  value: {{ .Values.keepalived.vipInterfaceName | quote }}
- name: VIP_ADDR_CIDR
  value: {{ .Values.keepalived.vipAddressCidr | quote }}
- name: VIRTUAL_ROUTER_ID
  value: {{ .Values.keepalived.virtualRouterId | quote }}
- name: VRRP_NOPREEMPT
  value: {{ .Values.keepalived.vrrpNoPreempt | quote }}
- name: CHECK_SERVICE_URL
  value: {{ .Values.keepalived.checkServiceUrl | quote }}
- name: CHECK_SERVICE_INTERVAL
  value: {{ .Values.keepalived.checkServiceInterval | quote }}
- name: CHECK_SERVICE_FAILAFTER
  value: {{ .Values.keepalived.checkServiceFailAfter | quote }}
- name: CHECK_KUBELET
  value: {{ .Values.keepalived.checkKubelet | quote }}
- name: CHECK_KUBELET_INTERVAL
  value: {{ .Values.keepalived.checkKubeletInterval | quote }}
- name: CHECK_KUBELET_FAILAFTER
  value: {{ .Values.keepalived.checkKubeletFailAfter | quote }}
- name: CHECK_KUBELET_URL
  value: {{ .Values.keepalived.checkKubeletUrl | quote }}
- name: CHECK_KUBEAPI
  value: {{ .Values.keepalived.checkKubeApi | quote }}
- name: CHECK_KUBEAPI_INTERVAL
  value: {{ .Values.keepalived.checkKubeApiInterval | quote }}
- name: CHECK_KUBEAPI_FAILAFTER
  value: {{ .Values.keepalived.checkKubeApiFailAfter | quote }}
{{- if .Values.pod.extraEnv }}
{{ toYaml .Values.pod.extraEnv }}
{{- end }}
{{- end }}

{{/*
Generate the pod tolerations
*/}}
{{- define "tolerations" -}}
{{- if .Values.pod.tolerations -}}
{{ toYaml .Values.pod.tolerations }}
{{ end }}
{{- if .Values.pod.tolerateMasterTaints -}}
- key: "node-role.kubernetes.io/controlplane"
operator: "Equal"
value: "true"
effect: "NoSchedule"
- key: "node-role.kubernetes.io/etcd"
operator: "Equal"
value: "true"
effect: "NoExecute"
{{- end -}}
{{- end -}}
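
Because these helpers only render text, they can be sanity-checked locally without a cluster. A hypothetical invocation from the repository root:

```bash
# Render the chart and inspect the generated env block and master tolerations
helm template keepalived-ingress-vip ./chart \
  --set kind=Daemonset \
  --set pod.tolerateMasterTaints=true \
  --set keepalived.vipAddressCidr="172.16.135.2/21" \
  | grep -E -A 1 'VIP_ADDR_CIDR|node-role'
```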
55 changes: 55 additions & 0 deletions chart/templates/daemonset.yaml
@@ -0,0 +1,55 @@
{{- if eq (.Values.kind | lower) "daemonset" }}
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: {{ template "app.fullname" . }}
  labels:
    app.kubernetes.io/name: {{ template "app.name" . }}
    helm.sh/chart: {{ template "app.chart" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/managed-by: {{ .Release.Service }}
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ template "app.name" . }}
      app.kubernetes.io/instance: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ template "app.name" . }}
        app.kubernetes.io/instance: {{ .Release.Name }}
    spec:
      priorityClassName: {{ .Values.pod.priorityClassName }}
      {{- if .Values.pod.affinity }}
      affinity:
{{ toYaml .Values.pod.affinity | indent 8 }}
      {{- end }}
      {{- if .Values.pod.nodeSelector }}
      nodeSelector:
{{ toYaml .Values.pod.nodeSelector | indent 8 }}
      {{- end }}
      tolerations:
{{ include "tolerations" . | indent 8 }}
      {{- if .Values.pod.imagePullSecrets }}
      imagePullSecrets:
{{ toYaml .Values.pod.imagePullSecrets | indent 8 }}
      {{- end }}
      serviceAccountName: {{ template "app.serviceAccountName" . }}
      hostNetwork: true
      containers:
      - name: {{ .Chart.Name }}
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ .Values.image.pullPolicy }}
        {{- if .Values.keepalived.debug }}
        args: ["debug"]
        {{- end }}
        securityContext:
          capabilities:
            add: ["NET_ADMIN"]
        env:
{{ include "environmentvars" . | indent 10 }}
        {{- if .Values.pod.resources }}
        resources:
{{ toYaml .Values.pod.resources | indent 10 }}
        {{- end }}
{{- end }}