
failed to fail-over resource kubernetes Version: v1.30.0 #59

Open
yeshl opened this issue Apr 23, 2024 · 12 comments
Comments

@yeshl

yeshl commented Apr 23, 2024

I0423 06:45:42.066690 1 agent.go:253] starting reconciliation
I0423 06:45:52.066708 1 agent.go:253] starting reconciliation
I0423 06:45:52.066824 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master22.host' has failed, evicting
W0423 06:46:05.985734 1 reflector.go:462] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229: watch of *v1.Pod ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0423 06:46:05.985738 1 reflector.go:462] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229: watch of *v1.VolumeAttachment ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0423 06:46:05.985770 1 reflector.go:462] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0423 06:46:05.985738 1 reflector.go:462] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229: watch of *v1.PersistentVolume ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
E0423 06:46:05.985948 1 reconcile_failover.go:141] "failed to fail-over resource" err="failed to apply node taint: Put "https://10.96.0.1:443/api/v1/nodes/master22.host?fieldManager=linstor.linbit.com%2Fhigh-availability-controller%2Fv2\": http2: client connection lost"
I0423 06:46:05.986006 1 agent.go:253] starting reconciliation
I0423 06:46:05.986111 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master22.host' has failed, evicting
E0423 06:46:05.997341 1 reconcile_failover.go:141] "failed to fail-over resource" err="failed force detach: volumeattachments.storage.k8s.io "csi-28b5875796ad4197fe5c795c0ce064930dc9536179e69c3d0edaaf92121ee99b" not found"
I0423 06:46:12.066698 1 agent.go:253] starting reconciliation
I0423 06:46:12.066840 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master22.host' has failed, evicting
I0423 06:46:22.067170 1 agent.go:253] starting reconciliation
I0423 06:46:22.067312 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master22.host' has failed, evicting
I0

@WanzenBug
Member

unable to decode an event from the watch stream: http2: client connection lost

This does not seem related to Kubernetes 1.30 or even the HA Controller. It looks like the node master22.host went away, which was probably also hosting the Kubernetes control plane. The HA Controller will simply retry later, which it indeed did:

I0423 06:46:12.066698 1 agent.go:253] starting reconciliation
I0423 06:46:12.066840 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master22.host' has failed, evicting
I0423 06:46:22.067170 1 agent.go:253] starting reconciliation
I0423 06:46:22.067312 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master22.host' has failed, evicting

At that point it did not encounter any issues. Failover times are influenced by how much work the rest of the cluster has to handle; if a master node fails, failover can take longer, as the Kubernetes API in general gets slower.

@yeshl
Author

yeshl commented Apr 24, 2024

Thanks for the reply. My cluster has 3 master nodes, and piraeus-operator was installed and configured following the docs. I simulated a node failure (shutdown or unplugging the network cable; the k8s cluster remains available), but Piraeus cannot complete the failover: it keeps evicting endlessly, now for more than 24 minutes. I will test with a worker node later.

root@master20:~# linstor v l
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Node          | Resource                                 | StoragePool          | VolNr | MinorNr | DeviceName    | Allocated | InUse  |      State |
|=====================================================================================================================================================|
| master20.host | pvc-5402447b-9617-4764-902b-93ae4cea6106 | DfltDisklessStorPool |     0 |    1000 | /dev/drbd1000 |           | Unused | TieBreaker |
| master21.host | pvc-5402447b-9617-4764-902b-93ae4cea6106 | pool-01              |     0 |    1000 | /dev/drbd1000 | 65.21 MiB | InUse  |   UpToDate |
| master22.host | pvc-5402447b-9617-4764-902b-93ae4cea6106 | pool-01              |     0 |    1000 | /dev/drbd1000 | 65.34 MiB | Unused |   UpToDate |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+

root@master21:~# poweroff

root@master20:~# kubectl get no
NAME            STATUS     ROLES           AGE     VERSION
master20.host   Ready      control-plane   2d12h   v1.30.0
master21.host   NotReady   control-plane   2d11h   v1.30.0
master22.host   Ready      control-plane   2d11h   v1.30.0

root@master22:~# linstor v l
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Node          | Resource                                 | StoragePool          | VolNr | MinorNr | DeviceName    | Allocated | InUse  |      State |
|=====================================================================================================================================================|
| master20.host | pvc-5402447b-9617-4764-902b-93ae4cea6106 | DfltDisklessStorPool |     0 |    1000 | /dev/drbd1000 |           | Unused | TieBreaker |
| master21.host | pvc-5402447b-9617-4764-902b-93ae4cea6106 | pool-01              |     0 |    1000 | /dev/drbd1000 |  2.00 GiB |        |    Unknown |
| master22.host | pvc-5402447b-9617-4764-902b-93ae4cea6106 | pool-01              |     0 |    1000 | /dev/drbd1000 | 65.34 MiB | Unused |   UpToDate |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+

root@master20:~# drbdadm status
pvc-5402447b-9617-4764-902b-93ae4cea6106 role:Secondary
  disk:Diskless
  master21.host connection:Connecting
  master22.host role:Secondary
    peer-disk:UpToDate

root@master22:~# drbdadm status
pvc-5402447b-9617-4764-902b-93ae4cea6106 role:Secondary
  disk:UpToDate
  master20.host role:Secondary
    peer-disk:Diskless
  master21.host connection:Connecting
root@master22:~# kubectl get pod -n piraeus-datastore -o wide
NAME                  READY   STATUS    RESTARTS      AGE   IP             NODE            NOMINATED NODE   READINESS GATES
ha-controller-7bdmq   1/1     Running   4 (16h ago)   35h   10.244.2.138   master22.host
ha-controller-g2hlz   1/1     Running   3 (14h ago)   35h   10.244.1.26    master21.host
ha-controller-z59b2   1/1     Running   1 (20h ago)   35h   10.244.0.250   master20.host
root@master22:~# kubectl logs -n piraeus-datastore ha-controller-7bdmq
I0423 23:57:52.307596 1 agent.go:253] starting reconciliation
I0423 23:58:02.307055 1 agent.go:253] starting reconciliation
I0423 23:58:12.307170 1 agent.go:253] starting reconciliation
I0423 23:58:22.307161 1 agent.go:253] starting reconciliation
I0423 23:58:22.307298 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting
I0423 23:58:32.307523 1 agent.go:253] starting reconciliation
I0423 23:58:32.307664 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting
I0423 23:58:42.307155 1 agent.go:253] starting reconciliation
I0423 23:58:42.307310 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting
...(omitted)
I0424 00:21:52.307448 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting
I0424 00:22:02.307098 1 agent.go:253] starting reconciliation
I0424 00:22:02.307237 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting
I0424 00:22:12.306998 1 agent.go:253] starting reconciliation
I0424 00:22:12.307112 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting

root@master22:~# kubectl logs -n piraeus-datastore ha-controller-z59b2
I0423 23:58:02.067259 1 agent.go:253] starting reconciliation
I0423 23:58:12.067417 1 agent.go:253] starting reconciliation
I0423 23:58:22.067711 1 agent.go:253] starting reconciliation
I0423 23:58:22.067900 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting
I0423 23:58:32.067219 1 agent.go:253] starting reconciliation
I0423 23:58:32.067359 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting
I0423 23:58:42.066787 1 agent.go:253] starting reconciliation
I0423 23:58:42.066917 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting
I0423 23:58:52.066917 1 agent.go:253] starting reconciliation
I0423 23:58:52.067053 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting
I0423 23:59:02.067629 1 agent.go:253] starting reconciliation
...(omitted)
I0424 00:24:02.067200 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting
I0424 00:24:12.067010 1 agent.go:253] starting reconciliation
I0424 00:24:12.067144 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting
I0424 00:24:22.067629 1 agent.go:253] starting reconciliation
I0424 00:24:22.067752 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting
I0424 00:24:32.067570 1 agent.go:253] starting reconciliation
I0424 00:24:32.067823 1 reconcile_failover.go:137] resource 'pvc-5402447b-9617-4764-902b-93ae4cea6106' on node 'master21.host' has failed, evicting

@yeshl
Author

yeshl commented Apr 25, 2024

When it runs on a non-master node, it cannot fail over either. Why? When I force-delete the Terminating pod, it gets scheduled to the secondary node and starts running!
I0425 07:09:31.328542 1 agent.go:253] starting reconciliation
I0425 07:09:31.328717 1 reconcile_failover.go:137] resource 'pvc-a28ab865-bf44-4c53-9408-303694756133' on node 'node50.host' has failed, evicting
I0425 07:09:41.328786 1 agent.go:253] starting reconciliation
I0425 07:09:41.328921 1 reconcile_failover.go:137] resource 'pvc-a28ab865-bf44-4c53-9408-303694756133' on node 'node50.host' has failed, evicting
I0425 07:09:51.328672 1 agent.go:253] starting reconciliation
I0425 07:09:51.328824 1 reconcile_failover.go:137] resource 'pvc-a28ab865-bf44-4c53-9408-303694756133' on node 'node50.host' has failed, evicting
I0425 07:10:01.329536 1 agent.go:253] starting reconciliation
I0425 07:10:01.329677 1 reconcile_failover.go:137] resource 'pvc-a28ab865-bf44-4c53-9408-303694756133' on node 'node50.host' has failed, evicting
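
For reference, force-deleting a Pod that is stuck in Terminating, as described above, is usually done with the standard kubectl flags (the pod name here is only an example from this setup):

kubectl delete pod test-sts-web-0 --grace-period=0 --force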

@yeshl
Author

yeshl commented Apr 25, 2024

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: sc-piraeus-r2-ha
provisioner: linstor.csi.linbit.com
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
parameters:
  csi.storage.k8s.io/fstype: xfs
  linstor.csi.linbit.com/storagePool: pool-01
  linstor.csi.linbit.com/placementCount: "2"
  linstor.csi.linbit.com/allowRemoteVolumeAccess: "false"
  property.linstor.csi.linbit.com/DrbdOptions/auto-quorum: suspend-io
  property.linstor.csi.linbit.com/DrbdOptions/Resource/on-no-data-accessible: suspend-io
  property.linstor.csi.linbit.com/DrbdOptions/Resource/on-suspended-primary-outdated: force-secondary
  property.linstor.csi.linbit.com/DrbdOptions/Net/rr-conflict: retry-connect
---
apiVersion: v1
kind: Service
metadata:
  name: test-svc-web
spec:
  ports:
    - port: 80
      name: web
  clusterIP: None
  selector:
    app: web
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-sts-web
spec:
  selector:
    matchLabels:
      app: web
  serviceName: "test-svc-web"
  replicas: 1
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.25.5-alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: pvc
              mountPath: /mnt/data
  volumeClaimTemplates:
    - metadata:
        name: pvc
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: sc-piraeus-r2-ha
        resources:
          requests:
            storage: 2Gi
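
The DrbdOptions properties set in this StorageClass should be visible on the LINSTOR objects created for the PVC. A sketch of checking them on the resource definition with the standard LINSTOR client, using the resource name from the previous comment (depending on the CSI driver version, the properties may instead sit on the resource group for the storage class):

linstor resource-definition list-properties pvc-a28ab865-bf44-4c53-9408-303694756133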

@WanzenBug
Member

You could try turning up the verbosity of the HA Controller to see what it tries to do. Edit the LinstorCluster resource to contain:

...
spec:
  highAvailabilityController:
    podTemplate:
      spec:
        containers:
        - name: ha-controller
          args:
          - /agent
          - --v=3   
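
For reference, a sketch of the full resource, assuming the piraeus-operator v2 defaults (API group piraeus.io/v1 and a single LinstorCluster named linstorcluster); it can be applied with kubectl apply -f or pasted in via kubectl edit linstorcluster linstorcluster:

apiVersion: piraeus.io/v1
kind: LinstorCluster
metadata:
  name: linstorcluster
spec:
  highAvailabilityController:
    podTemplate:
      spec:
        containers:
        - name: ha-controller
          args:
          - /agent
          - --v=3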

@yeshl
Author

yeshl commented Apr 25, 2024

I now get "Pod 'default/test-sts-web-0' is exempt from eviction because of unsafe volumes". What does it mean?

containers:
        - name: c-web-server
          image: busybox
          imagePullPolicy: IfNotPresent #default Always
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: SVC_NAME
              value: "svc-headless"
            - name: DEFAULT_TZ
              value: "Asia/Shanghai"
          command:
            - sh
            - '-c'
            - |-
              trap 'exit 0' SIGTERM
              #rm /mnt/data/index.html
              while true; do
                echo [$(date "+%Y-%m-%d %T")] - $HOSTNAME - $POD_IP '<br>'  |tee -a /mnt/data/index.html
                #touch  /mnt/data/f-$(date +"%Y-%m-%d_%H-%M-%S").txt
                sleep 10
              done
          volumeMounts:
            - name: localtime
              mountPath: /etc/localtime
              readOnly: true
            - name: pvc
              mountPath: /mnt/data
        - name: nginx
          image: nginx:1.25.5-alpine
          env:
            - name: TZ
              value: "Asia/Shanghai"
          ports:
            - containerPort: 80
          volumeMounts:
            - name: conf
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: fileserver.conf
            - name: pvc
              mountPath: /mnt/data
      volumes:
        - name: localtime
          hostPath:
            type: File
            path: /etc/localtime
        - name: conf
          configMap:
            name: test-cm-nginx
#            defaultMode: 0755
  volumeClaimTemplates:
    - metadata:
        name: pvc
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: sc-piraeus-r2-ha
        resources:
          requests:
            storage: 2Gi

@WanzenBug
Member

Because the Pod has a hostPath volume mounted, the HA Controller believes it can't safely fail it over. See https://github.com/piraeusdatastore/piraeus-ha-controller/blob/main/pkg/agent/reconcile_failover.go#L262-L296

Why? Because if the Pod had a hostPath volume and it was evicted and started on another node, that volume would now have different content. At least that was the idea: only fail over Pods that only have "safe" volumes, i.e. DRBD volumes or other ephemeral volumes.

It looks like this case would also be safe, as /etc/localtime is mounted readOnly... Perhaps we can improve that check.

You can try running the HA Controller with --fail-over-unsafe-pods and see if it works then.
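
A sketch of passing that flag through the same LinstorCluster podTemplate as in the verbosity example above (only the extra argument differs; the flag name is taken from this comment):

spec:
  highAvailabilityController:
    podTemplate:
      spec:
        containers:
        - name: ha-controller
          args:
          - /agent
          - --fail-over-unsafe-pods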

@yeshl
Author

yeshl commented Apr 25, 2024

Thank you! It can fail over when I remove the localtime volume! I unplugged the network cable to simulate the server going down, then plugged it back in after a while to restore the network. I expected the primary to become secondary, but it didn't. So I rebooted the server, and then it became secondary! How can the primary automatically change to secondary after the network is restored, without rebooting the server?

+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Node          | Resource                                 | StoragePool          | VolNr | MinorNr | DeviceName    | Allocated | InUse  |      State |
|=====================================================================================================================================================|
| master20.host | pvc-f7d167b9-f486-4cc8-8281-4e7d304819c4 | pool-01              |     0 |    1000 | /dev/drbd1000 |  2.62 MiB | InUse  |   UpToDate |
| master21.host | pvc-f7d167b9-f486-4cc8-8281-4e7d304819c4 | DfltDisklessStorPool |     0 |    1000 | /dev/drbd1000 |           | Unused | TieBreaker |
| master22.host | pvc-f7d167b9-f486-4cc8-8281-4e7d304819c4 | pool-01              |     0 |    1000 | /dev/drbd1000 | 64.80 MiB |        |    Unknown |

@WanzenBug
Member

The HA Controller on the "old" Primary node should see that a Pod is stuck in suspend-io and force it to become secondary using drbdadm secondary --force.

@yeshl
Author

yeshl commented Apr 26, 2024

+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Node          | Resource                                 | StoragePool          | VolNr | MinorNr | DeviceName    | Allocated | InUse  |      State |
|=====================================================================================================================================================|
| master20.host | pvc-2028ef6c-82a1-4e7f-8bdf-4179cee1bbd9 | DfltDisklessStorPool |     0 |    1000 | /dev/drbd1000 |           | Unused | TieBreaker |
| node50.host   | pvc-2028ef6c-82a1-4e7f-8bdf-4179cee1bbd9 | pool-01              |     0 |    1000 | /dev/drbd1000 | 64.80 MiB |        |    Unknown |
| node51.host   | pvc-2028ef6c-82a1-4e7f-8bdf-4179cee1bbd9 | pool-01              |     0 |    1000 | /dev/drbd1000 |  2.62 MiB | InUse  |   UpToDate |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
root@master20:~# kubectl get po -n piraeus-datastore -o wide|grep ha
ha-controller-6zlcf                                    1/1     Running   0          27m   10.244.2.250   master22.host   <none>           <none>
ha-controller-7fjnl                                    1/1     Running   0          27m   10.244.4.83    node51.host     <none>           <none>
ha-controller-7p82f                                    1/1     Running   0          27m   10.244.0.118   master20.host   <none>           <none>
ha-controller-bb47w                                    1/1     Running   0          27m   10.244.3.130   node50.host     <none>           <none>
ha-controller-ltjjt                                    1/1     Running   0          27m   10.244.1.126   master21.host   <none>           <none>
root@master20:~# kubectl  -n piraeus-datastore exec ha-controller-bb47w -- drbdadm status
pvc-2028ef6c-82a1-4e7f-8bdf-4179cee1bbd9 role:Secondary suspended:quorum
  disk:UpToDate quorum:no blocked:upper
  master20.host connection:Connecting
  node51.host connection:Connecting
root@master20:~# kubectl  -n piraeus-datastore exec ha-controller-bb47w -- drbdadm secondary --force pvc-2028ef6c-82a1-4e7f-8bdf-4179cee1bbd9
no resources defined!
command terminated with exit code 1

@WanzenBug
Member

Sorry, should have been drbdsetup secondary --force
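
Applied to the command above, that would look roughly like this (same pod and resource name as before, just with drbdsetup in place of drbdadm):

root@master20:~# kubectl -n piraeus-datastore exec ha-controller-bb47w -- drbdsetup secondary --force pvc-2028ef6c-82a1-4e7f-8bdf-4179cee1bbd9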

@yeshl
Author

yeshl commented Apr 27, 2024

Can the reconnection and recovery be done automatically?
