[BUG] Redis pods crash after upgrading kb from 0.5.1 to 0.6.2 #5000

Closed
ahjing99 opened this issue Sep 4, 2023 · 1 comment

ahjing99 commented Sep 4, 2023

  1. Install kb 0.5.1 and create clusters.
  2. Upgrade kb to 0.6.2-beta.1.
  3. Many Redis pods crash.
➜  ~ k describe cluster redis-2q8t6
Name:         redis-2q8t6
Namespace:    default
Labels:       clusterdefinition.kubeblocks.io/name=redis
              clusterversion.kubeblocks.io/name=redis-7.0.6
Annotations:  kubeblocks.io/reconcile: 2023-09-04T09:12:39.67959538Z
API Version:  apps.kubeblocks.io/v1alpha1
Kind:         Cluster
Metadata:
  Creation Timestamp:  2023-09-04T08:09:45Z
  Finalizers:
    cluster.kubeblocks.io/finalizer
  Generate Name:  redis-
  Generation:     1
  Managed Fields:
    API Version:  apps.kubeblocks.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:generateName:
        f:labels:
          .:
          f:clusterdefinition.kubeblocks.io/name:
          f:clusterversion.kubeblocks.io/name:
      f:spec:
        .:
        f:affinity:
          .:
          f:nodeLabels:
          f:podAntiAffinity:
          f:tenancy:
          f:topologyKeys:
        f:clusterDefinitionRef:
        f:clusterVersionRef:
        f:componentSpecs:
          .:
          k:{"name":"redis"}:
            .:
            f:componentDefRef:
            f:monitor:
            f:name:
            f:replicas:
            f:resources:
              .:
              f:limits:
                .:
                f:cpu:
                f:memory:
              f:requests:
                .:
                f:cpu:
                f:memory:
            f:serviceAccountName:
            f:switchPolicy:
              .:
              f:type:
            f:volumeClaimTemplates:
          k:{"name":"redis-sentinel"}:
            .:
            f:componentDefRef:
            f:monitor:
            f:name:
            f:replicas:
            f:resources:
              .:
              f:limits:
                .:
                f:cpu:
                f:memory:
              f:requests:
                .:
                f:cpu:
                f:memory:
            f:serviceAccountName:
            f:volumeClaimTemplates:
        f:terminationPolicy:
        f:tolerations:
    Manager:      kubectl-create
    Operation:    Update
    Time:         2023-09-04T08:09:45Z
    API Version:  apps.kubeblocks.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:clusterDefGeneration:
        f:components:
          .:
          f:redis:
            .:
            f:message:
              .:
              f:Pod/redis-2q8t6-redis-0:
            f:phase:
            f:podsReady:
            f:replicationSetStatus:
              .:
              f:primary:
                .:
                f:pod:
          f:redis-sentinel:
            .:
            f:phase:
            f:podsReady:
            f:podsReadyTime:
        f:conditions:
        f:observedGeneration:
        f:phase:
    Manager:      manager
    Operation:    Update
    Subresource:  status
    Time:         2023-09-04T09:11:45Z
    API Version:  apps.kubeblocks.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubeblocks.io/reconcile:
        f:finalizers:
          .:
          v:"cluster.kubeblocks.io/finalizer":
    Manager:         manager
    Operation:       Update
    Time:            2023-09-04T09:12:39Z
  Resource Version:  430708
  UID:               02e590da-244b-42f5-b146-507ace6e6d04
Spec:
  Affinity:
    Node Labels:
    Pod Anti Affinity:  Preferred
    Tenancy:            SharedNode
    Topology Keys:
  Cluster Definition Ref:  redis
  Cluster Version Ref:     redis-7.0.6
  Component Specs:
    Component Def Ref:  redis
    Monitor:            true
    Name:               redis
    No Create PDB:      false
    Replicas:           1
    Resources:
      Limits:
        Cpu:     1000m
        Memory:  1024Mi
      Requests:
        Cpu:               100m
        Memory:            102Mi
    Service Account Name:  dbname
    Switch Policy:
      Type:  Noop
    Volume Claim Templates:
      Name:  data
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage:         3Gi
        Storage Class Name:  standard-rwo
    Component Def Ref:       redis-sentinel
    Monitor:                 true
    Name:                    redis-sentinel
    No Create PDB:           false
    Replicas:                1
    Resources:
      Limits:
        Cpu:     100m
        Memory:  100Mi
      Requests:
        Cpu:               100m
        Memory:            100Mi
    Service Account Name:  dbname
    Volume Claim Templates:
      Name:  data
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage:   1Gi
  Termination Policy:  WipeOut
  Tolerations:
Status:
  Cluster Def Generation:  2
  Components:
    Redis:
      Message:
        Pod/redis-2q8t6-redis-0:  back-off 2m40s restarting failed container=redis pod=redis-2q8t6-redis-0_default(800d96ed-aa47-4205-80d1-f3f202ecb606)
      Phase:                      Failed
      Pods Ready:                 false
      Replication Set Status:
        Primary:
          Pod:  redis-2q8t6-redis-0
    Redis - Sentinel:
      Phase:            Running
      Pods Ready:       true
      Pods Ready Time:  2023-09-04T08:10:44Z
  Conditions:
    Last Transition Time:  2023-09-04T08:10:03Z
    Message:               The operator has started the provisioning of Cluster: redis-2q8t6
    Observed Generation:   1
    Reason:                PreCheckSucceed
    Status:                True
    Type:                  ProvisioningStarted
    Last Transition Time:  2023-09-04T08:10:03Z
    Message:               Successfully applied for resources
    Observed Generation:   1
    Reason:                ApplyResourcesSucceed
    Status:                True
    Type:                  ApplyResources
    Last Transition Time:  2023-09-04T09:11:44Z
    Message:               pods are not ready in Components: [redis], refer to related component message in Cluster.status.components
    Reason:                ReplicasNotReady
    Status:                False
    Type:                  ReplicasReady
    Last Transition Time:  2023-09-04T09:11:44Z
    Message:               pods are unavailable in Components: [redis], refer to related component message in Cluster.status.components
    Reason:                ComponentsNotReady
    Status:                False
    Type:                  Ready
  Observed Generation:     1
  Phase:                   Failed
Events:
  Type     Reason                    Age                From                Message
  ----     ------                    ----               ----                -------
  Warning  Unhealthy                 24m                event-controller    Pod redis-2q8t6-redis-sentinel-0: Readiness probe errored: command "sh -c /scripts/redis-sentinel-ping.sh 1" timed out
  Normal   WaitingForProbeSuccess    16m                cluster-controller  Waiting for probe success
  Normal   ComponentPhaseTransition  91s                cluster-controller  Running: false, PodsReady: false, PodsTimedout: true
  Warning  ReplicasNotReady          89s                cluster-controller  pods are not ready in Components: [redis], refer to related component message in Cluster.status.components
  Warning  ComponentsNotReady        89s                cluster-controller  pods are unavailable in Components: [redis], refer to related component message in Cluster.status.components
  Warning  Failed                    89s                cluster-controller  Cluster: redis-2q8t6 is Failed, check according to the components message
  Warning  BackOff                   35s (x6 over 10m)  event-controller    Pod redis-2q8t6-redis-0: Back-off restarting failed container redis in pod redis-2q8t6-redis-0_default(800d96ed-aa47-4205-80d1-f3f202ecb606)

➜  ~ k describe pod redis-2q8t6-redis-0
Name:         redis-2q8t6-redis-0
Namespace:    default
Priority:     0
Node:         gke-cluster-1-default-pool-56d1d4e0-fhtn/10.128.0.51
Start Time:   Mon, 04 Sep 2023 16:56:44 +0800
Labels:       app.kubernetes.io/component=redis
              app.kubernetes.io/instance=redis-2q8t6
              app.kubernetes.io/managed-by=kubeblocks
              app.kubernetes.io/name=redis
              app.kubernetes.io/version=redis-7.0.6
              apps.kubeblocks.io/component-name=redis
              apps.kubeblocks.io/workload-type=Replication
              controller-revision-hash=redis-2q8t6-redis-5659cb9975
              kubeblocks.io/role=primary
              statefulset.kubernetes.io/pod-name=redis-2q8t6-redis-0
Annotations:  apps.kubeblocks.io/component-replicas: 1
              apps.kubeblocks.io/last-role-changed-event-timestamp: 2023-09-04T08:58:39Z
Status:       Running
IP:           10.4.2.201
IPs:
  IP:           10.4.2.201
Controlled By:  StatefulSet/redis-2q8t6-redis
Containers:
  redis:
    Container ID:  containerd://05b607d8f80cecb3a9ab041e8498e1be3caa03c523938d578a13aa1bc89d777e
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/redis-stack-server:7.0.6-RC8
    Image ID:      docker.io/redis/redis-stack-server@sha256:511808b267ab8d800283604ef5c01f4fe94792bfb746bb6dba236cc29ff5495b
    Port:          6379/TCP
    Host Port:     0/TCP
    Command:
      /scripts/redis-start.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 04 Sep 2023 17:09:10 +0800
      Finished:     Mon, 04 Sep 2023 17:10:53 +0800
    Ready:          False
    Restart Count:  5
    Limits:
      cpu:     1
      memory:  1Gi
    Requests:
      cpu:      100m
      memory:   102Mi
    Readiness:  exec [sh -c /scripts/redis-ping.sh 1] delay=10s timeout=1s period=5s #success=1 #failure=5
    Environment Variables from:
      redis-2q8t6-redis-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               redis-2q8t6-redis-0 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           redis-2q8t6
      KB_COMP_NAME:              redis
      KB_CLUSTER_COMP_NAME:      redis-2q8t6-redis
      KB_CLUSTER_UID_POSTFIX_8:  ce6e6d04
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      REDIS_REPL_USER:           kbreplicator
      REDIS_REPL_PASSWORD:       <set to the key 'password' in secret 'redis-2q8t6-conn-credential'>  Optional: false
      REDIS_DEFAULT_PASSWORD:    <set to the key 'password' in secret 'redis-2q8t6-conn-credential'>  Optional: false
      REDIS_SENTINEL_USER:       $(REDIS_REPL_USER)-sentinel
      REDIS_SENTINEL_PASSWORD:   <set to the key 'password' in secret 'redis-2q8t6-conn-credential'>  Optional: false
      REDIS_ARGS:                --requirepass $(REDIS_PASSWORD)
    Mounts:
      /data from data (rw)
      /etc/conf from redis-config (rw)
      /etc/redis from redis-conf (rw)
      /kb-podinfo from pod-info (rw)
      /scripts from scripts (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5wjmb (ro)
  metrics:
    Container ID:  containerd://90def8238d455240d149adb04e75a01aaa3434c93d6bf1454b28ddb48487e633
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/agamotto:0.1.2-beta.1
    Image ID:      registry.cn-hangzhou.aliyuncs.com/apecloud/agamotto@sha256:cbab349b90490807a8d5039bf01bc7e37334f20c98c7dd75bc7fc4cf9e5b10ee
    Port:          9121/TCP
    Host Port:     0/TCP
    Command:
      /bin/agamotto
      --config=/opt/conf/metrics-config.yaml
    State:          Running
      Started:      Mon, 04 Sep 2023 16:57:36 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      redis-2q8t6-redis-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               redis-2q8t6-redis-0 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           redis-2q8t6
      KB_COMP_NAME:              redis
      KB_CLUSTER_COMP_NAME:      redis-2q8t6-redis
      KB_CLUSTER_UID_POSTFIX_8:  ce6e6d04
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      ENDPOINT:                  localhost:6379
      REDIS_USER:                <set to the key 'username' in secret 'redis-2q8t6-conn-credential'>  Optional: false
      REDIS_PASSWORD:            <set to the key 'password' in secret 'redis-2q8t6-conn-credential'>  Optional: false
    Mounts:
      /opt/conf from redis-metrics-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5wjmb (ro)
  kb-checkrole:
    Container ID:  containerd://f2a847808bcc02a974074e34944f8857b098c546558b6c3306f91a4bad05a817
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.6.2-beta.1
    Image ID:      registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools@sha256:6760e31e4094714bbe126134d43d19c7d4bd82c1619695908c1c6c696474cb9f
    Ports:         3501/TCP, 50001/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      probe
      --app-id
      batch-sdk
      --dapr-http-port
      3501
      --dapr-grpc-port
      50001
      --log-level
      info
      --config
      /config/probe/config.yaml
      --components-path
      /config/probe/components
    State:          Running
      Started:      Mon, 04 Sep 2023 16:57:38 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:      0
      memory:   0
    Readiness:  http-get http://:3501/v1.0/bindings/redis%3Foperation=checkRole&workloadType=Replication delay=0s timeout=1s period=2s #success=1 #failure=2
    Startup:    tcp-socket :3501 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      redis-2q8t6-redis-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:                redis-2q8t6-redis-0 (v1:metadata.name)
      KB_POD_UID:                  (v1:metadata.uid)
      KB_NAMESPACE:               default (v1:metadata.namespace)
      KB_SA_NAME:                  (v1:spec.serviceAccountName)
      KB_NODENAME:                 (v1:spec.nodeName)
      KB_HOST_IP:                  (v1:status.hostIP)
      KB_POD_IP:                   (v1:status.podIP)
      KB_POD_IPS:                  (v1:status.podIPs)
      KB_HOSTIP:                   (v1:status.hostIP)
      KB_PODIP:                    (v1:status.podIP)
      KB_PODIPS:                   (v1:status.podIPs)
      KB_CLUSTER_NAME:            redis-2q8t6
      KB_COMP_NAME:               redis
      KB_CLUSTER_COMP_NAME:       redis-2q8t6-redis
      KB_CLUSTER_UID_POSTFIX_8:   ce6e6d04
      KB_POD_FQDN:                $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      KB_SERVICE_USER:            <set to the key 'username' in secret 'redis-2q8t6-conn-credential'>  Optional: false
      KB_SERVICE_PASSWORD:        <set to the key 'password' in secret 'redis-2q8t6-conn-credential'>  Optional: false
      KB_SERVICE_PORT:            6379
      KB_SERVICE_ROLES:           {}
      KB_SERVICE_CHARACTER_TYPE:  redis
      KB_WORKLOAD_TYPE:           Replication
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5wjmb (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-redis-2q8t6-redis-0
    ReadOnly:   false
  pod-info:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.labels['kubeblocks.io/role'] -> pod-role
      metadata.annotations['rs.apps.kubeblocks.io/primary'] -> primary-pod
      metadata.annotations['apps.kubeblocks.io/component-replicas'] -> component-replicas
  redis-metrics-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      redis-2q8t6-redis-redis-metrics-config
    Optional:  false
  redis-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      redis-2q8t6-redis-redis-replication-config
    Optional:  false
  scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      redis-2q8t6-redis-redis-scripts
    Optional:  false
  redis-conf:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-5wjmb:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 kb-data=true:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Warning  FailedAttachVolume      14m                   attachdetach-controller  Multi-Attach error for volume "pvc-af1ac4d1-0247-4e62-9e70-41c3a37aebf2" Volume is already exclusively attached to one node and can't be attached to another
  Normal   Scheduled               14m                   default-scheduler        Successfully assigned default/redis-2q8t6-redis-0 to gke-cluster-1-default-pool-56d1d4e0-fhtn
  Normal   SuccessfulAttachVolume  14m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-af1ac4d1-0247-4e62-9e70-41c3a37aebf2"
  Normal   Pulled                  13m                   kubelet                  Container image "registry.cn-hangzhou.aliyuncs.com/apecloud/agamotto:0.1.2-beta.1" already present on machine
  Normal   Created                 13m                   kubelet                  Created container metrics
  Normal   Created                 13m                   kubelet                  Created container kb-checkrole
  Normal   Pulled                  13m                   kubelet                  Container image "registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.6.2-beta.1" already present on machine
  Normal   Started                 13m                   kubelet                  Started container metrics
  Normal   Started                 13m                   kubelet                  Started container kb-checkrole
  Normal   checkRole               13m                   sqlchannel               {"event":"Failed","message":"role check delay","operation":"checkRole","originalRole":""}
  Normal   checkRole               12m                   sqlchannel               {"event":"Success","operation":"checkRole","originalRole":"","role":"primary"}
  Normal   checkRole               12m                   sqlchannel               {"event":"Failed","message":"dial tcp 127.0.0.1:6379: connect: connection refused","operation":"checkRole","originalRole":"primary"}
  Normal   checkRole               10m                   sqlchannel               {"event":"Failed","message":"dial tcp 127.0.0.1:6379: connect: connection refused","operation":"checkRole","originalRole":"primary"}
  Warning  Unhealthy               8m30s                 kubelet                  Readiness probe errored: rpc error: code = NotFound desc = failed to exec in container: failed to load task: no running task found: task 26e57b3fdbdd486d0e7adc2de42fb8708d45fbbab9423c9e048a207045c762dc not found: not found
  Normal   checkRole               8m28s                 sqlchannel               {"event":"Failed","message":"dial tcp 127.0.0.1:6379: connect: connection refused","operation":"checkRole","originalRole":"primary"}
  Normal   Pulled                  7m58s (x4 over 14m)   kubelet                  Container image "registry.cn-hangzhou.aliyuncs.com/apecloud/redis-stack-server:7.0.6-RC8" already present on machine
  Normal   Created                 7m58s (x4 over 14m)   kubelet                  Created container redis
  Normal   Started                 7m56s (x4 over 13m)   kubelet                  Started container redis
  Normal   checkRole               6m12s                 sqlchannel               {"event":"Failed","message":"dial tcp 127.0.0.1:6379: connect: connection refused","operation":"checkRole","originalRole":"primary"}
  Normal   checkRole               3m46s                 sqlchannel               {"event":"Failed","message":"dial tcp 127.0.0.1:6379: connect: connection refused","operation":"checkRole","originalRole":"primary"}
  Warning  BackOff                 3m45s (x10 over 10m)  kubelet                  Back-off restarting failed container redis in pod redis-2q8t6-redis-0_default(800d96ed-aa47-4205-80d1-f3f202ecb606)
  Normal   checkRole               36s                   sqlchannel               {"event":"Failed","message":"dial tcp 127.0.0.1:6379: connect: connection refused","operation":"checkRole","originalRole":"primary"}
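
The `pod-info` volume above maps the annotation `rs.apps.kubeblocks.io/primary` to the file `primary-pod`, yet the pod's Annotations list shows only `apps.kubeblocks.io/component-replicas` and `apps.kubeblocks.io/last-role-changed-event-timestamp`. A minimal sketch for checking this on a live pod follows. The function names and the `KUBECTL` override are illustrative assumptions, not part of KubeBlocks.

```shell
#!/bin/sh
# Illustrative helper, not part of KubeBlocks.
# KUBECTL can be overridden (for example with a stub); defaults to kubectl.
KUBECTL=${KUBECTL:-kubectl}

# Print the value of the rs.apps.kubeblocks.io/primary annotation on a pod.
# Dots in the annotation key are escaped for kubectl's JSONPath syntax.
# Assumption: kubectl prints nothing when the annotation is absent.
primary_annotation() {
  pod=$1
  namespace=${2:-default}
  "$KUBECTL" get pod "$pod" -n "$namespace" \
    -o jsonpath='{.metadata.annotations.rs\.apps\.kubeblocks\.io/primary}'
}

# Report whether the annotation is set. If it is empty, the DownwardAPI
# file /kb-podinfo/primary-pod would be empty as well.
check_primary_annotation() {
  value=$(primary_annotation "$1" "${2:-default}")
  if [ -z "$value" ]; then
    echo "annotation rs.apps.kubeblocks.io/primary is missing or empty on $1"
    return 1
  fi
  echo "primary=$value"
}
```

Running `check_primary_annotation redis-2q8t6-redis-0 default` against the affected cluster would show whether the annotation survived the upgrade.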
➜  ~ k logs redis-2q8t6-redis-0
Defaulted container "redis" out of: redis, metrics, kb-checkrole
+ echo include /etc/conf/redis.conf
+ echo replica-announce-ip redis-2q8t6-redis-0.redis-2q8t6-redis-headless.default.svc
+ [ -f /data/users.acl ]
+ sed -i /user default on/d /data/users.acl
+ sed -i /user kbreplicator on/d /data/users.acl
+ sed -i /user kbreplicator-sentinel on/d /data/users.acl
+ [ ! -z  ]
+ [ ! -z  ]
+ [ ! -z  ]
+ echo protected-mode no
+ echo aclfile /data/users.acl
+ start_redis_server
+ exec redis-server /etc/redis/redis.conf --loadmodule /opt/redis-stack/lib/redisearch.so --loadmodule /opt/redis-stack/lib/redisgraph.so+  --loadmodule /opt/redis-stack/lib/redistimeseries.so --loadmodule /opt/redis-stack/lib/rejson.so --loadmodule /opt/redis-stack/lib/redisbloom.so
create_replication
+ [ ! -z  ]
+ retry redis-cli -h 127.0.0.1 -p 6379 ping
+ local max_attempts=20
+ local attempt=1
+ redis-cli -h 127.0.0.1 -p 6379 ping
Could not connect to Redis at 127.0.0.1:6379: Connection refused
Command 'redis-cli -h 127.0.0.1 -p 6379 ping' failed. Attempt 1 of 20. Retrying in 5 seconds...
+ [ 1 -eq 20 ]
+ echo Command 'redis-cli -h 127.0.0.1 -p 6379 ping' failed. Attempt 1 of 20. Retrying in 5 seconds...
+ attempt=2
+ sleep 3
+ redis-cli -h 127.0.0.1 -p 6379 ping
PONG
+ [ 2 -eq 20 ]
+ attempt=1
+ max_attempts=20
+ [ 1 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 1 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 1 of 20...
+ attempt=2
+ [ 2 -le 20 ]
+ cat /kb-podinfo/primary-pod
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 2 of 20...
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 2 of 20...
+ sleep 5
+ attempt=3
+ [ 3 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 3 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 3 of 20...
+ attempt=4
+ [ 4 -le 20 ]
+ cat /kb-podinfo/primary-pod
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 4 of 20...
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 4 of 20...
+ sleep 5
+ attempt=5
+ [ 5 -le 20 ]
+ cat /kb-podinfo/primary-pod
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 5 of 20...
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 5 of 20...
+ sleep 5
+ attempt=6
+ [ 6 -le 20 ]
+ cat /kb-podinfo/primary-pod
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 6 of 20...
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 6 of 20...
+ sleep 5
+ attempt=7
+ [ 7 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 7 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 7 of 20...
+ attempt=8
+ [ 8 -le 20 ]
+ cat /kb-podinfo/primary-pod
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 8 of 20...
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 8 of 20...
+ sleep 5
+ attempt=9
+ [ 9 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 9 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 9 of 20...
+ attempt=10
+ [ 10 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 10 of 20...
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 10 of 20...
+ sleep 5
+ attempt=11
+ [ 11 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 11 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 11 of 20...
+ attempt=12
+ [ 12 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 12 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 12 of 20...
+ attempt=13
+ [ 13 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 13 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 13 of 20...
+ attempt=14
+ [ 14 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 14 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 14 of 20...
+ attempt=15
+ [ 15 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 15 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 15 of 20...
+ attempt=16
+ [ 16 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 16 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 16 of 20...
+ attempt=17
+ [ 17 -le 20 ]
+ cat /kb-podinfo/primary-pod
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 17 of 20...
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 17 of 20...
+ sleep 5
+ attempt=18
+ [ 18 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 18 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 18 of 20...
+ attempt=19
+ [ 19 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 19 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 19 of 20...
+ attempt=20
+ [ 20 -le 20 ]
+ cat /kb-podinfo/primary-pod
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 20 of 20...
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 20 of 20...
+ sleep 5
+ attempt=21
+ [ 21 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ primary=
+ echo DownwardAPI get primary=
+ echo KB_POD_NAME=redis-2q8t6-redis-0
Primary pod information not available. shutdown redis-server...
+ [ -z  ]
+ echo Primary pod information not available. shutdown redis-server...
+ [ ! -z  ]
+ redis-cli -h 127.0.0.1 -p 6379 shutdown
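
The trace above suggests that the start script polls `/kb-podinfo/primary-pod` up to 20 times, 5 seconds apart, and shuts redis-server down when the file stays empty. That would explain the clean exit (Exit Code 0, Reason Completed) followed by CrashLoopBackOff. The following is a rough reconstruction of that loop from the trace alone, not the actual `redis-start.sh`; `wait_for_primary` and its parameters are hypothetical names.

```shell
#!/bin/sh
# Hypothetical reconstruction of the wait loop seen in the trace.
# This is NOT the real redis-start.sh; names and defaults are illustrative.

# Poll a file until it has content, then print that content.
# $1: file to read, $2: max attempts, $3: seconds between attempts.
# Returns 1 if the file is still empty after the last attempt.
wait_for_primary() {
  primary_file=${1:-/kb-podinfo/primary-pod}
  max_attempts=${2:-20}
  sleep_seconds=${3:-5}
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    primary=$(cat "$primary_file" 2>/dev/null)
    if [ -n "$primary" ]; then
      printf '%s\n' "$primary"
      return 0
    fi
    echo "Waiting for primary pod information, attempt $attempt of $max_attempts..." >&2
    sleep "$sleep_seconds"
    attempt=$((attempt + 1))
  done
  return 1
}
```

For example, `wait_for_primary /kb-podinfo/primary-pod 20 5 || redis-cli -h 127.0.0.1 -p 6379 shutdown` would mirror the behaviour seen in the log: with the file permanently empty, the server is stopped after roughly 100 seconds and the kubelet restarts the container.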

➜ ~ k logs kubeblocks-86c89fcc67-zqwg4 -n kb-system >kblog.txt
Defaulted container "manager" out of: manager, tools (init), datascript (init)
kblog.txt

ahjing99 added the kind/bug label Sep 4, 2023
ahjing99 added this to the Release 0.7.0 milestone Sep 4, 2023
@linghan-hub

We hit the same problem. After successfully creating a Redis cluster, we connected to the database and then simply disconnected. The cluster status then became Updating and the pod status became CrashLoopBackOff.

  1. create cluster
---
# Source: redis-cluster/templates/rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kb-redis-cluster
  namespace: default
  labels:
    helm.sh/chart: redis-cluster-0.7.0-alpha.0
    app.kubernetes.io/version: "7.0.6"
    app.kubernetes.io/instance: redis-cluster
---
# Source: redis-cluster/templates/rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kb-redis-cluster
  labels:
    helm.sh/chart: redis-cluster-0.7.0-alpha.0
    app.kubernetes.io/version: "7.0.6"
    app.kubernetes.io/instance: redis-cluster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubeblocks-volume-protection-pod-role
subjects:
  - kind: ServiceAccount
    name: kb-redis-cluster
    namespace: default
---
# Source: redis-cluster/templates/rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kb-redis-cluster
  labels:
    helm.sh/chart: redis-cluster-0.7.0-alpha.0
    app.kubernetes.io/version: "7.0.6"
    app.kubernetes.io/instance: redis-cluster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubeblocks-cluster-pod-role
subjects:
  - kind: ServiceAccount
    name: kb-redis-cluster
    namespace: default
---
# Source: redis-cluster/templates/cluster.yaml
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  name: redis-cluster
  namespace: default
  labels: 
    helm.sh/chart: redis-cluster-0.7.0-alpha.0
    app.kubernetes.io/version: "7.0.6"
    app.kubernetes.io/instance: redis-cluster
spec:
  clusterVersionRef: redis-7.0.6
  terminationPolicy: Delete  
  affinity:
    podAntiAffinity: Preferred
    topologyKeys:
      - kubernetes.io/hostname
    tenancy: SharedNode
  clusterDefinitionRef: redis  # ref clusterDefinition.name
  componentSpecs:
    - name: redis
      componentDefRef: redis # ref clusterDefinition componentDefs.name      
      monitor: false      
      replicas: 1
      enabledLogs:
        - running
      serviceAccountName: kb-redis-cluster
      switchPolicy:
        type: Noop      
      resources:
        limits:
          cpu: "0.5"
          memory: "0.5Gi"
        requests:
          cpu: "0.5"
          memory: "0.5Gi"      
      volumeClaimTemplates:
        - name: data # ref clusterDefinition components.containers.volumeMounts.name
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi      
      services:
  2. Connect to the cluster
kbcli cluster connect redis-cluster
Connect to instance redis-cluster-redis-0
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> config get rdbcompression
1) "rdbcompression"
2) "yes"
127.0.0.1:6379> command terminated with exit code 137
➜  dbass git:(main) ✗ k get cluster
NAME            CLUSTER-DEFINITION   VERSION           TERMINATION-POLICY   STATUS     AGE
mysql-cluster   apecloud-mysql       ac-mysql-8.0.30   Delete               Running    59m
redis-cluster   redis                redis-7.0.6       Delete               Updating   2m3s
➜  dbass git:(main) ✗ k get pod
NAME                    READY   STATUS    RESTARTS      AGE
mysql-cluster-mysql-0   8/8     Running   0             59m
redis-cluster-redis-0   5/5     Running   1 (13s ago)   2m9s
➜  dbass git:(main) ✗ k get cluster
NAME            CLUSTER-DEFINITION   VERSION           TERMINATION-POLICY   STATUS     AGE
mysql-cluster   apecloud-mysql       ac-mysql-8.0.30   Delete               Running    61m
redis-cluster   redis                redis-7.0.6       Delete               Updating   3m41s
➜  dbass git:(main) ✗ k get pod
NAME                    READY   STATUS             RESTARTS     AGE
mysql-cluster-mysql-0   8/8     Running            0            61m
redis-cluster-redis-0   4/5     CrashLoopBackOff   1 (9s ago)   3m45s
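
A note on the `exit code 137` in the session above: by shell convention, an exit status above 128 means the process was terminated by a signal whose number is the status minus 128. A small sketch of that arithmetic; the function name `signal_from_exit_code` is made up for illustration:

```sh
#!/bin/sh
# Map a shell exit status to the signal number that terminated the process.
# Statuses of 128 or below do not indicate a signal, so 0 is printed for them.
signal_from_exit_code() {
  code="$1"
  if [ "$code" -gt 128 ]; then
    echo $((code - 128))
  else
    echo 0
  fi
}

signal_from_exit_code 137   # prints 9, i.e. SIGKILL
```

Signal 9 would be consistent with the interactive session being killed when the redis container was stopped underneath it, although the output alone does not prove that.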
  3. Check the pod details and logs
k describe pod redis-cluster-redis-0
Name:             redis-cluster-redis-0
Namespace:        default
Priority:         0
Service Account:  kb-redis-cluster
Node:             ip-10-0-3-15.us-west-2.compute.internal/10.0.3.15
Start Time:       Mon, 25 Sep 2023 15:49:35 +0800
Labels:           app.kubernetes.io/component=redis
                  app.kubernetes.io/instance=redis-cluster
                  app.kubernetes.io/managed-by=kubeblocks
                  app.kubernetes.io/name=redis
                  app.kubernetes.io/version=redis-7.0.6
                  apps.kubeblocks.io/component-name=redis
                  apps.kubeblocks.io/workload-type=Replication
                  controller-revision-hash=redis-cluster-redis-765b949b
                  kubeblocks.io/role=primary
                  rsm.workloads.kubeblocks.io/access-mode=ReadWrite
                  statefulset.kubernetes.io/pod-name=redis-cluster-redis-0
Annotations:      apps.kubeblocks.io/component-replicas: 1
                  apps.kubeblocks.io/last-role-snapshot-version: 2023-09-25T07:55:42.899242Z
Status:           Running
IP:               10.0.3.36
IPs:
  IP:           10.0.3.36
Controlled By:  StatefulSet/redis-cluster-redis
Init Containers:
  role-agent-installer:
    Container ID:  containerd://b90d16bf26fc9dfe171c45daba61b2b440c4e3000e9033496828da50763c8303
    Image:         msoap/shell2http:1.16.0
    Image ID:      docker.io/msoap/shell2http@sha256:a20bdde2f679de2cba6bf3d9f470489c7836d4d0d28232a2b295450809cd43ef
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
      /app/shell2http
      /role-probe/agent
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 25 Sep 2023 15:49:43 +0800
      Finished:     Mon, 25 Sep 2023 15:49:43 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /role-probe from role-agent (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-29l7t (ro)
Containers:
  redis:
    Container ID:  containerd://765d51960d9dc58aca819019a0257895864c850fd76bed995c7399799fe3ff47
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/redis-stack-server:7.0.6-RC8
    Image ID:      registry.cn-hangzhou.aliyuncs.com/apecloud/redis-stack-server@sha256:511808b267ab8d800283604ef5c01f4fe94792bfb746bb6dba236cc29ff5495b
    Port:          6379/TCP
    Host Port:     0/TCP
    Command:
      /scripts/redis-start.sh
    State:          Running
      Started:      Mon, 25 Sep 2023 15:55:41 +0800
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 25 Sep 2023 15:53:25 +0800
      Finished:     Mon, 25 Sep 2023 15:55:08 +0800
    Ready:          True
    Restart Count:  3
    Limits:
      cpu:     500m
      memory:  512Mi
    Requests:
      cpu:      500m
      memory:   512Mi
    Readiness:  exec [sh -c /scripts/redis-ping.sh 1] delay=10s timeout=1s period=5s #success=1 #failure=5
    Environment Variables from:
      redis-cluster-redis-env      ConfigMap  Optional: false
      redis-cluster-redis-rsm-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               redis-cluster-redis-0 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           redis-cluster
      KB_COMP_NAME:              redis
      KB_CLUSTER_COMP_NAME:      redis-cluster-redis
      KB_CLUSTER_UID_POSTFIX_8:  632eba18
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      REDIS_REPL_USER:           kbreplicator
      REDIS_REPL_PASSWORD:       <set to the key 'password' in secret 'redis-cluster-conn-credential'>  Optional: false
      REDIS_DEFAULT_PASSWORD:    <set to the key 'password' in secret 'redis-cluster-conn-credential'>  Optional: false
      REDIS_SENTINEL_USER:       $(REDIS_REPL_USER)-sentinel
      REDIS_SENTINEL_PASSWORD:   <set to the key 'password' in secret 'redis-cluster-conn-credential'>  Optional: false
      REDIS_ARGS:                --requirepass $(REDIS_PASSWORD)
    Mounts:
      /data from data (rw)
      /etc/conf from redis-config (rw)
      /etc/redis from redis-conf (rw)
      /kb-podinfo from pod-info (rw)
      /scripts from scripts (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-29l7t (ro)
  metrics:
    Container ID:  containerd://a9cf81c8ba177a6f3171fa4e20c691982ba72077a0603e654932e6df09d3ad79
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/agamotto:0.1.2-beta.1
    Image ID:      registry.cn-hangzhou.aliyuncs.com/apecloud/agamotto@sha256:cbab349b90490807a8d5039bf01bc7e37334f20c98c7dd75bc7fc4cf9e5b10ee
    Port:          9121/TCP
    Host Port:     0/TCP
    Command:
      /bin/agamotto
      --config=/opt/conf/metrics-config.yaml
    State:          Running
      Started:      Mon, 25 Sep 2023 15:49:44 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      redis-cluster-redis-env      ConfigMap  Optional: false
      redis-cluster-redis-rsm-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               redis-cluster-redis-0 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           redis-cluster
      KB_COMP_NAME:              redis
      KB_CLUSTER_COMP_NAME:      redis-cluster-redis
      KB_CLUSTER_UID_POSTFIX_8:  632eba18
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      ENDPOINT:                  localhost:6379
      REDIS_USER:                <set to the key 'username' in secret 'redis-cluster-conn-credential'>  Optional: false
      REDIS_PASSWORD:            <set to the key 'password' in secret 'redis-cluster-conn-credential'>  Optional: false
    Mounts:
      /opt/conf from redis-metrics-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-29l7t (ro)
  kb-we-syncer:
    Container ID:  containerd://ec89909de439a018bbf973f46619b078f1de32768b771e20bba92351587d7a43
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.7.0-alpha.16
    Image ID:      registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools@sha256:19ba1eb3d920a0c6e90aee6b5a54c544663c27889e5d29a79bc80e6656030297
    Ports:         3501/TCP, 50001/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      lorry
      --port
      3501
    State:          Running
      Started:      Mon, 25 Sep 2023 15:49:44 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Startup:   tcp-socket :3501 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      redis-cluster-redis-env      ConfigMap  Optional: false
      redis-cluster-redis-rsm-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:                redis-cluster-redis-0 (v1:metadata.name)
      KB_POD_UID:                  (v1:metadata.uid)
      KB_NAMESPACE:               default (v1:metadata.namespace)
      KB_SA_NAME:                  (v1:spec.serviceAccountName)
      KB_NODENAME:                 (v1:spec.nodeName)
      KB_HOST_IP:                  (v1:status.hostIP)
      KB_POD_IP:                   (v1:status.podIP)
      KB_POD_IPS:                  (v1:status.podIPs)
      KB_HOSTIP:                   (v1:status.hostIP)
      KB_PODIP:                    (v1:status.podIP)
      KB_PODIPS:                   (v1:status.podIPs)
      KB_CLUSTER_NAME:            redis-cluster
      KB_COMP_NAME:               redis
      KB_CLUSTER_COMP_NAME:       redis-cluster-redis
      KB_CLUSTER_UID_POSTFIX_8:   632eba18
      KB_POD_FQDN:                $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      KB_SERVICE_USER:            <set to the key 'username' in secret 'redis-cluster-conn-credential'>  Optional: false
      KB_SERVICE_PASSWORD:        <set to the key 'password' in secret 'redis-cluster-conn-credential'>  Optional: false
      KB_SERVICE_PORT:            6379
      KB_DATA_PATH:               /data
      KB_SERVICE_CHARACTER_TYPE:  redis
      KB_WORKLOAD_TYPE:           Replication
    Mounts:
      /data from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-29l7t (ro)
  action-0:
    Container ID:  containerd://ea8e8d886de6697a50f3ebb26819c17220d8b98fe364cf6e2f2a5f31c55ad70a
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/redis-stack-server:7.0.6-RC8
    Image ID:      registry.cn-hangzhou.aliyuncs.com/apecloud/redis-stack-server@sha256:511808b267ab8d800283604ef5c01f4fe94792bfb746bb6dba236cc29ff5495b
    Port:          <none>
    Host Port:     <none>
    Command:
      /role-probe/agent
      -port
      36501
      -export-all-vars
      -form
      /role
      Role=$(redis-cli --user $KB_RSM_USERNAME --pass $KB_RSM_PASSWORD --no-auth-warning info | grep role | awk -F ':' '{print $2}' | tr '[:upper:]' '[:lower:]' | tr -d '\r' | tr -d '
      ') && if [ "master" = "$Role" ]; then echo -n "primary"; else echo -n "secondary"; fi
    State:          Running
      Started:      Mon, 25 Sep 2023 15:49:44 +0800
    Ready:          True
    Restart Count:  0
    Environment:
      KB_RSM_USERNAME:  <set to the key 'username' in secret 'redis-cluster-conn-credential'>  Optional: false
      KB_RSM_PASSWORD:  <set to the key 'password' in secret 'redis-cluster-conn-credential'>  Optional: false
    Mounts:
      /role-probe from role-agent (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-29l7t (ro)
  kb-role-probe:
    Container ID:  containerd://7900dac506d8124a33a9e1b6f892288279663bf5f179a5dc6b8fa133a3d6df5c
    Image:         apecloud/kubeblocks-tools:latest
    Image ID:      docker.io/apecloud/kubeblocks-tools@sha256:ca0b5179d21fc74bf598966c9f62b22b070e95f6798bb2de7d533ba3fe2fca9d
    Port:          7373/TCP
    Host Port:     0/TCP
    Command:
      lorry
      --port
      7373
    State:          Running
      Started:      Mon, 25 Sep 2023 15:49:44 +0800
    Ready:          True
    Restart Count:  0
    Readiness:      http-get http://:7373/v1.0/bindings/custom%3Foperation=checkRole delay=0s timeout=1s period=2s #success=1 #failure=2
    Environment:
      KB_RSM_USERNAME:               <set to the key 'username' in secret 'redis-cluster-conn-credential'>  Optional: false
      KB_RSM_PASSWORD:               <set to the key 'password' in secret 'redis-cluster-conn-credential'>  Optional: false
      KB_RSM_ACTION_SVC_LIST:        [36501]
      KB_SERVICE_USER:               <set to the key 'username' in secret 'redis-cluster-conn-credential'>  Optional: false
      KB_SERVICE_PASSWORD:           <set to the key 'password' in secret 'redis-cluster-conn-credential'>  Optional: false
      KB_RSM_SERVICE_PORT:           6379
      KB_SERVICE_PORT:               6379
      KB_RSM_ROLE_UPDATE_MECHANISM:  DirectAPIServerEventUpdate
      KB_POD_NAME:                   redis-cluster-redis-0 (v1:metadata.name)
      KB_NAMESPACE:                  default (v1:metadata.namespace)
      KB_POD_UID:                     (v1:metadata.uid)
      KB_NODENAME:                    (v1:spec.nodeName)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-29l7t (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-redis-cluster-redis-0
    ReadOnly:   false
  pod-info:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.labels['kubeblocks.io/role'] -> pod-role
      metadata.annotations['rs.apps.kubeblocks.io/primary'] -> primary-pod
      metadata.annotations['apps.kubeblocks.io/component-replicas'] -> component-replicas
  redis-metrics-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      redis-cluster-redis-redis-metrics-config
    Optional:  false
  redis-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      redis-cluster-redis-redis-replication-config
    Optional:  false
  scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      redis-cluster-redis-redis-scripts
    Optional:  false
  redis-conf:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  role-agent:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-29l7t:
    Type:                     Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:   3607
    ConfigMapName:            kube-root-ca.crt
    ConfigMapOptional:        <nil>
    DownwardAPI:              true
QoS Class:                    Burstable
Node-Selectors:               <none>
Tolerations:                  kb-data=true:NoSchedule
                              node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                              node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Topology Spread Constraints:  kubernetes.io/hostname:ScheduleAnyway when max skew 1 is exceeded for selector app.kubernetes.io/instance=redis-cluster,apps.kubeblocks.io/component-name=redis
Events:
  Type     Reason                  Age                    From                     Message
  ----     ------                  ----                   ----                     -------
  Normal   Scheduled               6m50s                  default-scheduler        Successfully assigned default/redis-cluster-redis-0 to ip-10-0-3-15.us-west-2.compute.internal
  Normal   SuccessfulAttachVolume  6m49s                  attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-49e88858-1284-49c8-8cc6-3817acef3a06"
  Normal   Pulled                  6m42s                  kubelet                  Container image "msoap/shell2http:1.16.0" already present on machine
  Normal   Created                 6m42s                  kubelet                  Created container role-agent-installer
  Normal   Started                 6m42s                  kubelet                  Started container role-agent-installer
  Normal   Created                 6m41s                  kubelet                  Created container kb-role-probe
  Normal   Started                 6m41s                  kubelet                  Started container action-0
  Normal   Started                 6m41s                  kubelet                  Started container kb-role-probe
  Normal   Pulled                  6m41s                  kubelet                  Container image "registry.cn-hangzhou.aliyuncs.com/apecloud/agamotto:0.1.2-beta.1" already present on machine
  Normal   Created                 6m41s                  kubelet                  Created container metrics
  Normal   Started                 6m41s                  kubelet                  Started container metrics
  Normal   Pulled                  6m41s                  kubelet                  Container image "registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.7.0-alpha.16" already present on machine
  Normal   Created                 6m41s                  kubelet                  Created container kb-we-syncer
  Normal   Started                 6m41s                  kubelet                  Started container kb-we-syncer
  Normal   Pulled                  6m41s                  kubelet                  Container image "registry.cn-hangzhou.aliyuncs.com/apecloud/redis-stack-server:7.0.6-RC8" already present on machine
  Normal   Created                 6m41s                  kubelet                  Created container action-0
  Normal   Pulled                  6m41s                  kubelet                  Container image "apecloud/kubeblocks-tools:latest" already present on machine
  Normal   checkRole               6m39s                  sqlchannel               {"event":"Success","operation":"checkRole","originalRole":"","role":"primary"}
  Normal   Created                 4m58s (x2 over 6m41s)  kubelet                  Created container redis
  Normal   Pulled                  4m58s (x2 over 6m41s)  kubelet                  Container image "registry.cn-hangzhou.aliyuncs.com/apecloud/redis-stack-server:7.0.6-RC8" already present on machine
  Normal   Started                 4m58s (x2 over 6m41s)  kubelet                  Started container redis
  Warning  Unhealthy               3m18s                  kubelet                  Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: container is in CONTAINER_EXITED state
  Normal   checkRole               3m16s                  sqlchannel               {"event":"Success","operation":"checkRole","originalRole":"primary","role":"secondary"}
  Normal   checkRole               2m58s                  sqlchannel               {"event":"Success","operation":"checkRole","originalRole":"secondary","role":"primary"}
  Warning  BackOff                 77s (x4 over 3m17s)    kubelet                  Back-off restarting failed container
  Normal   checkRole               76s                    sqlchannel               {"event":"Success","operation":"checkRole","originalRole":"primary","role":"secondary"}
  Normal   checkRole               42s                    sqlchannel               {"event":"Success","operation":"checkRole","originalRole":"secondary","role":"primary"}
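
The `checkRole` events above show the reported role flipping between primary and secondary, which appears to track the restarts of the redis container. To pull just those transitions out of the event list, a small filter like the following may help; the function name `role_transitions` is hypothetical, and the pattern assumes the event message layout shown above:

```sh
#!/bin/sh
# Print "originalRole -> role" for every checkRole event line read from stdin.
# Lines that do not contain a checkRole message are dropped.
role_transitions() {
  sed -n 's/.*"operation":"checkRole","originalRole":"\([^"]*\)","role":"\([^"]*\)".*/\1 -> \2/p'
}

# Example usage against a live pod:
#   kubectl describe pod redis-cluster-redis-0 | role_transitions
```

Applied to the events above, this would list the initial assignment (empty original role to primary) followed by the alternating primary/secondary changes.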
➜  ~ git:(main) ✗ k get pod
NAME                    READY   STATUS    RESTARTS       AGE
mysql-cluster-mysql-0   8/8     Running   0              65m
redis-cluster-redis-0   5/5     Running   3 (104s ago)   7m21s
➜  ~ git:(main) ✗ k logs redis-cluster-redis-0
Defaulted container "redis" out of: redis, metrics, kb-we-syncer, action-0, kb-role-probe, role-agent-installer (init)
+ echo include /etc/conf/redis.conf
+ echo replica-announce-ip redis-cluster-redis-0.redis-cluster-redis-headless.default.svc
+ [ -f /data/users.acl ]
+ sed -i /user default on/d /data/users.acl
+ sed -i /user kbreplicator on/d /data/users.acl
+ sed -i /user kbreplicator-sentinel on/d /data/users.acl
+ [ ! -z dqr6jqvn ]
+ echo masteruser kbreplicator
+ echo masterauth dqr6jqvn
+ echo user kbreplicator on +psync +replconf +ping >dqr6jqvn
+ [ ! -z dqr6jqvn ]
+ echo user kbreplicator-sentinel on allchannels +multi +slaveof +ping +exec +subscribe +config|rewrite +role +publish +info +client|setname +client|kill +script|kill >dqr6jqvn
+ [ ! -z dqr6jqvn ]
+ echo protected-mode yes
+ echo user default on allcommands allkeys >dqr6jqvn
+ echo aclfile /data/users.acl
+ start_redis_server
+ exec redis-server /etc/redis/redis.conf --loadmodule /opt/redis-stack/lib/redisearch.so --loadmodule /opt/redis-stack/lib/redisgraph.so --loadmodule /opt/redis-stack/lib/redistimeseries.so+  --loadmodule /opt/redis-stack/lib/rejson.so --loadmodule /opt/redis-stack/lib/redisbloom.so
create_replication
+ [ ! -z dqr6jqvn ]
+ retry redis-cli -h 127.0.0.1 -p 6379 -a dqr6jqvn ping
+ local max_attempts=20
+ local attempt=1
+ redis-cli -h 127.0.0.1 -p 6379 -a dqr6jqvn ping
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 127.0.0.1:6379: Connection refused
+ [ 1 -eq 20 ]
+ echo Command 'redis-cli -h 127.0.0.1 -p 6379 -a dqr6jqvn ping' failed. Attempt 1 of 20. Retrying in 5 seconds...
+ attempt=2
+ sleep 3
Command 'redis-cli -h 127.0.0.1 -p 6379 -a dqr6jqvn ping' failed. Attempt 1 of 20. Retrying in 5 seconds...
+ redis-cli -h 127.0.0.1 -p 6379 -a dqr6jqvn ping
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
PONG
+ [ 2 -eq 20 ]
+ attempt=1
+ max_attempts=20
+ [ 1 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 1 of 20...
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 1 of 20...
+ sleep 5
+ attempt=2
+ [ 2 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 2 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 2 of 20...
+ attempt=3
+ [ 3 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 3 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 3 of 20...
+ attempt=4
+ [ 4 -le 20 ]
+ cat /kb-podinfo/primary-pod
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 4 of 20...
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 4 of 20...
+ sleep 5
+ attempt=5
+ [ 5 -le 20 ]
+ cat /kb-podinfo/primary-pod
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 5 of 20...
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 5 of 20...
+ sleep 5
+ attempt=6
+ [ 6 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 6 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 6 of 20...
+ attempt=7
+ [ 7 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 7 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 7 of 20...
+ attempt=8
+ [ 8 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 8 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 8 of 20...
+ attempt=9
+ [ 9 -le 20 ]
+ cat /kb-podinfo/primary-pod
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 9 of 20...
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 9 of 20...
+ sleep 5
+ attempt=10
+ [ 10 -le 20 ]
+ cat /kb-podinfo/primary-pod
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 10 of 20...
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 10 of 20...
+ sleep 5
+ attempt=11
+ [ 11 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 11 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 11 of 20...
+ attempt=12
+ [ 12 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 12 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 12 of 20...
+ attempt=13
+ [ 13 -le 20 ]
+ cat /kb-podinfo/primary-pod
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 13 of 20...
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 13 of 20...
+ sleep 5
+ attempt=14
+ [ 14 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 14 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 14 of 20...
+ attempt=15
+ [ 15 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 15 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 15 of 20...
+ attempt=16
+ [ 16 -le 20 ]
+ cat /kb-podinfo/primary-pod
+ [ -z  ]
+ echo Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 16 of 20...
+ sleep 5
Waiting for primary pod information from the DownwardAPI annotation to be available, attempt 16 of 20...
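
One detail in the log above: the `retry` helper prints "Retrying in 5 seconds..." while the trace shows `sleep 3`. A sketch of a retry helper in which the message and the delay come from the same variable, so the two cannot drift apart. This is written from the trace, not from the actual script, and `RETRY_DELAY` is a hypothetical variable:

```sh
#!/bin/sh
# Sketch of a retry helper modelled on the trace; not the real script.
# RETRY_DELAY is a hypothetical knob (the trace shows a hard-coded sleep of 3).
RETRY_DELAY="${RETRY_DELAY:-3}"

retry() {
  max_attempts=20
  attempt=1
  # Run the command until it succeeds or the attempts are used up.
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Command '$*' failed after $max_attempts attempts." >&2
      return 1
    fi
    echo "Command '$*' failed. Attempt $attempt of $max_attempts. Retrying in $RETRY_DELAY seconds..."
    attempt=$((attempt + 1))
    sleep "$RETRY_DELAY"
  done
}

# Example usage, mirroring the call in the log:
#   retry redis-cli -h 127.0.0.1 -p 6379 ping
```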
