Cannot set tolerations/nodeSelector for temporary arangodb-cluster-id pod #110

Closed
ddelange opened this issue Oct 10, 2022 · 9 comments

ddelange commented Oct 10, 2022

Hi! 👋

I'm trying to deploy an ArangoDeployment to a mixed amd64/arm64 cluster, where the arm64 nodes have a NoSchedule taint.

When I boot a fresh cluster with spec.architecture: [arm64], a temporary arangodb-cluster-id-xxx pod is created which has a hard NodeAffinity for arm64 but no way to add tolerations/nodeSelector. This means the pod won't be scheduled on any node, and the cluster boot sequence hangs.

When I change to spec.architecture: [amd64, arm64], the temporary pod is allowed to schedule on an amd64 node in the cluster, and the rest of the pods get the toleration for arm64 and so schedule on the arm64 nodes (we also have a PreferNoSchedule taint on the amd64 nodes). This is an acceptable workaround for now, but I'd rather specify spec.architecture: [arm64] and get a successful boot.
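
For reference, a trimmed sketch of the kind of spec I mean (the per-group toleration fields are how I understand the ArangoDeployment CRD, and the group names/mode here are a sketch rather than a verbatim copy of our setup; the arch=arm64 taint key is our own):

apiVersion: database.arangodb.com/v1
kind: ArangoDeployment
metadata:
  name: arangodb-cluster
spec:
  mode: Cluster
  architecture:
    - arm64
  # every server group we can configure gets a toleration for our opt-in arm64 taint;
  # the temporary id pod is the only one that cannot be configured this way
  dbservers:          # prmr
    tolerations:
      - key: arch
        operator: Equal
        value: arm64
        effect: NoSchedule
  agents:
    tolerations:
      - key: arch
        operator: Equal
        value: arm64
        effect: NoSchedule
  coordinators:
    tolerations:
      - key: arch
        operator: Equal
        value: arm64
        effect: NoSchedule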

I deleted the ArangoDeployment from above (but with arm64 in the spec) and am now trying to re-create it.

The operator creates the following id Pod, with arm64 nodeAffinity but without arm64 tolerations. We have tolerations in the ArangoDeployment for controller, agent, and prmr, because we have a mixed amd64/arm64 cluster and all arm64 nodes are tainted so that they are opt-in. Now, because of the hard nodeAffinity but missing tolerations on the id Pod, our ArangoDeployment won't boot at all 😅

arangodb-cluster-id-162f0e

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2022-09-23T07:48:45Z"
  labels:
    app: arangodb
    arango_deployment: arangodb-cluster
    deployment.arangodb.com/member: 162f0e
    role: id
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:app: {}
          f:arango_deployment: {}
          f:deployment.arangodb.com/member: {}
          f:role: {}
        f:ownerReferences:
          .: {}
          k:{"uid":"6054f633-c606-4f75-b7ee-a28430e1b952"}: {}
      f:spec:
        f:affinity:
          .: {}
          f:nodeAffinity:
            .: {}
            f:requiredDuringSchedulingIgnoredDuringExecution: {}
          f:podAntiAffinity:
            .: {}
            f:preferredDuringSchedulingIgnoredDuringExecution: {}
        f:containers:
          k:{"name":"server"}:
            .: {}
            f:command: {}
            f:image: {}
            f:imagePullPolicy: {}
            f:name: {}
            f:ports:
              .: {}
              k:{"containerPort":8529,"protocol":"TCP"}:
                .: {}
                f:containerPort: {}
                f:name: {}
                f:protocol: {}
            f:resources: {}
            f:securityContext:
              .: {}
              f:capabilities:
                .: {}
                f:drop: {}
            f:terminationMessagePath: {}
            f:terminationMessagePolicy: {}
            f:volumeMounts:
              .: {}
              k:{"mountPath":"/data"}:
                .: {}
                f:mountPath: {}
                f:name: {}
        f:dnsPolicy: {}
        f:enableServiceLinks: {}
        f:hostname: {}
        f:restartPolicy: {}
        f:schedulerName: {}
        f:securityContext: {}
        f:subdomain: {}
        f:terminationGracePeriodSeconds: {}
        f:tolerations: {}
        f:volumes:
          .: {}
          k:{"name":"arangod-data"}:
            .: {}
            f:emptyDir: {}
            f:name: {}
    manager: arangodb_operator
    operation: Update
    time: "2022-09-23T07:48:45Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
          .: {}
          k:{"type":"PodScheduled"}:
            .: {}
            f:lastProbeTime: {}
            f:lastTransitionTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
    manager: kube-scheduler
    operation: Update
    subresource: status
    time: "2022-09-23T07:48:45Z"
  name: arangodb-cluster-id-162f0e
  namespace: aa-data-api
  ownerReferences:
  - apiVersion: database.arangodb.com/v1
    controller: true
    kind: ArangoDeployment
    name: arangodb-cluster
    uid: 6054f633-c606-4f75-b7ee-a28430e1b952
  resourceVersion: "109818868"
  uid: cf969e3d-6bc1-4395-a774-070b6b629c1e
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/arch
            operator: In
            values:
            - arm64
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchLabels:
              app: arangodb
              arango_deployment: arangodb-cluster
              role: id
          topologyKey: kubernetes.io/hostname
        weight: 1
  containers:
  - command:
    - /usr/sbin/arangod
    - --server.authentication=false
    - --server.endpoint=tcp://[::]:8529
    - --database.directory=/data
    - --log.output=+
    image: arangodb/arangodb-preview:3.10.0-beta.1
    imagePullPolicy: IfNotPresent
    name: server
    ports:
    - containerPort: 8529
      name: server
      protocol: TCP
    resources: {}
    securityContext:
      capabilities:
        drop:
        - ALL
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /data
      name: arangod-data
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-mzlkl
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostname: arangodb-cluster-id-162f0e
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  subdomain: arangodb-cluster-int
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 5
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 5
  - effect: NoExecute
    key: node.alpha.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 5
  volumes:
  - emptyDir: {}
    name: arangod-data
  - name: kube-api-access-mzlkl
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-09-23T07:48:45Z"
    message: '0/22 nodes are available: 17 node(s) had taint {arch: arm64}, that the
      pod didn''t tolerate, 2 node(s) didn''t match Pod''s node affinity/selector,
      3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn''t
      tolerate.'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: BestEffort

Originally posted by @ddelange in #53 (comment)

ddelange (Author) commented:

Correction: the cluster boots successfully with my dual-architecture setup, but I think that's also a bug:

With that setup, the prmr pods end up with only one nodeAffinity term, and so there will still be no pods going onto the arm64 nodes:

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/arch
            operator: In
            values:
            - amd64
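
For comparison, a nodeAffinity that would let those pods land on either architecture could look like this (just a sketch of plain Kubernetes nodeAffinity, not something the operator currently generates):

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/arch
            operator: In
            values:
            - amd64
            - arm64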

ddelange commented Oct 10, 2022

When I switch the order in spec.architecture, we're back at the original issue, where the temporary pod doesn't schedule. It seems that only the first entry in spec.architecture is respected:

0/13 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 1 node(s) were unschedulable, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 8 node(s) had taint {arch: arm64}, that the pod didn't tolerate.
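
In other words, these two otherwise identical specs behave differently (sketch):

# boots, but only the first entry seems to be respected, so everything stays on amd64
spec:
  architecture:
    - amd64
    - arm64

# id pod gets arm64 affinity but no arm64 toleration, so it never schedules
spec:
  architecture:
    - arm64
    - amd64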

ddelange (Author) commented:

FWIW, in the original implementation there was an error "Only one architecture type is supported currently", but that seems to have disappeared before/during/after merging that PR.

ddelange (Author) commented:

So I now hacked our toleration

  - effect: NoSchedule
    key: arch
    operator: Equal
    value: arm64

into the spec of the temporary pod, and the cluster has now successfully booted on arm64.
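
i.e. the tolerations list on the id pod ends up roughly like this (the first three entries are the defaults from the dump above; the last one is the one added by hand):

tolerations:
- effect: NoExecute
  key: node.kubernetes.io/not-ready
  operator: Exists
  tolerationSeconds: 5
- effect: NoExecute
  key: node.kubernetes.io/unreachable
  operator: Exists
  tolerationSeconds: 5
- effect: NoExecute
  key: node.alpha.kubernetes.io/unreachable
  operator: Exists
  tolerationSeconds: 5
# added by hand so the id pod tolerates our opt-in arm64 taint
- effect: NoSchedule
  key: arch
  operator: Equal
  value: arm64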

ddelange (Author) commented:

when i switch the order in spec.architecture, we're back at the original issue, where the temporary pod doesn't schedule. it seems that only the first entry in spec.architecture is respected:

0/13 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 1 node(s) were unschedulable, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 8 node(s) had taint {arch: arm64}, that the pod didn't tolerate.

opened arangodb/kube-arangodb#1140

ddelange (Author) commented:

cc @jwierzbo from #53 (comment)

I guess this issue should live in kube-arangodb actually 😅 should I close and re-open there?

dothebart (Contributor) commented:

Yes, I think as long as our Docker container doesn't do anything wrong on this topic, this issue doesn't belong here and should be discussed in the http://github.com/arangodb/kube-arangodb repo.
The multiarch support of the container should be working properly as of ArangoDB 3.10, right?

ddelange (Author) commented:

Yes, all green so far! I'll re-open there.

ddelange (Author) commented:

closing in favor of arangodb/kube-arangodb#1141

ddelange closed this as not planned (won't fix, can't repro, duplicate, stale) on Oct 11, 2022