
spec.serviceName field in local DB StatefulSet operand is hardcoded and not correct #370

Closed
rm3l opened this issue May 29, 2024 · 1 comment
Labels: jira (Issue will be sync'ed to Red Hat JIRA), kind/bug (Categorizes issue or PR as related to a bug.)

Comments

rm3l (Member) commented May 29, 2024

/kind bug

Follow-up issue discovered while working on #369

What versions of software are you using?

  • Operator on current main branch: 5b0aeaa

What did you run exactly?

  • Run the operator against a cluster
make install run
kubectl apply -f examples/bs1.yaml
  • Inspect the local DB StatefulSet and the Services created by the operator
kubectl get statefulset --selector=app.kubernetes.io/name=backstage,app.kubernetes.io/instance=bs1 -o yaml | yq '.items[0].spec'

kubectl get service --selector=app.kubernetes.io/name=backstage,app.kubernetes.io/instance=bs1

Actual behavior

  • Here are the services created by the operator:
$ kubectl get service --selector=app.kubernetes.io/name=backstage,app.kubernetes.io/instance=bs1                              
NAME               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
backstage-bs1      ClusterIP   172.31.39.19    <none>        80/TCP     6m36s
backstage-db-bs1   ClusterIP   172.31.20.189   <none>        5432/TCP   6m36s
  • And the spec of the DB StatefulSet, where we can see that spec.serviceName is hardcoded to backstage-psql-cr1-hl (and spec.template.metadata.name is also hardcoded to backstage-db-cr1, but this might not be an issue per se). A quick check for the referenced Service is sketched right after the spec below.
StatefulSet spec
persistentVolumeClaimRetentionPolicy:
  whenDeleted: Retain
  whenScaled: Retain
podManagementPolicy: OrderedReady
replicas: 1
revisionHistoryLimit: 10
selector:
  matchLabels:
    rhdh.redhat.com/app: backstage-db-bs1
serviceName: backstage-psql-cr1-hl
template:
  metadata:
    creationTimestamp: null
    labels:
      rhdh.redhat.com/app: backstage-db-bs1
    name: backstage-db-cr1
  spec:
    automountServiceAccountToken: false
    containers:
      - env:
          - name: POSTGRESQL_PORT_NUMBER
            value: "5432"
          - name: POSTGRESQL_VOLUME_DIR
            value: /var/lib/pgsql/data
          - name: PGDATA
            value: /var/lib/pgsql/data/userdata
        envFrom:
          - secretRef:
              name: backstage-db-bs1
        image: quay.io/fedora/postgresql-15:latest
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
              - /bin/sh
              - -c
              - exec pg_isready -U ${POSTGRES_USER} -h 127.0.0.1 -p 5432
          failureThreshold: 6
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: postgresql
        ports:
          - containerPort: 5432
            name: tcp-postgresql
            protocol: TCP
        readinessProbe:
          exec:
            command:
              - /bin/sh
              - -c
              - -e
              - |
                exec pg_isready -U ${POSTGRES_USER} -h 127.0.0.1 -p 5432
          failureThreshold: 6
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        resources:
          limits:
            cpu: 250m
            ephemeral-storage: 20Mi
            memory: 1Gi
          requests:
            cpu: 250m
            memory: 256Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
          runAsGroup: 0
          runAsNonRoot: true
          seccompProfile:
            type: RuntimeDefault
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
          - mountPath: /dev/shm
            name: dshm
          - mountPath: /var/lib/pgsql/data
            name: data
    dnsPolicy: ClusterFirst
    restartPolicy: Always
    schedulerName: default-scheduler
    securityContext: {}
    serviceAccount: default
    serviceAccountName: default
    terminationGracePeriodSeconds: 30
    volumes:
      - emptyDir:
          medium: Memory
        name: dshm
updateStrategy:
  rollingUpdate:
    partition: 0
  type: RollingUpdate
volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      creationTimestamp: null
      name: data
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
      volumeMode: Filesystem
    status:
      phase: Pending
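
For completeness, the Service named in spec.serviceName does not exist in the namespace at all. A quick check (output reproduced for illustration only, assuming nothing else created a Service with that name) confirms this:

$ kubectl get service backstage-psql-cr1-hl
Error from server (NotFound): services "backstage-psql-cr1-hl" not found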

Expected behavior

Per the official K8s docs, a Headless Service is required for StatefulSets and must exist:

StatefulSets currently require a Headless Service to be responsible for the network identity of the Pods. You are responsible for creating this Service

Not sure why K8s does not return an error for this non-existent Service, but the expected behavior would be:

  • spec.serviceName should not be hardcoded. Instead, it can have a value like backstage-psql-<cr_name>-hl (and this headless Service should be created by the operator as well); see the sketch after this list
  • spec.template.metadata.name should not be hardcoded. Instead, it can have a value like backstage-psql-<cr_name>
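
For illustration only, such a headless Service could look like the sketch below; the name and the rhdh.redhat.com/app selector follow the StatefulSet spec shown above, and the field values are assumptions rather than the operator's actual output:

apiVersion: v1
kind: Service
metadata:
  name: backstage-psql-bs1-hl    # i.e. backstage-psql-<cr_name>-hl for a CR named bs1
  labels:
    app.kubernetes.io/name: backstage
    app.kubernetes.io/instance: bs1
spec:
  clusterIP: None                # headless: no virtual IP, DNS resolves directly to the Pod IPs
  selector:
    rhdh.redhat.com/app: backstage-db-bs1
  ports:
    - name: tcp-postgresql
      port: 5432
      targetPort: tcp-postgresql

With spec.serviceName set to the same value, each DB Pod would get a stable DNS name of the form <pod-name>.backstage-psql-bs1-hl.<namespace>.svc, which is what makes a multi-replica StatefulSet addressable per Pod.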

Any logs, error output, etc?

openshift-ci bot added the kind/bug (Categorizes issue or PR as related to a bug.) label on May 29, 2024
github-actions bot added the jira (Issue will be sync'ed to Red Hat JIRA) label on May 29, 2024
rm3l changed the title from "spec.serviceName field in local DB StatefulSet spec is hardcoded and not correct" to "spec.serviceName field in local DB StatefulSet operand is hardcoded and not correct" on May 29, 2024
gazarenkov (Member) commented May 30, 2024

Our StatefulSet has never worked with a headless Service, which means only a 1-replica configuration would work correctly.
Indeed, to make it work correctly with more than one Pod/PV, it has to be a headless Service instead of a load-balancing one.

I would not qualify it as a bug though; it is rather a limitation of the local database setup (not recommended for production) and, again, it has never worked with a headless Service yet.

spec.serviceName should not be hardcoded. Instead, it can have a value like backstage-psql-<cr_name>-hl (and this headless service should be created by the operator as well)

I do not see why we need both an "-hl" and a "non-hl" Service here. If it works with headless, let's keep only that one (backstage-psql-<cr_name> as the serviceName).

spec.template.metadata.name should not be hardcoded. Instead, it can have a value like backstage-psql-<cr_name>

Unfortunately, it looks like the StatefulSet does not respect it (otherwise, there would have been no need to rename the StatefulSet). The template for Pod names is still <statefulset name>-<ordinal>. We can just remove it.
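
For reference, StatefulSet Pod names always follow the <statefulset name>-<ordinal> pattern regardless of spec.template.metadata.name. A check like the one below (label taken from the StatefulSet selector above) would show Pods named <statefulset-name>-0, <statefulset-name>-1, and so on; the template name backstage-db-cr1 never appears:

kubectl get pods -l rhdh.redhat.com/app=backstage-db-bs1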
