
Existing Backstage operand not upgraded (stuck on mounting a ConfigMap) after upgrading operator from 1.1.x to 1.2.x #382

Closed
rm3l opened this issue Jun 7, 2024 · 0 comments · Fixed by #384
Labels
jira (Issue will be sync'ed to Red Hat JIRA), kind/bug (Categorizes issue or PR as related to a bug)

Comments


rm3l commented Jun 7, 2024

/kind bug

What did you do exactly?

  • From the operator repo, switch to the 1.1.x branch and deploy the RHDH 1.1 operator. The IMG arg is set explicitly because the branch's default image (quay.io/janus-idp/operator:0.1.3) has expired and no longer exists on quay.io:
git switch 1.1.x
make deploy IMG=quay.io/rhdh/rhdh-rhel9-operator:1.1
kubectl apply -f examples/bs1.yaml
  • Wait a few seconds until all the resources are created

  • Check the Backstage Custom Resource status. Reason should be DeployOK:

$ kubectl describe backstage bs1
Name:         bs1
Namespace:    my-ns
Labels:       <none>
Annotations:  <none>
API Version:  rhdh.redhat.com/v1alpha1
Kind:         Backstage
Metadata:
  Creation Timestamp:  2024-06-07T12:21:58Z
  Generation:          1
  Resource Version:    48634
  UID:                 e4e9766f-6c32-4b44-85cd-1eac93b56f16
Status:
  Conditions:
    Last Transition Time:  2024-06-07T12:21:58Z
    Message:
    Reason:                DeployOK
    Status:                True
    Type:                  Deployed
Events:                    <none>
  • Switch to the 1.2.x branch and deploy the upcoming 1.2 operator
git switch 1.2.x
make deploy
  • Wait a few seconds until the new version of the operator pod is running and the existing CR is reconciled again. Then check the CR status again. The Reason will be DeployFailed, with an error:
$ kubectl describe backstage bs1
Name:         bs1
Namespace:    my-ns
Labels:       <none>
Annotations:  <none>
API Version:  rhdh.redhat.com/v1alpha1
Kind:         Backstage
Metadata:
  Creation Timestamp:  2024-06-07T13:44:15Z
  Generation:          1
  Resource Version:    2846
  UID:                 5cdad2eb-0840-4c8e-8c2b-08be46e3856a
Status:
  Conditions:
    Last Transition Time:  2024-06-07T13:49:06Z
    Message:               failed to apply backstage objects failed to patch object &Service{ObjectMeta:{backstage-psql-bs1  my-ns   2100 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[app.kubernetes.io/instance:bs1 app.kubernetes.io/name:backstage rhdh.redhat.com/app:backstage-psql-bs1] map[] [{rhdh.redhat.com/v1alpha1 Backstage bs1 5cdad2eb-0840-4c8e-8c2b-08be46e3856a 0xc000590f55 0xc000590f54}] [] []},Spec:ServiceSpec{Ports:[]ServicePort{ServicePort{Name:,Protocol:,Port:5432,TargetPort:{0 0 },NodePort:0,AppProtocol:nil,},},Selector:map[string]string{rhdh.redhat.com/app: backstage-psql-bs1,},ClusterIP:None,Type:,ExternalIPs:[],SessionAffinity:,LoadBalancerIP:,LoadBalancerSourceRanges:[],ExternalName:,ExternalTrafficPolicy:,HealthCheckNodePort:0,PublishNotReadyAddresses:false,SessionAffinityConfig:nil,IPFamilyPolicy:nil,ClusterIPs:[],IPFamilies:[],AllocateLoadBalancerNodePorts:nil,LoadBalancerClass:nil,InternalTrafficPolicy:nil,},Status:ServiceStatus{LoadBalancer:LoadBalancerStatus{Ingress:[]LoadBalancerIngress{},},Conditions:[]Condition{},},}: failed to patch object *v1.Service: Service "backstage-psql-bs1" is invalid: spec.clusterIPs[0]: Invalid value: []string{"None"}: may not change once set
    Reason:                DeployFailed
    Status:                False
    Type:                  Deployed
Events:                    <none>

Actual behavior

It seems the existing CR could not be reconciled successfully by the new version of the operator, because the operator was unable to patch the existing database Service object.
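
The key part of the error is the API server rejecting a change to spec.clusterIPs, which may not change once a Service has been created. The database Service is headless (ClusterIP: None), and the object the operator tries to patch (see the dump in the status above) carries ClusterIP: None but an empty ClusterIPs list, whereas the live Service presumably has clusterIPs populated by the API server. As a quick check (illustrative only, output will vary), the live values can be inspected with:

$ kubectl get service backstage-psql-bs1 -n my-ns -o jsonpath='{.spec.clusterIP} {.spec.clusterIPs}{"\n"}'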

If we take a look at the resources, a new Backstage pod is in the process of being created, but is stuck trying to mount a ConfigMap (which could not be created because of the failure to patch the DB Service):

$ kubectl get pod

NAME                             READY   STATUS     RESTARTS   AGE
backstage-psql-bs1-0             1/1     Running    0          9m22s
backstage-bs1-655f659ddc-n7grw   1/1     Running    0          9m22s
backstage-bs1-6469fdd48f-ldq5h   0/1     Init:0/1   0          4m31s

$ kubectl describe pod backstage-bs1-6469fdd48f-ldq5h

[...]
Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  5m18s                 default-scheduler  0/1 nodes are available: waiting for ephemeral volume controller to create the persistentvolumeclaim "backstage-bs1-6469fdd48f-ldq5h-dynamic-plugins-root". preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod..
  Normal   Scheduled         5m11s                 default-scheduler  Successfully assigned my-ns/backstage-bs1-6469fdd48f-ldq5h to k3d-k3s-default-server-0
  Warning  FailedMount       3m9s                  kubelet            Unable to attach or mount volumes: unmounted volumes=[backstage-appconfig-bs1], unattached volumes=[backstage-appconfig-bs1 dynamic-plugins-root dynamic-plugins-npmrc backstage-dynamic-plugins-bs1]: timed out waiting for the condition
  Warning  FailedMount       62s (x10 over 5m12s)  kubelet            MountVolume.SetUp failed for volume "backstage-appconfig-bs1" : configmap "backstage-appconfig-bs1" not found
  Warning  FailedMount       51s                   kubelet            Unable to attach or mount volumes: unmounted volumes=[backstage-appconfig-bs1], unattached volumes=[dynamic-plugins-root dynamic-plugins-npmrc backstage-dynamic-plugins-bs1 backstage-appconfig-bs1]: timed out waiting for the condition
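
As a sanity check (nothing operator-specific here), the ConfigMap the pod is waiting for can be queried directly; it is expected to return a NotFound error, consistent with the FailedMount events above, since reconciliation aborted before it could be created:

$ kubectl get configmap backstage-appconfig-bs1 -n my-ns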

Note that the same issue happens when upgrading from the operator channels on OpenShift (https://github.com/janus-idp/operator/blob/main/.rhdh/docs/installing-ci-builds.adoc).

Expected behavior

CR reconciliation should be successful, and the application should be upgraded.
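
For context, one possible direction for such a fix is sketched below. This is purely illustrative (it is not necessarily what #384 does, and the package/function names are made up): when re-applying the database Service during reconciliation, carry over the networking fields that the API server forbids changing after creation, so the update no longer attempts to modify spec.clusterIPs.

// Hypothetical sketch using controller-runtime; not the actual operator code.
package servicepatch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// applyDatabaseService creates the desired headless Service if it does not
// exist yet; otherwise it updates the existing one while preserving the
// immutable networking fields from the live object.
func applyDatabaseService(ctx context.Context, c client.Client, desired *corev1.Service) error {
	var live corev1.Service
	if err := c.Get(ctx, client.ObjectKeyFromObject(desired), &live); err != nil {
		if client.IgnoreNotFound(err) != nil {
			return err
		}
		// Service does not exist yet: create it as-is.
		return c.Create(ctx, desired)
	}
	// These fields may not change once set; copy them from the live Service
	// so the API server accepts the update.
	desired.ResourceVersion = live.ResourceVersion
	desired.Spec.ClusterIP = live.Spec.ClusterIP
	desired.Spec.ClusterIPs = live.Spec.ClusterIPs
	desired.Spec.IPFamilies = live.Spec.IPFamilies
	desired.Spec.IPFamilyPolicy = live.Spec.IPFamilyPolicy
	return c.Update(ctx, desired)
}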

@openshift-ci openshift-ci bot added the kind/bug label Jun 7, 2024
@github-actions github-actions bot added the jira label Jun 7, 2024
@rm3l rm3l self-assigned this Jun 11, 2024