Available CRDs check feature #2712
Changes from all commits
@@ -4,6 +4,8 @@ import (
    "context"
    error1 "errors"
    "fmt"
    "os"
    "reflect"
    "strings"
    "time"
@@ -159,14 +161,24 @@ func (r *StorageClusterReconciler) Reconcile(ctx context.Context, request reconc
        return reconcile.Result{}, err
    }

    var err error
    availableCrds, err := statusutil.MapCRDAvailability(ctx, r.Client, statusutil.GetCrds()...)
    if err != nil {
        r.Log.Error(err, "error fetching the CRDs")
        return reconcile.Result{}, err
    }
    if !reflect.DeepEqual(availableCrds, r.AvailableCrds) {
        r.Log.Info("CustomResourceDefinitions created/deleted. Restarting process.")
        os.Exit(42)
    }
Comment on lines +170 to +173

Review comment: AvailableCrds is a map, and I am not sure there is an order guarantee on its keys, which means DeepEqual might fail even if all the keys and values are the same. I would suggest changing the check to maps.Equal.

Reply: I tried creating a map, adding keys in a different order, and comparing with DeepEqual, and it worked. maps.Equal produces the same results.
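A minimal standalone sketch of the point under discussion (not part of the PR): for Go maps, both reflect.DeepEqual and maps.Equal compare only keys and values, so insertion order does not matter. The map keys below are illustrative placeholders, not real CRD names.

```go
package main

import (
	"fmt"
	"maps"
	"reflect"
)

func main() {
	// Same keys and values, inserted in different orders.
	a := map[string]bool{"crd-a.example.io": true, "crd-b.example.io": false}
	b := map[string]bool{"crd-b.example.io": false, "crd-a.example.io": true}

	// Both comparisons ignore insertion order; they only compare content.
	fmt.Println(reflect.DeepEqual(a, b)) // true
	fmt.Println(maps.Equal(a, b))        // true (requires Go 1.21+)
}
```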
    if err := r.validateStorageClusterSpec(sc); err != nil {
        return reconcile.Result{}, err
    }

    r.IsNoobaaStandalone = sc.Spec.MultiCloudGateway != nil &&
        ReconcileStrategy(sc.Spec.MultiCloudGateway.ReconcileStrategy) == ReconcileStrategyStandalone

    var err error
    r.clusters, err = statusutil.GetClusters(ctx, r.Client)
    if err != nil {
        r.Log.Error(err, "Failed to get clusters")
Review comment: Where is the conditional watch that checks the CRD exists before adding the watch?

Reply: Added it in #2745.
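For illustration only, a hedged sketch of what such a conditional watch could look like inside SetupWithManager, reusing identifiers from this diff (r, mgr, scPredicate, enqueueStorageClusterRequest); the gated type and CRD name here are assumptions, not what #2745 actually does.

```go
// Sketch: only register a watch when the corresponding CRD is present, so
// the manager's cache never tries to list/watch a resource the API server
// does not serve. The CRD name below is an assumed example.
bldr := ctrl.NewControllerManagedBy(mgr).
	For(&ocsv1.StorageCluster{}, builder.WithPredicates(scPredicate))

if r.AvailableCrds["storageclients.ocs.openshift.io"] {
	bldr = bldr.Watches(&ocsclientv1a1.StorageClient{}, enqueueStorageClusterRequest)
}

return bldr.Complete(r)
```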
@@ -96,6 +96,7 @@ type StorageClusterReconciler struct {
    IsMultipleStorageClusters bool
    clusters *util.Clusters
    OperatorNamespace string
    AvailableCrds map[string]bool
}

// SetupWithManager sets up a controller with manager
@@ -211,6 +212,31 @@ func (r *StorageClusterReconciler) SetupWithManager(mgr ctrl.Manager) error {
        },
    }

    crdPredicate := predicate.Funcs{
        CreateFunc: func(e event.TypedCreateEvent[client.Object]) bool {
            crdAvailable, keyExist := r.AvailableCrds[e.Object.GetName()]
            if keyExist && !crdAvailable {
                r.Log.Info("CustomResourceDefinition %s was Created.", e.Object.GetName())
                return true
            }
            return false
        },
        DeleteFunc: func(e event.TypedDeleteEvent[client.Object]) bool {
            crdAvailable, keyExist := r.AvailableCrds[e.Object.GetName()]
            if keyExist && crdAvailable {
                r.Log.Info("CustomResourceDefinition %s was Deleted.", e.Object.GetName())
                return true
            }
            return false
        },
Comment on lines +224 to +231

Review comment: Is there a possibility that a CRD gets deleted within our operator's lifetime?

Reply: Although it is very unlikely, it is possible, and if that happens it will lead to a pod restart.
        UpdateFunc: func(e event.TypedUpdateEvent[client.Object]) bool {
            return false
        },
        GenericFunc: func(e event.TypedGenericEvent[client.Object]) bool {
            return false
        },
    }

    build := ctrl.NewControllerManagedBy(mgr).
        For(&ocsv1.StorageCluster{}, builder.WithPredicates(scPredicate)).
        Owns(&cephv1.CephCluster{}, builder.WithPredicates(cephClusterIgnoreTimeUpdatePredicate)).
@@ -228,6 +254,7 @@ func (r *StorageClusterReconciler) SetupWithManager(mgr ctrl.Manager) error {
        Owns(&corev1.Secret{}, builder.WithPredicates(predicate.GenerationChangedPredicate{})).
        Owns(&routev1.Route{}).
        Owns(&templatev1.Template{}).
        Watches(&extv1.CustomResourceDefinition{}, enqueueStorageClusterRequest, builder.WithPredicates(crdPredicate)).
Review comment: We already made a mistake (with Virt) by watching all CRDs, which increased the memory footprint; my mistake for not pushing hard on that fix, #2539. Seeing this PR, we don't want to watch all CRDs unless necessary. Combined with #2539, we could do something like the sketch below, since we are only interested in the metadata of CRDs while mapping them.
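The code block that originally accompanied this comment is not preserved in this extract; the following is a hedged reconstruction of the idea using controller-runtime's metadata-only watch support, reusing crdPredicate and enqueueStorageClusterRequest from this file. It is a sketch of the suggestion, not the PR's code.

```go
// Sketch only: watch CRDs as PartialObjectMetadata so the cache stores just
// object metadata instead of full CRD schemas.
// Assumes metav1 = "k8s.io/apimachinery/pkg/apis/meta/v1".
crdMeta := &metav1.PartialObjectMetadata{}
crdMeta.SetGroupVersionKind(extv1.SchemeGroupVersion.WithKind("CustomResourceDefinition"))

// ...then, in the builder chain above, the typed CRD watch could become:
//     Watches(crdMeta, enqueueStorageClusterRequest, builder.WithPredicates(crdPredicate)).
// The same PartialObjectMetadata type could likely be used for the Get call
// inside MapCRDAvailability as well.
```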
Reply: I looked at this and couldn't find a way to specify multiple names in the cache using a field selector, but I have updated it to use PartialObjectMetadata here: #2745.
        Watches(&storagev1.StorageClass{}, enqueueStorageClusterRequest).
        Watches(&volumesnapshotv1.VolumeSnapshotClass{}, enqueueStorageClusterRequest).
        Watches(&ocsclientv1a1.StorageClient{}, enqueueStorageClusterRequest).
@@ -11,6 +11,7 @@ import (
    ocsv1 "github.com/red-hat-storage/ocs-operator/api/v4/v1"
    corev1 "k8s.io/api/core/v1"
    storagev1 "k8s.io/api/storage/v1"
    apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
    "k8s.io/apimachinery/pkg/labels"
    "k8s.io/apimachinery/pkg/types"
    "sigs.k8s.io/controller-runtime/pkg/client"
@@ -44,6 +45,10 @@ const (
    OwnerUIDIndexName = "ownerUID"
)

func GetCrds() []string {
    return []string{}
}

Review comment: Is my understanding correct that in upcoming PRs we register the CRD names in this function and conditionally watch them in SetupWithManager?

Reply: Added it with #2745.
// GetWatchNamespace returns the namespace the operator should be watching for changes | ||
func GetWatchNamespace() (string, error) { | ||
ns, found := os.LookupEnv(WatchNamespaceEnvVar) | ||
|
@@ -149,3 +154,16 @@ func GenerateNameForNonResilientCephBlockPoolSC(initData *ocsv1.StorageCluster)
    }
    return fmt.Sprintf("%s-ceph-non-resilient-rbd", initData.Name)
}
func MapCRDAvailability(ctx context.Context, clnt client.Client, crdNames ...string) (map[string]bool, error) {
    crdExist := map[string]bool{}
    for _, crdName := range crdNames {
        crd := &apiextensionsv1.CustomResourceDefinition{}
        crd.Name = crdName
        if err := clnt.Get(ctx, client.ObjectKeyFromObject(crd), crd); client.IgnoreNotFound(err) != nil {
            return nil, fmt.Errorf("error getting CRD, %v", err)
        }
        crdExist[crdName] = crd.UID != ""
    }
    return crdExist, nil
}
@@ -0,0 +1,11 @@
#!/bin/bash

RESTART_EXIT_CODE=42

while true; do
    ./usr/local/bin/ocs-operator --enable-leader-election --health-probe-bind-address=:8081
Review comment: We are effectively making the operator a sub-process rather than PID 1, and I'm not confident enough in how the shell handles signals compared to controller-runtime. I can see a good amount of comments on this PR already; in brief, may I know why we backed off from letting the kubelet restart us?
    EXIT_CODE=$?
    if [ $EXIT_CODE -ne $RESTART_EXIT_CODE ]; then
        exit $EXIT_CODE
    fi
done
Review comment: There is a missing newline here and git is complaining.

Reply: Done.
Review comment: Can we please have the exit code in a constant with a name that explains its usage? In addition, a comment about this line would help.

Reply: Added it with #2745.
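A minimal sketch of what the requested named constant might look like on the Go side; the constant name and placement are assumptions, not the actual change made in #2745.

```go
// RestartExitCode is returned by the operator when the set of available
// CustomResourceDefinitions has changed; the entrypoint wrapper script
// restarts the process (instead of letting the pod exit) when it sees it.
const RestartExitCode = 42

// Usage in Reconcile, replacing the magic number 42 (assumed shape):
//   if !reflect.DeepEqual(availableCrds, r.AvailableCrds) {
//       r.Log.Info("CustomResourceDefinitions created/deleted. Restarting process.")
//       os.Exit(RestartExitCode)
//   }
```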