-
Notifications
You must be signed in to change notification settings - Fork 151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: address comments for monitoring component #1520
Conversation
/test opendatahub-operator-e2e |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1520 +/- ##
==========================================
- Coverage 19.69% 19.63% -0.07%
==========================================
Files 161 161
Lines 11102 11137 +35
==========================================
Hits 2187 2187
- Misses 8683 8718 +35
Partials 232 232 ☔ View full report in Codecov by Sentry. |
@zdtsw Can you help also link the PR comment here? So that anyone trying to understand the changes can go back and read the context? |
updated in description. |
odhdeploy "github.com/opendatahub-io/opendatahub-operator/v2/pkg/deploy" | ||
) | ||
|
||
var ( | ||
ComponentName = serviceApi.MonitoringServiceName | ||
prometheusConfigPath = filepath.Join(odhdeploy.DefaultManifestPath, ComponentName, "prometheus", "apps", "prometheus-configs.yaml") | ||
ReadyConditionType = conditionsv1.ConditionType(serviceApi.MonitoringKind + status.ReadySuffix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for service/component, the ready condition should just be Ready
, the suffix is only needed if the service/component specific readiness condition is exposed in an higher level API (i.e. DSC/DSCI)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
if len(promDeployment.Items) == 1 && ready == 1 { | ||
// TODO: deprecate phase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the if
before should either be rewritten like len(promDeployment.Items) == ready
or there are more than the expected number of deployment should be reported as part of the failure condition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
nc := metav1.Condition{ | ||
Type: string(ReadyConditionType), | ||
Status: metav1.ConditionFalse, | ||
Reason: status.ReconcileInit, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if ReconcileInit
makes any sense to be honest, so if the deployment won't get up and running, then we would leave the reason to ReconcileInit
forever
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated not usre if it is good to use
Type: string(ReadyConditionType),
Status: metav1.ConditionFalse,
Reason: status.PhaseNotReady,
Message: "Prometheus deployment is not ready",
- more if to switch...case - add .status.condition.MonitoringReady type Signed-off-by: Wen Zhou <[email protected]>
- change Monitoring CR .status.condition Reason and Message and Type name - remove unused predicate var - change check on promethus deployment ready Signed-off-by: Wen Zhou <[email protected]>
@@ -501,7 +501,10 @@ func (r *DSCInitializationReconciler) configureMonitoring(ctx context.Context, d | |||
) | |||
|
|||
if dsci.Spec.Monitoring.ManagementState == operatorv1.Managed { | |||
err := r.Create(ctx, defaultMonitoring) | |||
// for generic case if we need to support configable monitoring namespace | |||
_, err := controllerutil.CreateOrUpdate(ctx, r.Client, defaultMonitoring, func() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can probably replace it with server side apply r.Apply(..)
Signed-off-by: Wen Zhou <[email protected]>
err := r.Create(ctx, defaultMonitoring) | ||
if err != nil && !k8serr.IsAlreadyExists(err) { | ||
// for generic case if we need to support configable monitoring namespace | ||
if err := r.Apply(ctx, defaultMonitoring); err != nil && !k8serr.IsAlreadyExists(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setting field owner is required, something like
opendatahub-operator/controllers/datasciencecluster/datasciencecluster_controller.go
Lines 243 to 250 in 6a762ad
err := ctrl.SetControllerReference(instance, componentCR, r.Scheme) | |
if err != nil { | |
return nil, err | |
} | |
err = r.Client.Apply(ctx, componentCR, client.FieldOwner(fieldOwner), client.ForceOwnership) | |
if err != nil { | |
return nil, err | |
} |
} | ||
} else { | ||
falseVal := false | ||
updated = &falseVal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can use pointer.Bool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the meta.IsStatusConditionTrue(ci.GetStatus().Conditions, status.ConditionTypeReady) == true
can also be moved to a case
} | ||
|
||
// Check for shared components | ||
if ch.GetName() == componentApi.KserveComponentName || ch.GetName() == componentApi.ModelMeshServingComponentName { | ||
if err := UpdatePrometheusConfig(ctx, enabled, componentRules[componentApi.ModelControllerComponentName]); err != nil { | ||
if err := UpdatePrometheusConfig(ctx, *updated, componentRules[componentApi.ModelControllerComponentName]); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this result in a panic due to a nil ptr ?
return err | ||
} | ||
} | ||
|
||
if err := UpdatePrometheusConfig(ctx, enabled, componentRules[ch.GetName()]); err != nil { | ||
if err := UpdatePrometheusConfig(ctx, *updated, componentRules[ch.GetName()]); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this result in a panic due to a nil ptr ?
} | ||
|
||
// Check for shared components | ||
if ch.GetName() == componentApi.KserveComponentName || ch.GetName() == componentApi.ModelMeshServingComponentName { | ||
if err := UpdatePrometheusConfig(ctx, enabled, componentRules[componentApi.ModelControllerComponentName]); err != nil { | ||
if err := UpdatePrometheusConfig(ctx, *updated, componentRules[componentApi.ModelControllerComponentName]); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess here you have to check also for model controller readiness ?
trueVal := true | ||
updated = &trueVal | ||
|
||
if err := UpdatePrometheusConfig(ctx, *updated, componentRules[ch.GetName()]); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like you don't need the updated
var since it is always true
if !dsc.Status.InstalledComponents["model-mesh"] && !dsc.Status.InstalledComponents["kserve"] { | ||
falseVal := false | ||
updated = &falseVal | ||
if err := UpdatePrometheusConfig(ctx, *updated, componentRules[ch.GetName()]); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like you don't need the updated
var since it is always false
- update on modelcontroller - add filed manager for monitoring CR to DSCI Signed-off-by: Wen Zhou <[email protected]>
- remove duplicated function from pkg/service/montiroing - add isComponentReady() Signed-off-by: Wen Zhou <[email protected]>
Signed-off-by: Wen Zhou <[email protected]>
- remove duplicated monitoring predicates from DSCI controller - update predicate for monitoring on DSC change on both .spec.components and .status.components Signed-off-by: Wen Zhou <[email protected]>
Signed-off-by: Wen Zhou <[email protected]>
} | ||
|
||
// compare type one by one with their status if not equal return true | ||
for _, nc := range newConditions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can probably made a little bit more efficient, but ok for now
if err != nil && !k8serr.IsAlreadyExists(err) { | ||
// for generic case if we need to support configable monitoring namespace | ||
// set filed manager to DSCI | ||
if err := r.Apply(ctx, defaultMonitoring, client.FieldOwner("dscinitialization.opendatahub.io"), client.ForceOwnership); err != nil && !k8serr.IsAlreadyExists(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: there is a finalizerName
const
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: lburgazzoli The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Description
How Has This Been Tested?
Screenshot or short clip
Merge criteria