Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix 'ceph_disk_occupation' query expressions #2812

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

aruniiird
Copy link
Contributor

@aruniiird aruniiird commented Sep 19, 2024

Need to address changes in 'ceph_disk_occupation' metric labels.

What is the change in 'ceph_disk_occupation' metric?
'ceph_disk_occupation' result no longer has 'exported_instance' label, instead it has 'instance' label.

What is the issue we are facing because of it?
We are hitting 'PrometheusRuleFailures' due to this new label change in our alerts / rules.
Second issue is that we are not seeing any results for some of the query expressions.

What is the solution?
Update the query expressions, change 'exported_instance' to 'instance'. Any 'label_replace' action which changes 'exported_instance' label to 'instance' label is no longer required (as the 'instance' label is directly available now)

Copy link
Contributor

openshift-ci bot commented Sep 19, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: aruniiird
Once this PR has been reviewed and has the lgtm label, please assign iamniting for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@aruniiird
Copy link
Contributor Author

@weirdwiz , @jmolmo , @umangachapagain please take a look.
Tested all the changed expressions in a live cluster with expected results (where as previous queries were either not giving any result or giving prometheus-exprn-evaluation-errors).

Need to address changes in 'ceph_disk_occupation' metric labels.

What is the change in 'ceph_disk_occupation' metric?
'ceph_disk_occupation' result no longer has 'exported_instance' label,
instead it has 'instance' label.

What is the issue we are facing because of it?
We are hitting 'PrometheusRuleFailures' due to this new label changes
in our alerts / rules, where this metric is used.
Second issue is that we are not seeing any results for some of the
query expressions.

What is the solution?
Update the query expressions, change 'exported_instance' to 'instance'.
Any 'label_replace' action which changes 'exported_instance' label to
'instance' label is no longer required (as the 'instance' label is
directly available now)

Signed-off-by: Arun Kumar Mohan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant