Wrong alert rules defined for latency #43

Open
wei-lee opened this issue Nov 27, 2020 · 6 comments


wei-lee commented Nov 27, 2020

At the moment, the generated alert rule for latency looks something like this:

latencytarget:http_request_duration_seconds:rate1h{job="prometheus",latency="0.10000000000000001"} > (14.4*1.000000)

The corresponding recording rule for this alert is something like this:

1 - (
  sum(rate(http_request_duration_seconds_bucket{job="prometheus",le="0.10000000000000001",code!~"5.."}[1h]))
  /
  sum(rate(http_request_duration_seconds_count{job="prometheus"}[1h]))
)

which means the value of this recording rule can never be bigger than 1, while the alert threshold evaluates to 14.4 (14.4*1.000000). That means the alert will never fire.

If my understanding is correct, we should either multiply the recording rule by 100, or change 1 to 0.01 in the alerts.
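
To make the arithmetic concrete, here is a minimal Python sketch of the comparison the alert performs. The helper names `slow_ratio` and `alert_fires` are hypothetical and only mirror the two expressions above; the budget of 0.01 is the value suggested in this comment, not something confirmed as the intended configuration.

```python
# Multiplier copied from the generated alert expression: > (14.4 * <budget>)
BURN_RATE = 14.4


def slow_ratio(fast_rate: float, total_rate: float) -> float:
    """Mirror of the recording rule: 1 - (requests within the latency target / all requests)."""
    return 1 - fast_rate / total_rate


def alert_fires(ratio: float, latency_budget: float) -> bool:
    """Mirror of the alert expression: ratio > 14.4 * latency_budget."""
    return ratio > BURN_RATE * latency_budget


# Worst case: no request at all meets the latency target, so the ratio is 1.0.
worst_case = slow_ratio(fast_rate=0.0, total_rate=100.0)

# With the templated budget of 1.0 the threshold is 14.4, which a ratio <= 1 cannot exceed.
fires_with_budget_1 = alert_fires(worst_case, latency_budget=1.0)

# With an assumed budget of 0.01 the threshold becomes 0.144, which is reachable.
fires_with_budget_001 = alert_fires(worst_case, latency_budget=0.01)

print(worst_case, fires_with_budget_1, fires_with_budget_001)
```

Multiplying the recording rule by 100 instead would have the same effect under this sketch, since it scales both sides of the comparison by the same factor.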

Contributor

brancz commented Dec 29, 2020

I think you might have misconfigured the latencyBudget, as the value in the alerting rule is already templated.


rporres commented Jan 13, 2021

Then maybe the problem is in the code generating the rules at https://promtools.dev/alerts/latency, as that is what @wei-lee used to generate the alerts. @metalmatze should know better 😄

@metalmatze
Owner

Yes, that's probably a problem with promtools.dev itself.
If you want to take a look and find the problem, the code is here: https://github.com/metalmatze/promtools.dev/blob/master/main.go
Otherwise I can fix it as part of another update I have already been working on anyway :)


rporres commented Apr 28, 2021

The problem does indeed look related to promtools.dev and the latencyBudget, as pointed out by @brancz. There's a proposed fix in metalmatze/promtools.dev#13


rporres commented May 10, 2021

This can be safely closed.


tbuchier commented Jul 6, 2021

The issue is still present on https://promtools.dev/alerts/latency
