I want the same alert (alert rule) to fire again only after 5 minutes. Currently I receive the same alert (alert rule) every minute for the same '{{ $value }}'.
If the threshold is crossed and the value changes, multiple alerts fire for the same alert rule, and that is fine. But with the same '{{ $value }}', the alert should fire again only after 5 minutes: the same alert rule with the same value should not fire again within the next 5 minutes. How can I achieve this?
Even when the application is not down, it sends alerts every minute. How can I debug this? I am using the expression below: alert: "Instance Down", expr: up == 0
What are for, keep_firing_for, and evaluation_interval used for?
prometheus.yml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - ip:port
rule_files:
  - "alerts_rules.yml"
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["ip:port"]
alerts_rules.yml
groups:
  - name: instance_alerts
    rules:
      - alert: "Instance Down"
        expr: up == 0
        # for: 30s
        # keep_firing_for: 30s
        labels:
          severity: "Critical"
        annotations:
          summary: "Endpoint {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 30 sec."
  - name: rabbitmq_alerts
    rules:
      - alert: "Consumer down for last 1 min"
        expr: rabbitmq_queue_consumers == 0
        # for: 1m
        # keep_firing_for: 30s
        labels:
          severity: Critical
        annotations:
          summary: "shortify | '{{ $labels.queue }}' has no consumers"
          description: "The queue '{{ $labels.queue }}' in vhost '{{ $labels.vhost }}' has zero consumers for more than 30 sec. Immediate attention is required."
      - alert: "Total Messages > 10k in last 1 min"
        expr: rabbitmq_queue_messages > 10000
        # for: 1m
        # keep_firing_for: 30s
        labels:
          severity: Critical
        annotations:
          summary: "'{{ $labels.queue }}' has total '{{ $value }}' messages for more than 1 min."
          description: |
            Queue {{ $labels.queue }} in RabbitMQ has total {{ $value }} messages for more than 1 min.
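For context on the three settings asked about: in Prometheus, evaluation_interval is how often the rules are evaluated, for is how long the expression must stay true before a pending alert starts firing, and keep_firing_for is how long the alert keeps firing after the expression stops being true. With for commented out, as in the rules above, an alert fires on the first evaluation where the expression is true. How often a still-firing alert is sent again is normally decided by Alertmanager rather than by the rule file. The fragment below is only a sketch of what the route section of alertmanager.yml might look like if the goal is one repeat every 5 minutes; that file is not shown in this issue, so the receiver name `default`, the `group_by` labels, and all interval values are assumptions.

```yaml
# alertmanager.yml (hypothetical sketch; the real file is not shown in this issue)
route:
  receiver: default                    # hypothetical receiver name
  group_by: ["alertname", "instance"]  # assumed grouping labels
  group_wait: 30s        # wait before the first notification for a new group
  group_interval: 5m     # wait before notifying about changes to an existing group
  repeat_interval: 5m    # wait before re-sending a notification that is still firing
receivers:
  - name: default        # attach a real integration here (email, webhook, ...)
```

If something like this is tried, `amtool check-config alertmanager.yml` and `promtool check rules alerts_rules.yml` can validate the two files before reloading. Whether these values stop the one-minute repeats depends on what the current alertmanager.yml contains.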
amolngt changed the title from "prometheus alerting - alerts are getting fire after every minute" to "Alerts are getting fire after every minute" on Feb 13, 2025