Broker restart resets counters to 0 #913

Merged
12 changes: 11 additions & 1 deletion modules/manage/partials/monitor-health.adoc
@@ -2,7 +2,17 @@

This section provides guidelines and example queries using Redpanda's public metrics to optimize your system's performance and monitor its health.

TIP: To help detect and mitigate anomalous system behaviors, capture baseline metrics of your healthy system at different stages (at start-up, under high load, in steady state) so you can set thresholds and alerts according to those baselines.
To help detect and mitigate anomalous system behaviors, capture baseline metrics of your healthy system at different stages (at start-up, under high load, in steady state) so you can set thresholds and alerts according to those baselines.

[TIP]
====
For counter type metrics, a broker restart causes the count in tools such as Prometheus and Grafana to reset to zero. Redpanda recommends wrapping counter metrics in a rate query to account for broker restarts, for example:

[,promql]
----
rate(redpanda_kafka_records_produced_total[5m])
----
====
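
To illustrate why the tip above recommends a rate query, here is a minimal Python sketch (not part of the changed file) of reset-aware counter arithmetic. It treats any drop in the counter as a restart, which is the general idea behind how PromQL's `rate()` adjusts for counter resets. The sample values and function names are hypothetical, and the sketch omits the window-boundary extrapolation that Prometheus itself performs.

```python
from typing import Sequence, Tuple

# A sample is (timestamp in seconds, counter value). Values are hypothetical.
Sample = Tuple[float, float]


def reset_aware_increase(samples: Sequence[Sample]) -> float:
    """Sum the increase across samples, treating any drop as a counter reset."""
    total = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev
        else:
            # The counter went backwards: assume the broker restarted and the
            # counter started again from zero, so the increase is curr itself.
            total += curr
    return total


def reset_aware_rate(samples: Sequence[Sample]) -> float:
    """Average per-second increase over the window covered by the samples."""
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return 0.0
    return reset_aware_increase(samples) / elapsed


# A broker restart between t=120 and t=180 resets the counter.
samples = [(0, 100), (60, 160), (120, 220), (180, 30), (240, 90)]

naive_delta = samples[-1][1] - samples[0][1]  # last minus first, ignores the reset
increase = reset_aware_increase(samples)
rate = reset_aware_rate(samples)
```

Here the naive last-minus-first delta is negative, while the reset-aware increase still reflects the records counted on both sides of the restart. This is why a dashboard built on the raw counter shows a drop to zero, but one built on a rate query does not.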

=== Redpanda architecture

8 changes: 2 additions & 6 deletions modules/manage/partials/monitor-redpanda.adoc
@@ -1,8 +1,6 @@
Redpanda exports metrics through Prometheus endpoints for you to monitor system health and optimize system performance.
Redpanda exports metrics through two Prometheus endpoints on the Admin API port (default: 9644) for you to monitor system health and optimize system performance.

Contributor Author

@wzzzrd86 @masapanda is it correct that we call out "Prometheus" here?

Yes I think that is correct

Deflaimun (Contributor), Dec 16, 2024

I think you solved this already, but just for the future: I don't think calling them "Prometheus endpoints" is accurate. They're HTTP endpoints, served through the Admin API, that return a response body in Prometheus format.

A Redpanda broker exports public metrics from the xref:reference:public-metrics-reference.adoc[`/public_metrics`] endpoint through the Admin API port (default: 9644).

Before v22.2, a Redpanda broker provided metrics only through the xref:reference:internal-metrics-reference.adoc[`/metrics`] endpoint through the Admin API port. While Redpanda still provides this endpoint, it includes many internal metrics that are unnecessary for a typical Redpanda user to monitor. Consequently, the `/public_metrics` endpoint was added to provide a smaller set of important metrics that can be queried and ingested more quickly and inexpensively. The `/metrics` endpoint is now referred to as the 'internal metrics' endpoint, and Redpanda recommends that you use it for development, testing, and analysis.
The xref:reference:internal-metrics-reference.adoc[`/metrics`] is a legacy endpoint that includes many internal metrics that are unnecessary for a typical Redpanda user to monitor. The `/metrics` endpoint is also referred to as the 'internal metrics' endpoint, and Redpanda recommends that you use it for development, testing, and analysis. Alternatively, the xref:reference:public-metrics-reference.adoc[`/public_metrics`] endpoint provides a smaller set of important metrics that can be queried and ingested more quickly and inexpensively.
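
As an illustration of how a client might read the `/public_metrics` endpoint described above, here is a minimal Python sketch (not part of the changed file). It assumes a broker reachable at `http://localhost:9644`, which is a hypothetical address using the default Admin API port. The sample body, including its help text and label names, is invented for the example and is not real broker output. The parser handles only simple `series value` lines with no trailing timestamp.

```python
from typing import Dict
from urllib.request import urlopen


def fetch_metrics_text(base_url: str = "http://localhost:9644") -> str:
    """Fetch the raw metrics body over HTTP. The base URL is an assumption."""
    with urlopen(f"{base_url}/public_metrics", timeout=5) as response:
        return response.read().decode("utf-8")


def parse_samples(text: str) -> Dict[str, float]:
    """Map each series (metric name plus labels) to its current value.

    Simplified: skips comment lines and assumes no timestamp after the value.
    """
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a # HELP / # TYPE comment
        series, value = line.rsplit(None, 1)  # the value is the last token
        samples[series] = float(value)
    return samples


# Hypothetical response body, used so the sketch runs without a broker.
SAMPLE_BODY = """\
# HELP redpanda_kafka_records_produced_total Total number of records produced
# TYPE redpanda_kafka_records_produced_total counter
redpanda_kafka_records_produced_total{redpanda_topic="orders"} 1234
redpanda_kafka_records_produced_total{redpanda_topic="payments"} 56
"""

parsed = parse_samples(SAMPLE_BODY)
```

Against a running broker, `parse_samples(fetch_metrics_text())` would apply the same parsing to a live response. In practice a Prometheus server scrapes the endpoint, so hand parsing like this is mainly useful for quick checks.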

include::shared:partial$metrics-usage-tip.adoc[]

@@ -25,8 +23,6 @@ This topic covers the following about monitoring Redpanda metrics:

https://prometheus.io/[Prometheus^] is a system monitoring and alerting tool. It collects and stores metrics as time-series data identified by a metric name and key/value pairs.

NOTE: Redpanda Data recommends creating monitoring dashboards with `/public_metrics`.

ifdef::env-kubernetes[]

To configure Prometheus to monitor Redpanda metrics in Kubernetes, you can use the https://prometheus-operator.dev/[Prometheus Operator^]:
2 changes: 1 addition & 1 deletion modules/shared/partials/metrics-usage-tip.adoc
@@ -1,6 +1,6 @@
[TIP]
====
Use xref:reference:public-metrics-reference.adoc[/public_metrics] for your primary dashboards for system health.
Use xref:reference:public-metrics-reference.adoc[/public_metrics] for your primary dashboards for monitoring system health.

Use xref:reference:internal-metrics-reference.adoc[/metrics] for detailed analysis and debugging.
====