Commit

Call out counter reset on broker restart plus minor page edits
kbatuigas committed Dec 11, 2024
1 parent ff00f47 commit c9e9f20
Showing 3 changed files with 14 additions and 8 deletions.
12 changes: 11 additions & 1 deletion modules/manage/partials/monitor-health.adoc
@@ -2,7 +2,17 @@

This section provides guidelines and example queries using Redpanda's public metrics to optimize your system's performance and monitor its health.

-TIP: To help detect and mitigate anomalous system behaviors, capture baseline metrics of your healthy system at different stages (at start-up, under high load, in steady state) so you can set thresholds and alerts according to those baselines.
+To help detect and mitigate anomalous system behaviors, capture baseline metrics of your healthy system at different stages (at start-up, under high load, in steady state) so you can set thresholds and alerts according to those baselines.
+
+[TIP]
+====
+For counter type metrics, a broker restart causes the count in tools such as Prometheus and Grafana to reset to zero. Redpanda recommends wrapping counter metrics in a rate query to account for broker restarts, for example:
+[,promql]
+----
+rate(redpanda_kafka_records_produced_total[5m])
+----
+====

=== Redpanda architecture

8 changes: 2 additions & 6 deletions modules/manage/partials/monitor-redpanda.adoc
@@ -1,8 +1,6 @@
-Redpanda exports metrics through Prometheus endpoints for you to monitor system health and optimize system performance.
+Redpanda exports metrics through two Prometheus endpoints on the Admin API port (default: 9644) for you to monitor system health and optimize system performance.

-A Redpanda broker exports public metrics from the xref:reference:public-metrics-reference.adoc[`/public_metrics`] endpoint through the Admin API port (default: 9644).
-
-Before v22.2, a Redpanda broker provided metrics only through the xref:reference:internal-metrics-reference.adoc[`/metrics`] endpoint through the Admin API port. While Redpanda still provides this endpoint, it includes many internal metrics that are unnecessary for a typical Redpanda user to monitor. Consequently, the `/public_metrics` endpoint was added to provide a smaller set of important metrics that can be queried and ingested more quickly and inexpensively. The `/metrics` endpoint is now referred to as the 'internal metrics' endpoint, and Redpanda recommends that you use it for development, testing, and analysis.
+The xref:reference:internal-metrics-reference.adoc[`/metrics`] is a legacy endpoint that includes many internal metrics that are unnecessary for a typical Redpanda user to monitor. The `/metrics` endpoint is also referred to as the 'internal metrics' endpoint, and Redpanda recommends that you use it for development, testing, and analysis. Alternatively, the xref:reference:public-metrics-reference.adoc[`/public_metrics`] endpoint provides a smaller set of important metrics that can be queried and ingested more quickly and inexpensively.

include::shared:partial$metrics-usage-tip.adoc[]

@@ -25,8 +23,6 @@ This topic covers the following about monitoring Redpanda metrics:

https://prometheus.io/[Prometheus^] is a system monitoring and alerting tool. It collects and stores metrics as time-series data identified by a metric name and key/value pairs.

-NOTE: Redpanda Data recommends creating monitoring dashboards with `/public_metrics`.
-
ifdef::env-kubernetes[]

To configure Prometheus to monitor Redpanda metrics in Kubernetes, you can use the https://prometheus-operator.dev/[Prometheus Operator^]:
2 changes: 1 addition & 1 deletion modules/shared/partials/metrics-usage-tip.adoc
@@ -1,6 +1,6 @@
[TIP]
====
-Use xref:reference:public-metrics-reference.adoc[/public_metrics] for your primary dashboards for system health.
+Use xref:reference:public-metrics-reference.adoc[/public_metrics] for your primary dashboards for monitoring system health.
Use xref:reference:internal-metrics-reference.adoc[/metrics] for detailed analysis and debugging.
====
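
The counter-reset tip added to `monitor-health.adoc` can be illustrated with a small sketch. The Python below is a simplified model of reset-aware counter arithmetic, not Prometheus's actual implementation: it omits the extrapolation to window boundaries that the real `rate()` performs, and the function names and sample values are hypothetical.

```python
from typing import Sequence, Tuple

# One scraped sample: (unix timestamp in seconds, counter value).
Sample = Tuple[float, float]


def reset_aware_increase(samples: Sequence[Sample]) -> float:
    """Sum the increase across consecutive samples, treating any drop as a reset."""
    total = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev
        else:
            # The value went down, so assume the broker restarted and the
            # counter began again at zero: everything in `curr` is new.
            total += curr
    return total


def simple_rate(samples: Sequence[Sample]) -> float:
    """Per-second rate over the sampled window (no boundary extrapolation)."""
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return 0.0
    return reset_aware_increase(samples) / elapsed


# Hypothetical scrapes 60 s apart; the broker restarts between t=120 and t=180.
samples = [(0, 100), (60, 160), (120, 220), (180, 30), (240, 90)]
naive_delta = samples[-1][1] - samples[0][1]  # -10: misleading raw difference
increase = reset_aware_increase(samples)      # 210.0: 60 + 60 + 30 + 60
```

Reading the raw counter across the restart suggests that the count fell by 10, while the reset-aware sum recovers 210 records over 240 seconds. This is the reason a rate query is a safer basis for dashboards and alerts than the raw counter value.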

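
For the endpoint description changed in `monitor-redpanda.adoc`, the following is a minimal sketch of reading `/public_metrics` directly, without Prometheus in between. It assumes a broker reachable over plain HTTP at a hypothetical `localhost` address on the default Admin API port 9644 mentioned above, and it assumes sample lines carry no trailing timestamp. The helper names are illustrative and are not part of any Redpanda client library.

```python
from urllib.request import urlopen


def fetch_public_metrics(host: str = "localhost", port: int = 9644) -> str:
    """Return the raw text served by the /public_metrics endpoint.

    Assumes plain HTTP; a broker with TLS enabled on the Admin API
    would need https and the appropriate certificates.
    """
    with urlopen(f"http://{host}:{port}/public_metrics", timeout=5) as resp:
        return resp.read().decode("utf-8")


def sum_metric(text: str, name: str) -> float:
    """Sum one metric across all of its label sets in exposition-format text."""
    total = 0.0
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Split on the last space so label values containing spaces survive.
        series, _, value = line.rpartition(" ")
        if series.split("{", 1)[0] == name:
            total += float(value)
    return total


# Hypothetical payload; the label names and values are illustrative only.
SAMPLE = """\
# TYPE redpanda_kafka_records_produced_total counter
redpanda_kafka_records_produced_total{topic="orders"} 1200
redpanda_kafka_records_produced_total{topic="payments"} 300
"""

produced = sum_metric(SAMPLE, "redpanda_kafka_records_produced_total")  # 1500.0
```

Against a live broker, `sum_metric(fetch_public_metrics(), "redpanda_kafka_records_produced_total")` would return the current counter total, which resets to zero on a broker restart as the new tip describes.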