TLDR
Telegraf's prometheusremotewrite data format parser should parse a Prometheus native histogram into one single Telegraf metric (instead of multiple Telegraf metrics), and its serializer should be able to serialize it back into a Prometheus native histogram. Design at the end.
Use Case
We are on a Prometheus stack and are planning to use Telegraf on the data ingestion path (for some aggregation). This is a simplified view of our design.
Some of our metrics are native histograms, a new histogram model introduced by Prometheus. Rather than being emitted as several series (_sum, _count, and many _bucket series with le labels), a native histogram is encoded as a single protobuf struct and emitted as one time series. This not only guarantees atomicity, which resolves the write-batching problem present in classic histograms, but also offers better resolution and query accuracy at a lower cost.
I PoC-ed a simple Telegraf ingest and output (aggregation logic not added yet) and put it in our ingestion path. Native histogram metrics are only available in the protobuf exposition format, so the prometheusremotewrite data format seems the right choice. Important configs are:
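For readers unfamiliar with the model, here is a minimal sketch of how exponential bucket boundaries are derived from a native histogram's schema. It covers only positive buckets of the standard exponential schemas; the function names are made up for illustration and are not part of Prometheus or Telegraf.

```python
import math


def bucket_upper_bound(index: int, schema: int) -> float:
    """Upper bound of the positive bucket `index` for the given schema.

    The growth factor is base = 2^(2^-schema); bucket `index` covers
    the range (base^(index-1), base^index].
    """
    base = 2.0 ** (2.0 ** -schema)
    return base ** index


def bucket_index(value: float, schema: int) -> int:
    """Index of the positive bucket that holds `value` (must be > 0)."""
    if value <= 0:
        raise ValueError("only positive observations are handled in this sketch")
    # ceil(log_base(value)) rewritten in terms of log2 to avoid computing base.
    return math.ceil(math.log2(value) * 2 ** schema)
```

A higher schema means a smaller growth factor and therefore finer buckets, which is where the better resolution comes from.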
http_listener_v2 prometheusremotewrite format: a native histogram should be ingested and parsed into one single Telegraf metric, not breaking its atomicity.
http output prometheusremotewrite format: a native histogram should be written out, if a native histogram is ingested.
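As a rough illustration of the two plugins involved, a PoC configuration might look like the fragment below. The listen address, path, and output URL are placeholders assumed for this sketch, not values from our setup.

```toml
# Ingest: accept Prometheus remote write requests.
[[inputs.http_listener_v2]]
  service_address = ":1234"          # placeholder address
  paths = ["/receive"]               # placeholder path
  data_format = "prometheusremotewrite"

# Output: forward metrics as Prometheus remote write.
[[outputs.http]]
  url = "https://example.invalid/api/v1/push"   # placeholder URL
  data_format = "prometheusremotewrite"
  [outputs.http.headers]
    Content-Type = "application/x-protobuf"
    Content-Encoding = "snappy"
    X-Prometheus-Remote-Write-Version = "0.1.0"
```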
Expected behavior
How exactly a native histogram metric should be parsed into one single Telegraf metric (data representation) is worth a design, so that it is:
easy to write aggregators for (Starlark etc.)
(even better) amenable to existing processors
(even better) reusable for OpenMetrics exponential histogram support
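To make the design question concrete, here is one possible flat representation, sketched in Python rather than Go for brevity. Everything in it is an assumption for illustration: the `NativeHistogram` type, the `pos_<index>` / `neg_<index>` field naming, and the use of absolute per-bucket counts are hypothetical and are not the design proposed in this issue or anything Telegraf implements today.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NativeHistogram:
    """Hypothetical in-memory view of one native histogram sample."""
    schema: int
    zero_threshold: float
    zero_count: int
    count: int
    sum: float
    positive: Dict[int, int] = field(default_factory=dict)  # bucket index -> count
    negative: Dict[int, int] = field(default_factory=dict)  # bucket index -> count


def to_fields(h: NativeHistogram) -> Dict[str, float]:
    """Flatten the histogram into the fields of ONE metric (hypothetical naming)."""
    fields = {
        "schema": h.schema,
        "zero_threshold": h.zero_threshold,
        "zero_count": h.zero_count,
        "count": h.count,
        "sum": h.sum,
    }
    for index, c in h.positive.items():
        fields[f"pos_{index}"] = c
    for index, c in h.negative.items():
        fields[f"neg_{index}"] = c
    return fields


def from_fields(fields: Dict[str, float]) -> NativeHistogram:
    """Rebuild the histogram from the flat fields, as a serializer would need to."""
    positive = {int(k[4:]): v for k, v in fields.items() if k.startswith("pos_")}
    negative = {int(k[4:]): v for k, v in fields.items() if k.startswith("neg_")}
    return NativeHistogram(
        schema=fields["schema"],
        zero_threshold=fields["zero_threshold"],
        zero_count=fields["zero_count"],
        count=fields["count"],
        sum=fields["sum"],
        positive=positive,
        negative=negative,
    )


def merge(a: NativeHistogram, b: NativeHistogram) -> NativeHistogram:
    """Aggregate two histograms; this sketch only handles equal schemas."""
    if a.schema != b.schema or a.zero_threshold != b.zero_threshold:
        raise ValueError("sketch only merges histograms with identical layout")
    out = NativeHistogram(a.schema, a.zero_threshold, a.zero_count + b.zero_count,
                          a.count + b.count, a.sum + b.sum)
    for src in (a, b):
        for index, c in src.positive.items():
            out.positive[index] = out.positive.get(index, 0) + c
        for index, c in src.negative.items():
            out.negative[index] = out.negative.get(index, 0) + c
    return out
```

With a flat layout like this, an aggregator only needs to add fields with matching names, and the round trip through `to_fields` and `from_fields` is lossless. A real design would also have to decide how to handle schema mismatches and counter resets.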
Actual behavior
Currently, support for ingesting native histograms is implemented in this PR: #14952
This causes the parser to break down a native histogram into many Telegraf metrics (_sum, _count, and many _bucket), as if it were a classic histogram. When written out by the http output, it is serialized into several separate Prometheus metrics instead of one native histogram. This means all the benefits of native histograms (atomicity, reduced cardinality, better performance) are lost.
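The effect on cardinality can be illustrated with a small sketch. This is not the parser's actual code: the function name, the tuple layout, and the added `+Inf` bucket are assumptions made only to show how one sample fans out into many classic-style series.

```python
from typing import Dict, List, Tuple

Series = Tuple[str, Dict[str, str], float]  # (name, labels, value)


def expand_to_classic(name: str, schema: int, positive: Dict[int, int],
                      count: int, total: float) -> List[Series]:
    """Expand one native histogram sample into classic-style series (illustrative)."""
    base = 2.0 ** (2.0 ** -schema)
    series: List[Series] = [
        (f"{name}_sum", {}, total),
        (f"{name}_count", {}, count),
    ]
    cumulative = 0
    for index in sorted(positive):
        # Classic buckets are cumulative and identified by their upper bound `le`.
        cumulative += positive[index]
        series.append((f"{name}_bucket", {"le": str(base ** index)}, cumulative))
    series.append((f"{name}_bucket", {"le": "+Inf"}, count))
    return series
```

A histogram with N populated buckets becomes N + 3 separate series here, each of which can land in a different write batch.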
Additional info
Proposal:
We need to change how the prometheusremotewrite parser handles a Prometheus native histogram. It should parse it into one single Telegraf metric.
We need to change the prometheusremotewrite serializer so that it converts such a Telegraf metric back into a Prometheus native histogram.
A high-level design:
Reimirno changed the title from "End-to-end Prometheus Native Histogram Support" to "prometheusremotewrite: End-to-end Prometheus Native Histogram Support" on Oct 31, 2024