prometheusremotewrite: End-to-end Prometheus Native Histogram Support #16120

Open
Reimirno opened this issue Oct 31, 2024 · 0 comments · May be fixed by #16121
Labels
feature request Requests for new plugin and for new features to existing plugins

Reimirno commented Oct 31, 2024

TLDR

The Telegraf prometheusremotewrite data format parser should parse a Prometheus native histogram into one single Telegraf metric (instead of multiple Telegraf metrics), and its serializer should be able to serialize that metric back into a Prometheus native histogram. Design at the end.

Use Case

We are on a Prometheus stack and are planning to use Telegraf on the data ingestion path (for some aggregation). This is a simplified view of our design.

Pods ---(get scraped)---> Agent (Prometheus Agent/Grafana Agent) ---(remote write)---> Telegraf ---(remote write)---> TSDB ....

Some of our metrics are native histograms, a new histogram model introduced by Prometheus. Rather than being emitted as several series (_sum, _count, and many _bucket series with le labels), a native histogram is encoded as a single protobuf struct and emitted as a single time series. This guarantees atomicity, which resolves the write-batching problem present in classic histograms, and it also offers better resolution and query accuracy at a lower cost.
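For context on the resolution claim, here is a minimal sketch of how native histogram bucket boundaries follow from a single `schema` value, based on my reading of the Prometheus native histogram model (exponential buckets with base 2^(2^-schema)). The function name `bucket_upper_bound` is made up for illustration and is not part of Prometheus or Telegraf.

```python
def bucket_upper_bound(schema: int, index: int) -> float:
    """Upper bound of the positive bucket at `index` for a given schema.

    Assumption: buckets grow exponentially with base 2**(2**-schema),
    so a higher schema means more buckets per power of two.
    """
    base = 2.0 ** (2.0 ** -schema)
    return base ** index


if __name__ == "__main__":
    # Print the first few boundaries for a coarse and a finer schema.
    for schema in (0, 3):
        bounds = [round(bucket_upper_bound(schema, i), 4) for i in range(1, 5)]
        print(f"schema={schema}: {bounds}")
```

Because the boundaries are derived from `schema` and the bucket index, the sample only has to carry counts for populated buckets rather than one labelled series per boundary.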

I PoC-ed a simple Telegraf ingest and output pipeline (aggregation logic not added yet) and put it in our ingestion path. Native histograms are only available in the protobuf exposition format, so the prometheusremotewrite data format seems the right choice. The important configs are:

[[inputs.http_listener_v2]]
      alias = "prom-ingest"
      service_address = ":9201"
      paths = ["/receive"]
      methods = ["GET", "OPTIONS", "POST", "PUT"]
      data_format = "prometheusremotewrite"

[[outputs.http]]
      alias = "prom-write"
      url = "%(write_url)s"
      timeout = "10s"
      data_format = "prometheusremotewrite"

      [outputs.http.headers]
         Content-Type = "application/x-protobuf"
         Content-Encoding = "snappy"
         X-Prometheus-Remote-Write-Version = "2.0.0"

Expected behavior

  • http_listener_v2 prometheusremotewrite format: a native histogram should be ingested and parsed into one single Telegraf metric, not breaking its atomicity.
  • http output prometheusremotewrite format: a native histogram should be written out, if a native histogram is ingested.

How exactly a native histogram should be represented as one single Telegraf metric deserves a proper design, so that it is:

  • easy to write aggregators (Starlark etc.) against
  • (even better) amenable to existing processors
  • (even better) reusable for OpenMetrics exponential histogram support
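To make the representation question concrete, here is one possible flattening, written in Python only because it is close to Starlark. It is a hypothetical sketch, not the design proposed in this issue: the field names (`positive_bucket_<index>` and so on) and the dict layout of `hist` are assumptions. The span/delta decoding follows my understanding of the integer native histogram encoding in the remote write protobuf, and only the positive buckets are handled to keep it short.

```python
def decode_buckets(spans, deltas):
    """Turn (offset, length) spans plus count deltas into {bucket_index: count}.

    Assumption: the first span's offset is the starting bucket index, later
    offsets are gaps after the previous span, and each delta is relative to
    the previous bucket's count.
    """
    buckets = {}
    index = 0
    count = 0
    delta_iter = iter(deltas)
    for offset, length in spans:
        index += offset
        for _ in range(length):
            count += next(delta_iter)
            buckets[index] = count
            index += 1
    return buckets


def to_fields(hist):
    """Flatten one native histogram sample into a single field map (hypothetical layout)."""
    fields = {
        "schema": hist["schema"],
        "zero_threshold": hist["zero_threshold"],
        "zero_count": hist["zero_count"],
        "count": hist["count"],
        "sum": hist["sum"],
    }
    decoded = decode_buckets(hist["positive_spans"], hist["positive_deltas"])
    for index, bucket_count in decoded.items():
        fields[f"positive_bucket_{index}"] = bucket_count
    return fields
```

A flat field map like this would keep the histogram in one metric and stay readable from processors, at the cost of a variable field set per sample.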

Actual behavior

Currently, support for ingesting native histograms is implemented in this PR: #14952
That implementation makes the parser break a native histogram down into many Telegraf metrics (_sum, _count, and many _bucket), as if it were a classic histogram. When written out by the http output, it is serialized into several separate Prometheus metrics instead of one native histogram. This means all the benefits of native histograms (atomicity, reduced cardinality, better performance) are lost.
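A rough illustration of the fan-out described above, in Prometheus series terms rather than Telegraf's exact metric layout. The helper `classic_series` and the metric name in the usage example are made up for this sketch.

```python
def classic_series(name, upper_bounds):
    """List the series a classic histogram expands into for the given bucket bounds."""
    series = [f"{name}_sum", f"{name}_count"]
    for le in upper_bounds:
        series.append(f'{name}_bucket{{le="{le}"}}')
    # Classic histograms always carry an implicit +Inf bucket.
    series.append(f'{name}_bucket{{le="+Inf"}}')
    return series


if __name__ == "__main__":
    for line in classic_series("request_duration_seconds", [0.1, 0.5, 1]):
        print(line)
```

With N explicit bucket bounds this yields N + 3 series, each of which can land in a different write batch, whereas a native histogram would stay a single sample.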

Additional info

Proposal:
We need to change how the prometheusremotewrite parser handles a Prometheus native histogram. It should parse it into one single Telegraf metric.
We need to change the prometheusremotewrite serializer so that it converts such a Telegraf metric back into a Prometheus native histogram.
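On the serializer side, the core step would be re-encoding absolute bucket counts into the span/delta form used by integer native histograms. The sketch below assumes the bucket counts are available as a `{bucket_index: count}` map, which is a hypothetical intermediate form and not something Telegraf provides today. It opens a new span at every gap, whereas a real encoder may choose to bridge small gaps with empty buckets.

```python
def encode_buckets(buckets):
    """Encode {bucket_index: count} as ((offset, length) spans, count deltas).

    Assumption: the first span's offset is the first bucket index, later
    offsets are the gap since the end of the previous span, and each delta
    is the difference to the previous bucket's count.
    """
    spans = []
    deltas = []
    prev_count = 0
    next_index = 0  # index immediately after the previously encoded bucket
    for index in sorted(buckets):
        gap = index - next_index
        if spans and gap == 0:
            # Contiguous with the previous bucket: extend the current span.
            offset, length = spans[-1]
            spans[-1] = (offset, length + 1)
        else:
            spans.append((gap, 1))
        deltas.append(buckets[index] - prev_count)
        prev_count = buckets[index]
        next_index = index + 1
    return spans, deltas
```

For example, `encode_buckets({0: 1, 1: 3, 5: 2})` produces two spans, one covering indices 0 and 1 and one covering index 5 after a gap of three empty buckets.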

A high-level design:
[image: high-level design diagram]

Reimirno added the feature request label Oct 31, 2024
Reimirno changed the title from "End-to-end Prometheus Native Histogram Support" to "prometheusremotewrite: End-to-end Prometheus Native Histogram Support" Oct 31, 2024