Any plans on considering tailsampling traces? #213

mugli · 2025-09-27T18:10:45Z

mugli
Sep 27, 2025

The OTel collector (contrib) is a vast ecosystem, and it's probably not worth trying to rewrite all of it for performance reasons. But Rotel could shine in resource-constrained environments (like Lambda, which you folks are already working on) and, additionally (here is the pitch), in places where OTel collectors need to deal with extremely high volumes of data. Tailsampling is one such area. 💡

[1]
We maintain a tailsampling pipeline with the OTel collector. It works fine, but at scale, performance-focused alternatives to the collector components used for tailsampling (the loadbalancing exporter and the tailsampling processor) could be very interesting.

[2]
Besides performance, here's another interesting take:

The better a language is at configuration, the worse it is at data transformation.

Since Rotel allows custom processors to be written with a Python-based SDK, the sampling configuration for a tailsampling processor could be a very good fit for that, and more flexible.
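To illustrate what "sampling configuration as code" could look like, here is a minimal sketch in plain Python. It does not use the Rotel processor SDK; the `Span` type, the `keep_trace` function, and the `slow_ms` threshold are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Span:
    # Hypothetical, simplified span record (not an OTel or Rotel type).
    trace_id: str
    name: str
    duration_ms: float
    is_error: bool = False


def keep_trace(spans: Iterable[Span], slow_ms: float = 500.0) -> bool:
    """Hypothetical policy: keep a trace if any span errored or was slow."""
    return any(s.is_error or s.duration_ms >= slow_ms for s in spans)
```

A policy like this is ordinary code, so it can be unit tested and composed with other conditions instead of being encoded in a nested configuration file.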

[3]
The third, and very important, reason that could make it worthwhile to work on an alternative to the official OTel collector is this: open-telemetry/opentelemetry-collector-contrib#33568

The OTel maintainers mentioned they have no plans to address this (understandably, because it would require rethinking the loadbalancing exporter architecture).

To summarize the issue, OTel loadbalancing exporters are stateless and don't communicate with each other or with the tailsampler collectors. The loadbalancer works by using consistent hashing on trace_id to distribute the spans to the backend tailsampling collectors, so that all spans with the same trace_id reach the same destination.
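The routing idea described above can be sketched as a small consistent-hash ring. This is only an illustration of the technique, not the OTel loadbalancing exporter's actual implementation; the `HashRing` class, the `vnodes` parameter, and the backend names are assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable 64-bit hash so every exporter instance computes the same value.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Illustrative consistent-hash ring keyed on trace_id."""

    def __init__(self, backends, vnodes=100):
        # Each backend gets several virtual points on the ring.
        self._ring = sorted(
            (_hash(f"{b}#{i}"), b) for b in backends for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    def backend_for(self, trace_id: str) -> str:
        # First ring point clockwise from the trace_id hash, wrapping around.
        idx = bisect.bisect(self._keys, _hash(trace_id)) % len(self._ring)
        return self._ring[idx][1]
```

Two independently constructed rings with the same backend list pick the same backend for a given `trace_id`, which is why the exporters can stay stateless. It also shows the limitation: the decision depends on the `trace_id` alone.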

But the unfortunate oversight is that traces from async messaging systems don't always share the same trace_id; they can use Span Links instead (an example with Kafka). Currently there's no way to ensure that, if a particular trace is sampled in the OTel tailsampler, all related linked traces are sampled as well.
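One way to think about the missing piece is as a grouping problem: trace IDs connected by Span Links should share one sampling decision. The sketch below uses a union-find structure for that. It assumes a single component can see all the links, which is exactly the state the current stateless exporters don't have; `LinkedTraceGroups` and its methods are hypothetical names, not part of any collector.

```python
class LinkedTraceGroups:
    """Illustrative union-find over trace IDs connected by span links."""

    def __init__(self):
        self._parent = {}

    def _find(self, trace_id):
        # Register unseen IDs as their own group, then walk to the root.
        self._parent.setdefault(trace_id, trace_id)
        while self._parent[trace_id] != trace_id:
            # Path halving keeps lookups cheap.
            self._parent[trace_id] = self._parent[self._parent[trace_id]]
            trace_id = self._parent[trace_id]
        return trace_id

    def link(self, a, b):
        # Record that a span in trace `a` links to a span in trace `b`.
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_a] = root_b

    def expand(self, sampled):
        # Return every trace ID in the same group as any sampled trace.
        roots = {self._find(t) for t in sampled}
        return {t for t in list(self._parent) if self._find(t) in roots}
```

For example, after `link("producer", "consumer")`, calling `expand({"producer"})` returns both trace IDs, so a decision to keep the producer trace would carry over to the consumer trace.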

[4]
And there are more areas for improvement in autoscaling the tailsampling collectors, like open-telemetry/opentelemetry-collector-contrib#36717

All of these are hard and interesting problems, currently without any good solutions. Vector seems to be looking into the tailsampling use case as well. But they have only just started adding OTLP source support, so I assume it will be a long road if they handle tailsampling.

I don't know what the roadmap for Rotel is; it's still early days. I'm happy to be a PoC subject if you folks consider supporting tailsampling. Let me know if you have any questions.

rjenkins · 2025-09-27T18:40:06Z

rjenkins
Sep 27, 2025
Maintainer

Thanks for reaching out and writing this up @mugli. So yes, we've also identified tailsampling at scale as a great use case for Rotel. We've been tracking it internally but hadn't opened a GH issue yet, as no one had asked. I've now opened one for us to track: #214.

Re: [2] For tailsampling we've batted around the idea of writing it as a pure Rust processor rather than with the Python processor SDK. The processor SDK is very fast (as it's a Rust-backed extension) and in many cases much faster than Go processors. However, for tailsampling at scale, we were thinking it would be best to provide the most resource-efficient and highest-performance option. We're open to discussing this though, and even potentially offering two options here.

Re: [3] We've definitely considered stateful use cases like this, but have yet to implement one. This is a good potential first one we could address.

I think it would be great to collaborate on tailsampling and would love to have you PoC it. If you're interested, hop in the Discord https://rotel.dev/discord and we can discuss more this week.

2 replies

mugli Sep 27, 2025
Author

Amazing! Joined the discord already 🎉

rjenkins Sep 27, 2025
Maintainer

👍. If you want to start a new post in #discussions this week and @ me (@ramond in Discord), that would be great. We can toss ideas around and uncover requirements to get an RFC together. I imagine that, for a first pass at this, we can provide a subset of features for a tailsampling processor that specifically meets your immediate needs, and then break out additional future work.
