Upgrade opentelemetry deps from 0.26 -> 0.29 #907


Open
wants to merge 7 commits into master

Conversation

@connyay connyay commented Apr 23, 2025

What was changed

This change upgrades opentelemetry deps from 0.26 -> 0.29.
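
Roughly, the Cargo.toml side of this looks like the following (a hedged sketch; the exact crates and feature flags in this repo may differ):

    [dependencies]
    # The OTel crates use aligned version numbers, so these move together from 0.26 to 0.29.
    opentelemetry = { version = "0.29", features = ["metrics"] }
    opentelemetry_sdk = { version = "0.29", features = ["rt-tokio", "metrics"] }
    opentelemetry-otlp = { version = "0.29", features = ["grpc-tonic", "http-proto", "metrics"] }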

Why?

This change is a bit of yak shaving to prep for upgrading to tonic 0.13.

Checklist

  1. Closes [Feature Request] Update to OpenTelemetry 0.27 #862

  2. How was this tested:

Modified entries in Cargo.toml -> ran cargo test -> fixed compile errors -> ran cargo test

  3. Any docs updates needed?

No

@connyay connyay requested a review from a team as a code owner April 23, 2025 15:33
CLAassistant commented Apr 23, 2025

CLA assistant check
All committers have signed the CLA.

Comment on lines -121 to -123
global::set_error_handler(|err| {
    tracing::error!("{}", err);
})?;

Comment on lines 328 to 340
let override_resource = Resource::builder().with_attributes(override_kvs).build();
Resource::builder_empty()
    .with_attributes(
        default_resource_instance()
            .iter()
            .map(|(k, v)| KeyValue::new(k.clone(), v.clone())),
    )
    .with_attributes(
        override_resource
            .iter()
            .map(|(k, v)| KeyValue::new(k.clone(), v.clone())),
    )
    .build()
Author

open-telemetry/opentelemetry-rust#2322 👀

The removal of .merge is a bit of a bummer; see open-telemetry/opentelemetry-rust#2766.
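
For comparison, a rough sketch of the same composition on 0.26, where Resource::merge was still available (assuming the same default_resource_instance and override_kvs as above; merge's key-collision precedence rules are glossed over here):

    // 0.26-era sketch: build the override Resource and merge it into the
    // defaults in one call, instead of re-collecting both attribute sets.
    let override_resource = Resource::new(override_kvs);
    default_resource_instance().merge(&override_resource)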

@connyay
Author

connyay commented Apr 23, 2025

#908 is the real change I'm after. The next opentelemetry release (0.30) includes tonic 0.13:

open-telemetry/opentelemetry-rust#2876
https://github.com/open-telemetry/opentelemetry-rust/milestone/21

@connyay connyay marked this pull request as draft April 23, 2025 16:27
@connyay
Author

connyay commented Apr 23, 2025

The removal of global::set_error_handler isn't quite right. Integration tests FTW. Working on it.

@Sushisource
Member

Sushisource commented Apr 23, 2025

Thanks for trying this out. FYI, I had started this before and had to abort because of another OTel bug that breaks the way we try to add global tags: open-telemetry/opentelemetry-rust#2858

Unfortunately, either we fix that upstream or we work around it by manually applying global tags to every metric... which is obnoxious. #882 is the issue on our side.
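
For the record, a minimal sketch of what that manual workaround could look like (a hypothetical helper, not something this PR adds): keep the global KeyValues around and splice them into every recording call.

    use opentelemetry::KeyValue;

    // Hypothetical helper: prepend the global tags to each call's attributes.
    fn with_global_tags(global_tags: &[KeyValue], attrs: &[KeyValue]) -> Vec<KeyValue> {
        global_tags.iter().cloned().chain(attrs.iter().cloned()).collect()
    }

    // e.g. counter.add(1, &with_global_tags(&global_tags, &[KeyValue::new("namespace", ns)]));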

@Sushisource
Member

The removal of global::set_error_handler isn't quite right. Integration tests FTW. Working on it.

Oh god, right, and this - AFAICT there's just no way to deal with it properly in current versions either: open-telemetry/opentelemetry-rust#2697

We might just have to accept that regression, unfortunately, because I don't want us to never be able to upgrade. OTel upgrades are never the smoothest.

@connyay
Author

connyay commented Apr 23, 2025

@Sushisource would you mind kicking off the workflows one more time if you get a chance?

Thanks for trying this out. FYI, I had started this before and had to abort because of another OTel bug that breaks the way we try to add global tags: open-telemetry/opentelemetry-rust#2858

Was there a failing integration test for this? I think I got the integration tests running and didn't see anything glaringly wrong with global tags.

This is the only local failure I see with the integration tests:

failures:

---- integ_tests::metrics_tests::docker_metrics_with_prometheus::otel_collector_1____http___localhost_4318_v1_metrics___OtlpProtocol__Http_ stdout ----

thread 'integ_tests::metrics_tests::docker_metrics_with_prometheus::otel_collector_1____http___localhost_4318_v1_metrics___OtlpProtocol__Http_' panicked at core/../tests/integ_tests/metrics_tests.rs:709:62:
called `Result::unwrap()` on an `Err` value: no http client specified
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    integ_tests::metrics_tests::docker_metrics_with_prometheus::otel_collector_1____http___localhost_4318_v1_metrics___OtlpProtocol__Http_

Pretty bizarre. It seems like the reqwest-client feature isn't enabled properly in tests or something?

I ran:

  1. docker compose -f ./docker/docker-compose-ci.yaml up
  2. DOCKER_PROMETHEUS_RUNNING=1 cargo integ-test
  3. DOCKER_PROMETHEUS_RUNNING=1 cargo integ-test -c "--release" -t heavy_tests -- --test-threads 1

I'm not sure if there's another step I was missing.

@Sushisource
Member

Strange, not sure what that one's about. Also, I thought we had these configured so that I only had to approve once, but maybe that changed. There's a handful of unrelated flakes I need to find some time to deal with too.

And no -- there's no test for the global tags. I do have this branch, though, that demonstrates what I needed to change to get them to work (they're not really working properly now anyway), but that change, which should work, doesn't - so with that in mind we don't need to block this upgrade on it. I was just delaying things hoping it might get fixed.

@connyay connyay marked this pull request as ready for review April 24, 2025 13:06
@Sushisource
Member

Sushisource commented Apr 24, 2025

@connyay

Thanks for this! I think I have a solution to the global error logging problem. If you don't want to take this all on (and I totally get it if not), let me know and I'll try to find some time to add it to this PR, but this is a solution I think will work fine:

We can use https://docs.rs/tracing/latest/tracing/subscriber/fn.set_global_default.html, which I hadn't appreciated can act as a fallback. It can just use the standard console formatting from here with a simple, broad error-level-only filter.

Then, setting that will need to be opt-in (so we don't bother any Rust users using the Rust SDK or Core directly who want to set their own global default). So we can just have some top-level function in the telemetry mod like set_fallback_trace_handler(). We'll make sure to call that in our language SDKs.
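
Roughly, a sketch of what I mean (assuming tracing-subscriber is already a dependency; the function name, formatting, and filter are illustrative):

    use tracing::Level;

    /// Opt-in fallback: install a plain console subscriber as the global default so
    /// OTel-internal errors (now routed through `tracing`) aren't silently dropped
    /// when the embedding application hasn't set a subscriber of its own.
    pub fn set_fallback_trace_handler() {
        let subscriber = tracing_subscriber::fmt()
            .with_max_level(Level::ERROR) // broad, error-level-only filter
            .finish();
        // If a global default is already set (e.g. by a Rust user of Core or the
        // Rust SDK), this errors and we leave the existing subscriber in place.
        let _ = tracing::subscriber::set_global_default(subscriber);
    }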

As for the test, somewhat annoyingly, it might need to be moved out into its own totally separate integration test module so that it can call that function, run in its own process, and not muck up any other tests.
