Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inital opentelemetry layout #400

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
100 changes: 100 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 5 additions & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,11 @@ thiserror = "1.0.30"
lazy_static = "1.4.0"
tracing = "0.1.37"
tracing-subscriber = "0.3.17"
opentelemetry-appender-tracing = { version = "0.27.0", default-features = false}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: I only added the system to forward the tracing crate to opentelemtry, but not other crates

that is to say, I noticed in Dolos (and maybe downstream crates?) you occasionally use use log:: which won't be forwarded

opentelemetry = { version = "0.27.0", default-features = false, features = ["trace"] }
opentelemetry_sdk = { version = "0.27.0", default-features = false, features = ["trace", "rt-tokio"] }
opentelemetry-http = { version = "0.27.0", default-features = false}
opentelemetry-otlp = { version = "0.27.0", features = ["logs", "http-json", "reqwest-client"] }
Comment on lines +38 to +39
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: I used the http+json version of opentelemetry instead of the grpc+protobuf

I did this because we currently only support the http+json approach in Paima at the moment, but plan to support grpc+protobuf in the future

bincode = "1.3.3"
miette = { version = "7.2.0", features = ["fancy"] }
tokio = { version = "^1.40", features = ["rt", "rt-multi-thread", "signal"] }
Expand Down
21 changes: 21 additions & 0 deletions src/bin/dolos/common.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,13 @@ use tokio::task::JoinHandle;
use tokio_util::sync::CancellationToken;
use tracing::{debug, warn};
use tracing_subscriber::{filter::Targets, prelude::*};
use opentelemetry_appender_tracing::layer;
use opentelemetry_otlp::WithExportConfig;
use opentelemetry_otlp::{LogExporter, Protocol};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I only export logs and not traces (spans) or metrics just because I only needed logs to make sure it works

use opentelemetry_sdk::{
logs::LoggerProvider,
runtime,
};

use dolos::prelude::*;

Expand Down Expand Up @@ -62,6 +69,18 @@ pub fn open_data_stores(config: &crate::Config) -> Result<Stores, Error> {
pub fn setup_tracing(config: &LoggingConfig) -> miette::Result<()> {
let level = config.max_level;

let exporter = LogExporter::builder()
.with_http()
.with_endpoint("http://localhost:4318/v1/logs")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: this should be configured automatically for you from an env variable (https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/exporter.md#endpoint-urls-for-otlphttp), but I just set it explicitly here

.with_protocol(Protocol::HttpJson)
.build().unwrap();

let provider = LoggerProvider::builder()
.with_simple_exporter(exporter)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we want to use with_simple_exporter since it's a lot more responsive in dev mode (and for Paima you're running everything on the same machine so the network overhead is minimal), but for production where you have different machines with_batch_exporter will work a lot better

// .with_batch_exporter(exporter, runtime::Tokio)
.build();
let layer = layer::OpenTelemetryTracingBridge::new(&provider);

let mut filter = Targets::new()
.with_target("dolos", level)
.with_target("gasket", level);
Expand All @@ -84,6 +103,7 @@ pub fn setup_tracing(config: &LoggingConfig) -> miette::Result<()> {
{
tracing_subscriber::registry()
.with(tracing_subscriber::fmt::layer())
.with(layer)
.with(filter)
.init();
}
Expand All @@ -92,6 +112,7 @@ pub fn setup_tracing(config: &LoggingConfig) -> miette::Result<()> {
{
tracing_subscriber::registry()
.with(tracing_subscriber::fmt::layer())
.with(layer)
.with(console_subscriber::spawn())
.with(filter)
.init();
Expand Down
3 changes: 2 additions & 1 deletion src/bin/dolos/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -186,7 +186,8 @@ impl Config {
}
}

fn main() -> Result<()> {
#[tokio::main]
async fn main() -> Result<()> {
Comment on lines +189 to +190
Copy link
Contributor Author

@SebastienGllmt SebastienGllmt Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I don't have this, cargo run sync gives me the error

thread 'main' panicked at /home/sebdev20/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/dns.rs:121:24:
there is no reactor running, must be called from the context of a Tokio 1.x runtime
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Note that I don't need this to run cargo run daemon - that one instead starts but then deadlocks right away. I tried switching to with_batch_exporter which doesn't deadlock, but instead of doesn't seem to actually send anything to opentelemetry either (not sure why)

Copy link
Contributor Author

@SebastienGllmt SebastienGllmt Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize now that unfortunately this breaks cargo run daemon with the following error

thread 'main' panicked at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/mod.rs:86:9:
Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.

note: cargo run sync still works

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They recently (December 2024) updated the docs on how to get the SimpleExporter working to avoid the deadlock mentioned earlier

However, it turns out that, unfortunately, using the SimpleExporter requires a fairly different setup than the batch exporter:

SimpleExporter requires enabling feature flag reqwest-blocking-client and making the main() a normal main and not tokio::main

However, I can't seems to get it to work because

There are some issues that were resolved in opentelemetry that sound like they might be relevant such as open-telemetry/opentelemetry-rust#2431, but they are not released yet. More generally, I wonder if open-telemetry/opentelemetry-rust#2386 might make the integration easier

let args = Cli::parse();
let config = Config::new(&args.config)
.into_diagnostic()
Expand Down