OpenTelemetry integration #1240

@meskill

Description

Provide an OpenTelemetry integration for Tailcall that supports different exporters and configurations depending on the user's needs.

User perspective

If the user is not interested in OpenTelemetry, Tailcall should work as before and require no additional action from the user.

If the user wants to enable OpenTelemetry output from Tailcall, they can use a new schema directive, @opentelemetry, which specifies where to export the data and in which format.

Example of config:

schema
  @server(port: 8000, graphiql: true, hostname: "0.0.0.0")
  @upstream(baseURL: "http://jsonplaceholder.typicode.com", httpCache: true)
  @opentelemetry(
    export: {
      otlp: {
        url: "https://api.honeycomb.io:443"
        # gather api key from https://ui.honeycomb.io and set it as env when running tailcall
        headers: [{key: "x-honeycomb-team", value: "{{env.HONEYCOMB_API_KEY}}"}]
      }
    }
  ) {
  query: Query
}

In that case, OpenTelemetry data from Tailcall is exported to the provided service, and aggregating and processing that data is the responsibility of that external service.
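The header value in the config above uses an `{{env.HONEYCOMB_API_KEY}}` placeholder. Below is a minimal sketch of how such a placeholder could be resolved before the exporter is built. The `Header` and `OtlpExport` types and the `resolve_env` and `resolve_headers` functions are hypothetical names made up for this illustration, and the environment is passed in as a plain map. This is not Tailcall's actual template engine.

```rust
use std::collections::HashMap;

/// One header entry, mirroring `{key: ..., value: ...}` in the directive (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Header {
    pub key: String,
    pub value: String,
}

/// Hypothetical in-memory form of the `otlp` export settings.
#[derive(Debug, Clone, PartialEq)]
pub struct OtlpExport {
    pub url: String,
    pub headers: Vec<Header>,
}

/// Replaces every `{{env.NAME}}` placeholder with the value of `NAME` from `env`.
/// Unknown variables and unterminated placeholders are left untouched.
pub fn resolve_env(template: &str, env: &HashMap<String, String>) -> String {
    const OPEN: &str = "{{env.";
    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find(OPEN) {
        out.push_str(&rest[..start]);
        let after = &rest[start + OPEN.len()..];
        match after.find("}}") {
            Some(end) => {
                let name = &after[..end];
                match env.get(name) {
                    Some(value) => out.push_str(value),
                    // Unknown variable: keep the placeholder text as written.
                    None => out.push_str(&rest[start..start + OPEN.len() + end + 2]),
                }
                rest = &after[end + 2..];
            }
            None => {
                // Unterminated placeholder: copy the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

/// Resolves placeholders in every header value of an OTLP export config.
pub fn resolve_headers(export: &OtlpExport, env: &HashMap<String, String>) -> OtlpExport {
    OtlpExport {
        url: export.url.clone(),
        headers: export
            .headers
            .iter()
            .map(|h| Header { key: h.key.clone(), value: resolve_env(&h.value, env) })
            .collect(),
    }
}
```

Leaving unknown variables untouched is only one possible choice. A real implementation might prefer to fail at startup when a required API key is missing.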

Development perspective

OpenTelemetry provides various Rust crates that implement different aspects of integrating it into an app.

Core

Core should be able to generate any OpenTelemetry data when needed, in a simple way and preferably without any feature flags inside the code.

For tracing and logs we can use the tracing crate instead of log. Its benefits are that tracing already manages both traces and logs, has built-in methods for creating different wrappers, and its data can be exported as OpenTelemetry data with the tracing-opentelemetry crate.

For metrics we can't use tracing and have to use the opentelemetry crates' functionality explicitly. Core should send data through the opentelemetry core functionality that is not tied to specific exporters.
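To illustrate what "not tied to specific exporters" could mean for metrics, here is a std-only sketch. Core code holds a counter handle that writes to an abstract sink, and the sink is chosen elsewhere. The names `MetricSink`, `NoopSink`, `MemorySink`, and `Counter` are assumptions made up for this example. They are not types from the opentelemetry crates, which provide their own API for this.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical sink that receives counter increments; exporters would implement it.
pub trait MetricSink: Send + Sync {
    fn add(&self, name: &str, value: u64);
}

/// Default sink used when telemetry is disabled: drops everything.
pub struct NoopSink;

impl MetricSink for NoopSink {
    fn add(&self, _name: &str, _value: u64) {}
}

/// In-memory sink, standing in for a real exporter in this sketch.
#[derive(Default)]
pub struct MemorySink {
    totals: Mutex<HashMap<String, u64>>,
}

impl MemorySink {
    /// Returns the accumulated total for `name`, or 0 if nothing was recorded.
    pub fn total(&self, name: &str) -> u64 {
        self.totals.lock().unwrap().get(name).copied().unwrap_or(0)
    }
}

impl MetricSink for MemorySink {
    fn add(&self, name: &str, value: u64) {
        *self.totals.lock().unwrap().entry(name.to_string()).or_insert(0) += value;
    }
}

/// Counter handle held by core code; it knows nothing about the concrete exporter.
pub struct Counter {
    name: String,
    sink: Arc<dyn MetricSink>,
}

impl Counter {
    pub fn new(name: &str, sink: Arc<dyn MetricSink>) -> Self {
        Counter { name: name.to_string(), sink }
    }

    /// Adds 1 to the counter through whatever sink was configured.
    pub fn increment(&self) {
        self.sink.add(&self.name, 1);
    }
}
```

With this split, core code calls `increment()` the same way whether the sink is `NoopSink` (telemetry disabled) or a real exporter, so no feature flags are needed at the call site.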

CLI/Native app

The specific environment should define exporters based on the passed configuration. This is done mostly by dedicated OpenTelemetry crates.

The first implementation should start with a couple of available integrations and should be easy to extend with additional options in the future.

  • integrate opentelemetry_stdout
  • integrate opentelemetry_otlp
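The selection of an exporter from configuration could be sketched as below. `ExportConfig`, `Exporter`, `StdoutExporter`, `OtlpExporter`, and `build_exporter` are hypothetical names for this illustration only. Real exporters would come from the opentelemetry_stdout and opentelemetry_otlp crates, whose APIs are not shown here.

```rust
/// Hypothetical exporter settings parsed from the `@opentelemetry` directive.
#[derive(Debug, Clone, PartialEq)]
pub enum ExportConfig {
    /// Print telemetry to standard output.
    Stdout,
    /// Send telemetry to an OTLP endpoint.
    Otlp { url: String },
}

/// Minimal exporter interface for this sketch only.
pub trait Exporter {
    fn describe(&self) -> String;
}

struct StdoutExporter;

struct OtlpExporter {
    url: String,
}

impl Exporter for StdoutExporter {
    fn describe(&self) -> String {
        "stdout".to_string()
    }
}

impl Exporter for OtlpExporter {
    fn describe(&self) -> String {
        format!("otlp -> {}", self.url)
    }
}

/// Builds the exporter for the given configuration.
/// `None` means the directive is absent, so nothing is set up.
pub fn build_exporter(config: Option<&ExportConfig>) -> Option<Box<dyn Exporter>> {
    let exporter: Box<dyn Exporter> = match config? {
        ExportConfig::Stdout => Box::new(StdoutExporter),
        ExportConfig::Otlp { url } => Box::new(OtlpExporter { url: url.clone() }),
    };
    Some(exporter)
}
```

Adding another integration later would mean adding one enum variant and one match arm, which keeps the extension point in a single place.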

WASM

Performance

The initial integration with 2 spans and 1 metric doesn't show significant changes in performance.

However, using async-graphql::extensions::OpenTelemetry reduces overall RPS in the benchmark by 30%. It outputs a lot of spans, most of which are for basically no-op functions on fields with no resolvers. Those could probably be stripped or ignored in some way.
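One way to strip those spans is to decide per field whether a span is worth recording. The sketch below assumes a hypothetical `FieldInfo` with a `has_resolver` flag. The async-graphql extension is not known to expose such a flag, so this only illustrates the filtering idea and not an available API.

```rust
/// Hypothetical description of a field being resolved.
#[derive(Debug, Clone, PartialEq)]
pub struct FieldInfo {
    pub path: String,
    pub has_resolver: bool,
}

/// Returns the span names that would be recorded for a request,
/// skipping fields that have no resolver attached.
pub fn spans_to_record(fields: &[FieldInfo]) -> Vec<String> {
    fields
        .iter()
        .filter(|f| f.has_resolver)
        .map(|f| format!("field {}", f.path))
        .collect()
}
```

Whether this recovers the lost RPS would have to be measured, since the source only reports the slowdown and not its exact cause.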

Testing

  • implement an integration test to verify that OpenTelemetry data is captured
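Such a test could capture telemetry in memory and assert on it afterwards. The sketch below is std-only. `InMemoryExporter` and `handle_request` are hypothetical stand-ins for a real exporter and for the code under test.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical in-memory span exporter used only by tests.
/// Clones share the same underlying storage.
#[derive(Clone, Default)]
pub struct InMemoryExporter {
    spans: Arc<Mutex<Vec<String>>>,
}

impl InMemoryExporter {
    /// Records a finished span by name.
    pub fn export(&self, span_name: &str) {
        self.spans.lock().unwrap().push(span_name.to_string());
    }

    /// Returns a copy of all spans recorded so far.
    pub fn finished_spans(&self) -> Vec<String> {
        self.spans.lock().unwrap().clone()
    }
}

/// Stand-in for the code under test: handles a request and reports one span.
pub fn handle_request(exporter: &InMemoryExporter, operation: &str) -> String {
    exporter.export(&format!("request {}", operation));
    format!("handled {}", operation)
}
```

The test keeps one clone of the exporter, passes another to the code under test, and then checks `finished_spans()` for the expected span names.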

Labels: state: inactive (No current action needed/possible; issue fixed, out of scope, or superseded.)
