Summary
Add an optional OpenTelemetry (OTel) exporter so that enterprises can feed tokf's token compression metrics into their existing observability stacks (Datadog, Grafana, Jaeger, Prometheus, etc.) via the standard OTLP protocol, without building custom integrations.
Motivation
Issue #20 proposed an organization server for centralized tracking. An OTel exporter is a lighter-weight, standards-based alternative that does not require self-hosting a dedicated server. Most enterprises already run an OTel Collector or a compatible backend. By emitting metrics via OTLP, tokf becomes a first-class citizen in existing observability pipelines — teams get dashboards, alerts, and cost attribution "for free."
This also positions tokf within the emerging OTel GenAI semantic conventions (semconv v1.39.0, status: Development), which define standard metrics like gen_ai.client.token.usage and gen_ai.client.operation.duration. While tokf is not an LLM client itself, it operates on LLM-bound context and can align with these conventions where applicable, extending them with tokf-specific attributes for compression effectiveness.
Key use cases:
- Cost management — understanding how much context compression saves across builds, pipelines, and teams
- Performance monitoring — tracking filter effectiveness over time, detecting regressions when build output formats change
- Capacity planning — aggregating token usage across an organization to forecast LLM API costs
- Compliance & auditing — providing an immutable telemetry trail of token processing for enterprise governance
Background: OTel Rust Ecosystem
The Rust OTel SDK is mature enough for production use:
- `opentelemetry` (API crate) — traits and a no-op implementation for instrumentation
- `opentelemetry-sdk` — the real SDK, with Metrics SDK and Tracing SDK
- `opentelemetry-otlp` — OTLP exporter supporting both gRPC (tonic) and HTTP (reqwest), for metrics, traces, and logs
- `opentelemetry-prometheus` — Prometheus metrics exporter
OTLP is the recommended export protocol and is natively supported by all major backends (Datadog, Grafana Cloud, New Relic, Honeycomb, Jaeger, etc.) as well as self-hosted OTel Collectors.
Proposed Design
Feature Flag
Gate the entire feature behind a Cargo feature flag to keep the default binary lean:
```toml
[features]
default = []
otel = ["otel-http"]  # default to HTTP for simplicity
otel-http = ["opentelemetry", "opentelemetry-sdk", "opentelemetry-otlp/http-proto", "opentelemetry-otlp/reqwest-blocking-client"]
otel-grpc = ["opentelemetry", "opentelemetry-sdk", "opentelemetry-otlp/grpc-tonic"]
```

HTTP is recommended as the default — it's simpler, firewall-friendly, and sufficient for CLI tools that export metrics on each invocation rather than as a long-running service.
Metrics to Export
Following the OTel GenAI Semantic Conventions (v1.39.0, Development status) where applicable, and extending with tokf-specific metrics:
Standard GenAI-aligned metrics (reused)
| Metric | Instrument | Unit | Description |
|---|---|---|---|
| `gen_ai.client.token.usage` | Histogram | `{token}` | Token counts, with `gen_ai.token.type` = `input` (pre-filter) and `output` (post-filter) |
tokf-specific metrics (custom namespace)
| Metric | Instrument | Unit | Description |
|---|---|---|---|
| `tokf.filter.input_lines` | Counter | `{line}` | Total lines received before filtering |
| `tokf.filter.output_lines` | Counter | `{line}` | Total lines emitted after filtering |
| `tokf.filter.lines_removed` | Counter | `{line}` | Lines removed by the filter |
| `tokf.compression.ratio` | Gauge | `1` | Ratio of output to input (0.0–1.0); lower = more compression |
| `tokf.tokens.saved` | Counter | `{token}` | Cumulative tokens saved across invocations |
| `tokf.filter.duration` | Histogram | `s` | Time spent in the filter pipeline |
| `tokf.filter.invocations` | Counter | `{invocation}` | Number of filter invocations |
Attributes (dimensions)
| Attribute | Type | Description | Example |
|---|---|---|---|
| `tokf.filter.name` | string | The filter that was applied | `cargo/build`, `jest/run` |
| `tokf.command` | string | The wrapped command | `cargo build`, `npx jest` |
| `tokf.exit_code` | int | Exit code of the wrapped command | `0`, `1` |
| `tokf.version` | string | tokf version | `0.1.8` |
| `tokf.pipeline` | string | User-supplied pipeline/job identifier | `my-ci-job` |
| `service.name` | string | Defaults to `tokf`, configurable | `tokf` |
Configuration
OTel export should be entirely opt-in, configurable via ~/.tokf/config.toml, CLI flags, and/or environment variables.
Config file
```toml
[telemetry.otlp]
enabled = true
endpoint = "http://localhost:4317"   # OTel Collector gRPC endpoint
protocol = "grpc"                    # "grpc" | "http"
# headers = { "api-key" = "secret" } # optional auth headers

[telemetry.otlp.resource]
service_name = "tokf"
# deployment.environment = "production"
```

Environment variables (override config)
Standard OTel env vars should be respected:
```shell
# Enable OTLP export
tokf --otel-export

# Configure endpoint (defaults to localhost:4317 for gRPC, localhost:4318 for HTTP)
OTEL_EXPORTER_OTLP_ENDPOINT=https://otel-collector.internal:4317

# Protocol selection
OTEL_EXPORTER_OTLP_PROTOCOL=grpc  # or http/protobuf

# Custom headers (for vendor auth)
OTEL_EXPORTER_OTLP_HEADERS=x-api-key=secret123

# Resource attributes
OTEL_RESOURCE_ATTRIBUTES=service.name=tokf,deployment.environment=production

# Master toggle
TOKF_TELEMETRY_ENABLED=true

# tokf-specific
TOKF_OTEL_PIPELINE=my-ci-job
```

Default: disabled
When no config is set and TOKF_TELEMETRY_ENABLED is not true, tokf behaves exactly as it does today — no OTel dependencies are loaded, no network calls are made.
Architecture
- Add a `telemetry` module with a trait `TelemetryReporter` and two implementations:
  - `NoopReporter` — used when OTel is disabled (compiles to nothing when the `otel` feature is off)
  - `OtelReporter` — initializes the OTel `SdkMeterProvider` with an OTLP `MetricExporter` and records metrics
- The reporter is initialized once at startup based on config and passed to the filter execution path
- Metrics are recorded after each filter invocation
- The `MeterProvider` is shut down gracefully on process exit to flush pending metrics
Graceful Degradation
If the OTel endpoint is unreachable, tokf must never block or slow down the wrapped command. The OTel SDK's periodic exporter handles this (failed exports are logged and retried). tokf should set a short export timeout (e.g. 5s) and move on.
Rust Implementation
Key crates:
- `opentelemetry` (v0.28+) — API traits
- `opentelemetry_sdk` (v0.28+) — SDK implementation with `MeterProvider`
- `opentelemetry-otlp` (v0.31+) — OTLP exporter via HTTP (`http-proto` feature) or gRPC (`grpc-tonic` feature)
Homebrew Distribution Strategy
Homebrew doesn't support Cargo feature flags at install time — each formula produces a single binary. The recommended approach is to build the Homebrew formula with OTel enabled by default, since telemetry export is already gated at runtime behind explicit flags (--otel-export / OTEL_EXPORTER_OTLP_ENDPOINT / TOKF_TELEMETRY_ENABLED).
Recommended: OTel-enabled formula (runtime opt-in)
```ruby
class Tokf < Formula
  desc "Token context compression for LLM pipelines"
  homepage "https://github.com/mpecan/tokf"
  # ...
  depends_on "rust" => :build

  def install
    system "cargo", "install", *std_cargo_args, "--features", "otel"
  end
end
```

Users who don't set any OTel environment variables or pass `--otel-export` will see zero behavioral difference — the OTel code paths are never activated. The only cost is a slightly larger binary (expected ~2-5 MB from the OTel + HTTP client dependency tree).
Alternative: Separate formulae (if binary size is a concern)
If benchmarking shows an unacceptable binary size increase, offer two formulae:
```ruby
# tokf.rb — lean, no OTel
class Tokf < Formula
  def install
    system "cargo", "install", *std_cargo_args
  end
end

# tokf-otel.rb — with OTel support
class TokfOtel < Formula
  conflicts_with "tokf", because: "both install a `tokf` binary"

  def install
    system "cargo", "install", *std_cargo_args, "--features", "otel"
  end
end
```

Users choose at install time:

```shell
brew install tokf       # lean
brew install tokf-otel  # with telemetry support
```

Decision criteria
Measure the binary size delta before deciding:
```shell
# Without OTel
cargo build --release && ls -lh target/release/tokf

# With OTel
cargo build --release --features otel && ls -lh target/release/tokf
```

If the delta is under ~5 MB, the single OTel-enabled formula is the pragmatic choice (this is the pattern ripgrep follows with PCRE2 in its Homebrew formula). If larger, consider the separate formulae approach.
cargo install users
Users installing via cargo install can always choose:
```shell
# Lean install
cargo install tokf

# With OTel
cargo install tokf --features otel

# With OTel + gRPC
cargo install tokf --features otel,otel-grpc
```

Integration Examples
Datadog via OTel Collector
```yaml
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

exporters:
  datadog:
    api:
      key: ${DD_API_KEY}

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [datadog]
```

Prometheus (via push gateway)
```shell
OTEL_EXPORTER_OTLP_ENDPOINT=http://prometheus-pushgateway:4318 tokf --otel-export cargo build 2>&1
```

Grafana Cloud
```shell
OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-prod.grafana.net/otlp \
OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic $(echo -n 'instance_id:api_key' | base64)" \
tokf --otel-export cargo build 2>&1
```

Relationship to #20
This issue offers a complementary approach to the org server proposed in #20:
- OTel exporter: standards-based, zero infrastructure for orgs already running OTel Collector or a compatible backend
- Org server (#20): custom aggregation, purpose-built dashboards, potentially simpler for teams without existing observability infrastructure
Both can coexist — the OTel exporter ships metrics to whatever backend the org uses, while the org server (if built) could itself accept OTLP input from tokf instances.
Acceptance Criteria
- [ ] `otel` Cargo feature flag compiles in OTel dependencies only when enabled
- [ ] `TelemetryReporter` trait with `NoopReporter` and `OtelReporter` implementations
- [ ] `OtelReporter` initializes `SdkMeterProvider` with the OTLP exporter (gRPC and HTTP)
- [ ] All tokf-specific metrics recorded after each filter invocation
- [ ] Standard `gen_ai.client.token.usage` metric emitted with correct attributes
- [ ] Standard OTel env vars (`OTEL_EXPORTER_OTLP_ENDPOINT`, etc.) are respected
- [ ] `~/.tokf/config.toml` `[telemetry.otlp]` section parsed and applied
- [ ] Master toggle `TOKF_TELEMETRY_ENABLED` controls opt-in
- [ ] Default behavior (no config) = no OTel, no network calls, no extra dependencies
- [ ] OTel endpoint failures do not block or slow the wrapped command
- [ ] Graceful shutdown flushes pending metrics
- [ ] Homebrew formula builds with OTel enabled (runtime opt-in via `--otel-export`)
- [ ] Binary size delta with OTel is documented and acceptable
- [ ] Integration test: metrics emitted to an in-memory exporter with correct attributes
- [ ] README section documenting OTel setup with example Collector + Grafana/Prometheus config
References
- OTel GenAI Semantic Conventions — Metrics
- OTel Rust SDK
- `opentelemetry-otlp` crate (v0.31.0)
- OTel GenAI blog post
- Datadog OTel GenAI SemConv support
Note: This issue supersedes #82 which covered the same feature.