Summary
Add an optional OpenTelemetry (OTel) exporter so that enterprises can feed tokf's token compression metrics into their existing observability stacks (Datadog, Grafana, Jaeger, Prometheus, etc.) via the standard OTLP protocol, without building custom integrations.
Motivation
Issue #20 proposed an organization server for centralized tracking. An OTel exporter is a lighter-weight, standards-based alternative that does not require self-hosting a dedicated server. Most enterprises already run an OTel Collector or a compatible backend. By emitting metrics via OTLP, tokf becomes a first-class citizen in existing observability pipelines — teams get dashboards, alerts, and cost attribution "for free."
This also positions tokf within the emerging OTel GenAI semantic conventions (semconv v1.39.0, status: Development), which define standard metrics like gen_ai.client.token.usage and gen_ai.client.operation.duration. While tokf is not an LLM client itself, it operates on LLM-bound context and can align with these conventions where applicable, extending them with tokf-specific attributes for compression effectiveness.
Key use cases:
- Cost management — understanding how much context compression saves across builds, pipelines, and teams
- Performance monitoring — tracking filter effectiveness over time, detecting regressions when build output formats change
- Capacity planning — aggregating token usage across an organization to forecast LLM API costs
- Compliance & auditing — providing an immutable telemetry trail of token processing for enterprise governance
Background: OTel Rust Ecosystem
The Rust OTel SDK is mature enough for production use:
- `opentelemetry` (API crate) — traits and a no-op implementation for instrumentation
- `opentelemetry-sdk` — the real SDK, with Metrics SDK and Tracing SDK
- `opentelemetry-otlp` — OTLP exporter supporting both gRPC (tonic) and HTTP (reqwest), for metrics, traces, and logs
- `opentelemetry-prometheus` — Prometheus metrics exporter
OTLP is the recommended export protocol and is natively supported by all major backends (Datadog, Grafana Cloud, New Relic, Honeycomb, Jaeger, etc.) as well as self-hosted OTel Collectors.
Proposed Design
Feature Flag
Gate the entire feature behind a Cargo feature flag to keep the default binary lean:
```toml
[features]
default = []
otel = ["otel-http"]  # default to HTTP for simplicity
otel-http = ["opentelemetry", "opentelemetry-sdk", "opentelemetry-otlp/http-proto", "opentelemetry-otlp/reqwest-blocking-client"]
otel-grpc = ["opentelemetry", "opentelemetry-sdk", "opentelemetry-otlp/grpc-tonic"]
```

HTTP is recommended as the default — it's simpler, firewall-friendly, and sufficient for CLI tools that export metrics on each invocation rather than as a long-running service.
Metrics to Export
Following the OTel GenAI Semantic Conventions (v1.39.0, Development status) where applicable, and extending with tokf-specific metrics:
Standard GenAI-aligned metrics (reused)
| Metric | Instrument | Unit | Description |
|---|---|---|---|
| `gen_ai.client.token.usage` | Histogram | `{token}` | Token counts, with `gen_ai.token.type` = `input` (pre-filter) and `output` (post-filter) |
tokf-specific metrics (custom namespace)
| Metric | Instrument | Unit | Description |
|---|---|---|---|
| `tokf.filter.input_lines` | Counter | `{line}` | Total lines received before filtering |
| `tokf.filter.output_lines` | Counter | `{line}` | Total lines emitted after filtering |
| `tokf.filter.lines_removed` | Counter | `{line}` | Lines removed by the filter |
| `tokf.compression.ratio` | Gauge | `1` | Ratio of output to input (0.0–1.0); lower = more compression |
| `tokf.tokens.saved` | Counter | `{token}` | Cumulative tokens saved across invocations |
| `tokf.filter.duration` | Histogram | `s` | Time spent in the filter pipeline |
| `tokf.filter.invocations` | Counter | `{invocation}` | Number of filter invocations |
Attributes (dimensions)
| Attribute | Type | Description | Example |
|---|---|---|---|
| `tokf.filter.name` | string | The filter that was applied | `cargo/build`, `jest/run` |
| `tokf.command` | string | The wrapped command | `cargo build`, `npx jest` |
| `tokf.exit_code` | int | Exit code of the wrapped command | `0`, `1` |
| `tokf.version` | string | tokf version | `0.1.8` |
| `tokf.pipeline` | string | User-supplied pipeline/job identifier | `my-ci-job` |
| `service.name` | string | Defaults to `tokf`, configurable | `tokf` |
Configuration
OTel export should be entirely opt-in, configurable via ~/.tokf/config.toml, CLI flags, and/or environment variables.
Config file
```toml
[telemetry.otlp]
enabled = true
endpoint = "http://localhost:4317"   # OTel Collector gRPC endpoint
protocol = "grpc"                    # "grpc" | "http"
# headers = { "api-key" = "secret" } # optional auth headers

[telemetry.otlp.resource]
service_name = "tokf"
# deployment.environment = "production"
```

Environment variables (override config)
Standard OTel env vars should be respected:
```shell
# Enable OTLP export
tokf --otel-export

# Configure endpoint (defaults to localhost:4317 for gRPC, localhost:4318 for HTTP)
OTEL_EXPORTER_OTLP_ENDPOINT=https://otel-collector.internal:4317

# Protocol selection
OTEL_EXPORTER_OTLP_PROTOCOL=grpc  # or http/protobuf

# Custom headers (for vendor auth)
OTEL_EXPORTER_OTLP_HEADERS=x-api-key=secret123

# Resource attributes
OTEL_RESOURCE_ATTRIBUTES=service.name=tokf,deployment.environment=production

# Master toggle
TOKF_TELEMETRY_ENABLED=true

# tokf-specific
TOKF_OTEL_PIPELINE=my-ci-job
```

Default: disabled
When no config is set and TOKF_TELEMETRY_ENABLED is not true, tokf behaves exactly as it does today — no OTel dependencies are loaded, no network calls are made.
Architecture
- Add a `telemetry` module with a trait `TelemetryReporter` and two implementations:
  - `NoopReporter` — used when OTel is disabled (compiles to nothing when the `otel` feature is off)
  - `OtelReporter` — initializes the OTel `SdkMeterProvider` with an OTLP `MetricExporter` and records metrics
- The reporter is initialized once at startup based on config and passed to the filter execution path
- Metrics are recorded after each filter invocation
- The `MeterProvider` is shut down gracefully on process exit to flush pending metrics
Graceful Degradation
If the OTel endpoint is unreachable, tokf must never block or slow down the wrapped command. The OTel SDK's periodic exporter handles this (failed exports are logged and retried). tokf should set a short export timeout (e.g. 5s) and move on.
Rust Implementation
Key crates:
- `opentelemetry` (v0.28+) — API traits
- `opentelemetry_sdk` (v0.28+) — SDK implementation with `MeterProvider`
- `opentelemetry-otlp` (v0.31+) — OTLP exporter via HTTP (`http-proto` feature) or gRPC (`grpc-tonic` feature)
Homebrew Distribution Strategy
Homebrew doesn't support Cargo feature flags at install time — each formula produces a single binary. The recommended approach is to build the Homebrew formula with OTel enabled by default, since telemetry export is already gated at runtime behind explicit flags (--otel-export / OTEL_EXPORTER_OTLP_ENDPOINT / TOKF_TELEMETRY_ENABLED).
Recommended: OTel-enabled formula (runtime opt-in)
```ruby
class Tokf < Formula
  desc "Token context compression for LLM pipelines"
  homepage "https://github.com/mpecan/tokf"
  # ...
  depends_on "rust" => :build

  def install
    system "cargo", "install", *std_cargo_args, "--features", "otel"
  end
end
```

Users who don't set any OTel environment variables or pass `--otel-export` will see zero behavioral difference — the OTel code paths are never activated. The only cost is a slightly larger binary (expected ~2-5 MB from the OTel + HTTP client dependency tree).
Alternative: Separate formulae (if binary size is a concern)
If benchmarking shows an unacceptable binary size increase, offer two formulae:
```ruby
# tokf.rb — lean, no OTel
class Tokf < Formula
  def install
    system "cargo", "install", *std_cargo_args
  end
end

# tokf-otel.rb — with OTel support
class TokfOtel < Formula
  conflicts_with "tokf", because: "both install a `tokf` binary"

  def install
    system "cargo", "install", *std_cargo_args, "--features", "otel"
  end
end
```

Users choose at install time:

```shell
brew install tokf       # lean
brew install tokf-otel  # with telemetry support
```

Decision criteria
Measure the binary size delta before deciding:
```shell
# Without OTel
cargo build --release && ls -lh target/release/tokf

# With OTel
cargo build --release --features otel && ls -lh target/release/tokf
```

If the delta is under ~5 MB, the single OTel-enabled formula is the pragmatic choice (this is the pattern ripgrep follows with PCRE2 in its Homebrew formula). If larger, consider the separate formulae approach.
cargo install users
Users installing via cargo install can always choose:
```shell
# Lean install
cargo install tokf

# With OTel
cargo install tokf --features otel

# With OTel + gRPC
cargo install tokf --features otel,otel-grpc
```

Integration Examples
Datadog via OTel Collector
```yaml
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

exporters:
  datadog:
    api:
      key: ${DD_API_KEY}

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [datadog]
```

Prometheus (via push gateway)
```shell
OTEL_EXPORTER_OTLP_ENDPOINT=http://prometheus-pushgateway:4318 tokf --otel-export cargo build 2>&1
```

Grafana Cloud
```shell
OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-prod.grafana.net/otlp \
OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic $(echo -n 'instance_id:api_key' | base64)" \
tokf --otel-export cargo build 2>&1
```

Relationship to #20
This issue offers a complementary approach to the org server proposed in #20:
- OTel exporter: standards-based, zero infrastructure for orgs already running OTel Collector or a compatible backend
- Org server (#20): custom aggregation, purpose-built dashboards, potentially simpler for teams without existing observability infrastructure
Both can coexist — the OTel exporter ships metrics to whatever backend the org uses, while the org server (if built) could itself accept OTLP input from tokf instances.
Acceptance Criteria
- [ ] `otel` Cargo feature flag compiles in OTel dependencies only when enabled
- [ ] `TelemetryReporter` trait with `NoopReporter` and `OtelReporter` implementations
- [ ] `OtelReporter` initializes `SdkMeterProvider` with the OTLP exporter (gRPC and HTTP)
- [ ] All tokf-specific metrics recorded after each filter invocation
- [ ] Standard `gen_ai.client.token.usage` metric emitted with correct attributes
- [ ] Standard OTel env vars (`OTEL_EXPORTER_OTLP_ENDPOINT`, etc.) are respected
- [ ] `~/.tokf/config.toml` `[telemetry.otlp]` section parsed and applied
- [ ] Master toggle `TOKF_TELEMETRY_ENABLED` controls opt-in
- [ ] Default behavior (no config) = no OTel, no network calls, no extra dependencies
- [ ] OTel endpoint failures do not block or slow the wrapped command
- [ ] Graceful shutdown flushes pending metrics
- [ ] Homebrew formula builds with OTel enabled (runtime opt-in via `--otel-export`)
- [ ] Binary size delta with OTel is documented and acceptable
- [ ] Integration test: metrics emitted to an in-memory exporter with correct attributes
- [ ] README section documenting OTel setup with example Collector + Grafana/Prometheus config
References
- OTel GenAI Semantic Conventions — Metrics
- OTel Rust SDK
- `opentelemetry-otlp` crate (v0.31.0)
- OTel GenAI blog post
- Datadog OTel GenAI SemConv support
Note: This issue supersedes #82 which covered the same feature.