Problem
All inference responses return zero token counts (prompt_tokens=0, completion_tokens=0, total_tokens=0) in client.queryFinalInfo, regardless of actual token consumption.
Root Cause
The vLLM worker returns "usage": null in every SSE chunk, including the final one with finish_reason: "stop". The proxy's ByteTokenCounter parses the SSE JSON looking for usage.prompt_tokens and the related fields, but because usage is null rather than an object, get_json_value falls back to its default of 0 and every counter stays at zero.
Observed SSE Response
data: {"id":"...","object":"chat.completion.chunk","created":1773050183,
"model":"Qwen/Qwen3-32B",
"choices":[{"index":0,"delta":{"content":"Hello","reasoning_content":null},
"finish_reason":"stop","matched_stop":151645}],
"usage":null}
Every chunk has "usage": null, even though the proxy correctly injects stream_options.include_usage = true in ValidateRequest.cpp:199-201:
if (stream && !has_stream_options) {
  b["stream_options"]["include_usage"] = true;
}

Likely Cause
The vLLM instance serving Qwen/Qwen3-32B does not honor stream_options.include_usage, so no usage object is ever emitted even though the proxy requests one.