Commit f1f909d

fix: always populate mot.usage in HuggingFace backend (#694)
Token count extraction in _post_process_async was gated behind `span is not None or metrics_enabled`, so mot.usage was never populated in plain (non-telemetry) runs. It is now extracted unconditionally, because usage is a standard mot field, not a telemetry concern.
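
To make the behavior change concrete, here is a minimal sketch of the old and new gating. `FakeOutput`, `FakeMot`, `populate_usage`, and the usage dict keys are hypothetical stand-ins for illustration, not the real mellea or transformers types; only the shape of the condition mirrors the commit.

```python
class FakeOutput:
    """Hypothetical stand-in for GenerateDecoderOnlyOutput."""

    def __init__(self, n_prompt, n_completion):
        self.n_prompt = n_prompt
        self.n_completion = n_completion


class FakeMot:
    """Hypothetical stand-in for mot; models only _meta and usage."""

    def __init__(self, meta):
        self._meta = meta
        self.usage = None


def populate_usage(mot, metrics_enabled, gate_on_telemetry):
    """Set mot.usage, optionally applying the pre-fix telemetry gate."""
    span = mot._meta.get("_telemetry_span")
    hf_output = mot._meta.get("hf_output")

    # Old behavior: only extract when a span exists or metrics are enabled.
    # New behavior: always extract.
    wanted = (span is not None or metrics_enabled) if gate_on_telemetry else True

    if wanted and isinstance(hf_output, FakeOutput):
        # The key names below are an assumption made for this sketch.
        mot.usage = {
            "prompt_tokens": hf_output.n_prompt,
            "completion_tokens": hf_output.n_completion,
            "total_tokens": hf_output.n_prompt + hf_output.n_completion,
        }
    return mot.usage


# Plain run: no telemetry span, metrics disabled.
old = populate_usage(FakeMot({"hf_output": FakeOutput(5, 3)}), False, True)
new = populate_usage(FakeMot({"hf_output": FakeOutput(5, 3)}), False, False)
print(old)  # None: the gate skipped extraction
print(new)  # usage dict is populated
```

With the gate in place, a plain run leaves usage as `None`; without it, the same run gets token counts.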
1 parent 62016fb commit f1f909d

1 file changed

Lines changed: 3 additions & 7 deletions

mellea/backends/huggingface.py

@@ -1133,16 +1133,12 @@ class used during generation, if any.
         )
 
         span = mot._meta.get("_telemetry_span")
-        from ..telemetry.metrics import is_metrics_enabled
 
-        metrics_enabled = is_metrics_enabled()
-
-        # Extract token counts only if needed
+        # Extract token counts from the HF output sequences.
+        # Always computed (usage is a standard mot field, not a telemetry concern).
         hf_output = mot._meta.get("hf_output")
         n_prompt, n_completion = None, None
-        if (span is not None or metrics_enabled) and isinstance(
-            hf_output, GenerateDecoderOnlyOutput
-        ):
+        if isinstance(hf_output, GenerateDecoderOnlyOutput):
             # HuggingFace local models don't provide usage objects, but we can
             # calculate token counts from sequences
             try:
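
The hunk stops at `try:`, so the calculation itself is not shown. As a rough sketch of what "calculate token counts from sequences" can look like: for decoder-only models, the sequences returned by `generate` contain the prompt tokens followed by the generated tokens, so the completion length is the total length minus the prompt length. The helper `count_tokens` and its plain-list inputs are assumptions for illustration, not the code in `huggingface.py`.

```python
def count_tokens(input_ids, sequences):
    """Return (n_prompt, n_completion) for the first sequence in a batch.

    input_ids: list of prompt token-id lists, one per batch item.
    sequences: list of full output token-id lists (prompt + completion).
    Assumes a decoder-only model, where each output starts with its prompt.
    Padding tokens, if any, are counted like any other token.
    """
    n_prompt = len(input_ids[0])
    n_completion = len(sequences[0]) - n_prompt
    return n_prompt, n_completion


prompt = [[101, 7592, 2088]]                 # 3 prompt tokens
output = [[101, 7592, 2088, 999, 102]]       # prompt plus 2 generated tokens
print(count_tokens(prompt, output))          # (3, 2)
```

With real tensors the same idea would typically use the second dimension of the tensor shape instead of `len`.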
