cli: /usage (alias /cost) REPL command #29
Summarizes the UsageRecord data on a conversation's model responses: model-call count, input/output tokens (with cached + cache-write breakdown), cumulative USD cost, and cumulative model wall-time. Reads straight from the event log via get_events, so it reflects everything persisted for the conversation.

When some calls have no cost (model absent from the LiteLLM price table), they're counted but excluded from the dollar total, and a note reports the partial count rather than silently undercounting.

Aggregation and formatting are pure functions (summarize_usage, render_usage, with_commas) with unit tests covering multi-call totals, the unpriced-call note, the empty conversation, and thousands grouping.

Stacked on the per-message cost-tracking PR (adds the UsageRecord this reads); also re-exports UsageRecord from executor for the CLI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
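For illustration only, here is a minimal sketch of the kind of thousands grouping a helper like `with_commas` performs. The signature (`u64` in, `String` out) is an assumption; the PR's actual implementation is not shown on this page.

```rust
/// Sketch of thousands grouping. The `u64 -> String` signature is assumed,
/// not taken from the PR.
fn with_commas(n: u64) -> String {
    let digits = n.to_string();
    let mut out = String::with_capacity(digits.len() + digits.len() / 3);
    for (i, ch) in digits.chars().enumerate() {
        // Insert a separator whenever the remaining digit count is a
        // multiple of three (but never before the first digit).
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push(',');
        }
        out.push(ch);
    }
    out
}
```

Under this sketch, `with_commas(1234567)` yields `1,234,567` and values below 1,000 pass through unchanged.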
/// `cost_usd` — so a partial total (some models missing from the price
/// table) can be reported honestly rather than silently undercounting.
#[derive(Debug, Default, PartialEq)]
struct UsageSummary {
why not just reuse the const struct from Lingua? The only difference appears to be the priced_calls counter which seems like it could be derived, or you could make a composite that embeds the lingua struct?
Pushing on this a bit because usage definitions are changing rapidly and getting more complicated (e.g. Anthropic recently introduced tiers of cached tokens)
yup, that makes sense, can just embed it here + add costs. done.
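A minimal sketch of the composite shape discussed in this thread, not the code in the PR. `TokenTotals` is a hypothetical stand-in for lingua's `UniversalUsage` (its real field list is not shown here), and `CallUsage` is a hypothetical per-call record loosely modeled on `UsageRecord`. Only `calls`, `priced_calls`, `cost_usd`, `duration_ms`, `summarize_usage`, and `add_tokens` are names that appear in the PR.

```rust
/// Hypothetical stand-in for lingua's `UniversalUsage`.
#[derive(Debug, Default, PartialEq)]
struct TokenTotals {
    input_tokens: u64,
    output_tokens: u64,
}

/// Hypothetical per-call record (assumed shape, not the PR's `UsageRecord`).
struct CallUsage {
    tokens: TokenTotals,
    /// `None` when the model is missing from the price table.
    cost_usd: Option<f64>,
    duration_ms: u64,
}

/// Token totals are embedded; cost bookkeeping stays alongside as siblings.
#[derive(Debug, Default, PartialEq)]
struct UsageSummary {
    tokens: TokenTotals,
    calls: u64,
    priced_calls: u64,
    cost_usd: f64,
    duration_ms: u64,
}

/// Adds one call's token counts into the running totals.
fn add_tokens(total: &mut TokenTotals, call: &TokenTotals) {
    total.input_tokens += call.input_tokens;
    total.output_tokens += call.output_tokens;
}

/// Folds per-call records into a summary. Unpriced calls still count toward
/// `calls`, tokens, and duration, but not toward `priced_calls` or `cost_usd`.
fn summarize_usage<'a>(records: impl IntoIterator<Item = &'a CallUsage>) -> UsageSummary {
    let mut summary = UsageSummary::default();
    for record in records {
        summary.calls += 1;
        add_tokens(&mut summary.tokens, &record.tokens);
        summary.duration_ms += record.duration_ms;
        if let Some(cost) = record.cost_usd {
            summary.priced_calls += 1;
            summary.cost_usd += cost;
        }
    }
    summary
}
```

With this shape, a renderer can detect the partial-total case by checking `priced_calls < calls`, and a change to the embedded token struct only touches `add_tokens`.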
Alexsun1one left a comment
clean read-side feature, the priced_calls / partial-total note is a nice touch for honesty. one small UX thing inline on the cost formatting.
format!(" input : {input}"),
format!(" output : {output}"),
format!(" cost : ${:.4}", summary.cost_usd),
format!(
`${:.4}` looks fine for the example values but rounds small costs to misleading shapes:
- a cached call ending in $0.00004 prints `cost : $0.0000`, which reads as free
- a $0.00005001 call prints `$0.0001`, off by ~2x in display terms

few options if it's worth caring about:
- dynamic precision: 4 digits when >= $0.001, 6 digits below
- swap the unit when total < $0.01 (e.g. `42.5 ¢` or `5,432 micro-USD`)
- print exact `{:e}` for very small totals

not blocking, just flagging because the partial-total note already shows that getting cost honest matters here.
thanks, addressed
…sion

Review feedback on #29:
- UsageSummary now embeds lingua's UniversalUsage for the token totals instead of re-declaring the five token fields, so the field list stays sourced from one place as usage definitions evolve (ankrgyl). calls / priced_calls / cost_usd / duration_ms remain siblings; cost is ours, not lingua's. summarize_usage folds via a small add_tokens helper.
- format_cost replaces `${:.4}`, which rounded sub-$0.0001 totals to a misleading `$0.0000` (Alexsun1one). Now: 4 decimals at/above $0.0001; below that, extend precision just enough to show the first significant digit (e.g. $0.00000724 -> $0.000007). Tested against the example values.
updated to address both comments
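A minimal sketch of the precision rule described in the follow-up commit (4 decimals at or above $0.0001, otherwise just enough decimals to reach the first significant digit). This is an assumption about how such a `format_cost` could be written, not the PR's actual code. The handling of zero, negative, and NaN inputs is a choice made here for the sketch.

```rust
/// Sketch of dynamic cost precision. The `f64 -> String` signature and the
/// edge-case handling are assumptions, not taken from the PR.
fn format_cost(cost_usd: f64) -> String {
    // Zero, negative, NaN, and anything at or above $0.0001 keep the
    // fixed four-decimal form.
    if !(cost_usd > 0.0 && cost_usd < 0.0001) {
        return format!("${:.4}", cost_usd);
    }
    // For tiny positive values, -log10 rounded up is the number of decimal
    // places needed to reach the first significant digit.
    let decimals = (-cost_usd.log10()).ceil() as usize;
    // `{:.*}` takes the precision first, then the value to format.
    format!("${:.*}", decimals, cost_usd)
}
```

Under this sketch, the commit's example `0.00000724` renders as `$0.000007`, and the reviewer's `$0.00004` case renders as `$0.00004` instead of `$0.0000`.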
Summary
Adds a `/usage` command (alias `/cost`) to the chat REPL so you can see, at any point in a conversation, what it has cost so far. This is the demo surface for the per-message cost tracking in #16.

Example output:
If some calls used a model not in the LiteLLM price table, their tokens still count but they're excluded from the dollar total, with an honest note:
How it works
- `/usage` / `/cost` branch in the REPL input loop (`tui.rs`), alongside the existing `/history`, `/quit`, `/exit`.
- `print_usage` reads the conversation's `messages` events via `get_events` and folds the `UsageRecord` on each. No new persistence; it's pure read-side.
- Aggregation and formatting are pure functions (`summarize_usage`, `render_usage`, `with_commas`) so they're unit-testable without a live model.
- `priced_calls` is tracked separately from `calls` so the partial-total case is reported rather than silently undercounted.

Stacking
Stacked on #16 (`feature/message-cost-tracking`): this reads the `UsageRecord` that #16 introduces, so its base is that branch, not `main`. Will retarget to `main` once #16 merges. Also adds a one-line re-export of `UsageRecord` from `executor` for the CLI.

Test plan
- `cargo test --workspace`: +4 unit tests in `tui::tests`:
  - `summarize_usage_aggregates_priced_calls_and_skips_usageless_events` (user-input events without usage are ignored; totals sum correctly)
  - `summarize_usage_counts_unpriced_calls_separately` (None cost → counted but not priced; partial-total note rendered)
  - `render_usage_reports_empty_conversation`
  - `with_commas_groups_thousands`
- `cargo fmt --all -- --check`
- `cargo clippy --workspace --all-targets -- -D warnings`

Not in scope
`/usage` follows the same undocumented-but-discoverable convention as `/history`).

🤖 Generated with Claude Code