cli: /usage (alias /cost) REPL command #29

Open
akrentsel wants to merge 2 commits into feature/message-cost-tracking from usage-command

Conversation

@akrentsel
Collaborator

Summary

Adds a /usage command (alias /cost) to the chat REPL so you can see, at any point in a conversation, what it has cost so far. This is the demo surface for the per-message cost tracking in #16.

Example output:

usage (this conversation):
  model calls : 3
  input       : 14,500 tokens (1,200 cached, 800 cache-write)
  output      : 2,100 tokens (450 reasoning)
  cost        : $0.0182
  model time  : 8.3s

If some calls used a model not in the LiteLLM price table, their tokens still count but they're excluded from the dollar total, with an honest note:

  note        : 1 of 3 call(s) had no price in the table; cost is a partial total

How it works

  • New /usage / /cost branch in the REPL input loop (tui.rs), alongside the existing /history, /quit, /exit.
  • print_usage reads the conversation's messages events via get_events and folds the UsageRecord on each — no new persistence, it's pure read-side.
  • Aggregation/formatting split into pure functions (summarize_usage, render_usage, with_commas) so they're unit-testable without a live model.
  • priced_calls is tracked separately from calls so the partial-total case is reported rather than silently undercounted.
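The read-side fold described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `UsageRecord` fields shown here, the `Option<&UsageRecord>` event shape, and the `partial_total_note` helper are assumptions made for the example.

```rust
/// Hypothetical, trimmed-down stand-in for the `UsageRecord` from #16.
#[derive(Debug, Clone, Default)]
struct UsageRecord {
    input_tokens: u64,
    output_tokens: u64,
    /// `None` when the model has no entry in the price table (assumed shape).
    cost_usd: Option<f64>,
    duration_ms: u64,
}

#[derive(Debug, Default, PartialEq)]
struct UsageSummary {
    calls: u64,
    priced_calls: u64,
    input_tokens: u64,
    output_tokens: u64,
    cost_usd: f64,
    duration_ms: u64,
}

/// Folds the usage attached to each event. Events without usage
/// (for example user input) are passed as `None` and skipped.
fn summarize_usage<'a, I>(events: I) -> UsageSummary
where
    I: IntoIterator<Item = Option<&'a UsageRecord>>,
{
    let mut summary = UsageSummary::default();
    for usage in events.into_iter().flatten() {
        summary.calls += 1;
        summary.input_tokens += usage.input_tokens;
        summary.output_tokens += usage.output_tokens;
        summary.duration_ms += usage.duration_ms;
        // Unpriced calls still count above, but add nothing to the dollar total.
        if let Some(cost) = usage.cost_usd {
            summary.priced_calls += 1;
            summary.cost_usd += cost;
        }
    }
    summary
}

/// Returns the partial-total note only when at least one call was unpriced.
fn partial_total_note(summary: &UsageSummary) -> Option<String> {
    let unpriced = summary.calls - summary.priced_calls;
    (unpriced > 0).then(|| {
        format!(
            "{unpriced} of {} call(s) had no price in the table; cost is a partial total",
            summary.calls
        )
    })
}
```

Keeping `priced_calls` next to `calls` is what lets the renderer decide whether the note is needed without re-reading the event log.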

Stacking

Stacked on #16 (feature/message-cost-tracking) — this reads the UsageRecord that #16 introduces, so its base is that branch, not main. Will retarget to main once #16 merges. Also adds a one-line re-export of UsageRecord from executor for the CLI.

Test plan

  • cargo test --workspace — +4 unit tests in tui::tests:
    • summarize_usage_aggregates_priced_calls_and_skips_usageless_events (user-input events without usage are ignored; totals sum correctly)
    • summarize_usage_counts_unpriced_calls_separately (None cost → counted but not priced; partial-total note rendered)
    • render_usage_reports_empty_conversation
    • with_commas_groups_thousands
  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets -- -D warnings
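For readers unfamiliar with what `with_commas_groups_thousands` exercises, a thousands-grouping helper could look like the sketch below. The `u64` signature is an assumption; the actual `with_commas` in `tui.rs` may differ.

```rust
/// Inserts a comma between each group of three digits, counting from the right.
/// Sketch only: the signature is assumed, not taken from the PR.
fn with_commas(n: u64) -> String {
    let digits = n.to_string();
    let mut out = String::with_capacity(digits.len() + digits.len() / 3);
    for (i, ch) in digits.chars().enumerate() {
        // A comma goes before any digit that starts a full group of three.
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push(',');
        }
        out.push(ch);
    }
    out
}
```

Because the helper is a pure function of its input, it can be tested without a model or an event log, which is the point of the split described in the summary.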

Not in scope

  • Per-turn breakdown / live inline display after each turn (this is a cumulative on-demand summary).
  • A REPL command help banner (none exists today; /usage follows the same undocumented-but-discoverable convention as /history).

🤖 Generated with Claude Code

Summarizes the UsageRecord data on a conversation's model responses:
model-call count, input/output tokens (with cached + cache-write
breakdown), cumulative USD cost, and cumulative model wall-time. Reads
straight from the event log via get_events, so it reflects everything
persisted for the conversation.

When some calls have no cost (model absent from the LiteLLM price table),
they're counted but excluded from the dollar total, and a note reports
the partial count rather than silently undercounting.

Aggregation and formatting are pure functions (summarize_usage,
render_usage, with_commas) with unit tests covering multi-call totals,
the unpriced-call note, the empty conversation, and thousands grouping.

Stacked on the per-message cost-tracking PR (adds the UsageRecord this
reads); also re-exports UsageRecord from executor for the CLI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@akrentsel akrentsel marked this pull request as ready for review May 27, 2026 17:33
Comment thread crates/cli/src/tui.rs
/// `cost_usd` — so a partial total (some models missing from the price
/// table) can be reported honestly rather than silently undercounting.
#[derive(Debug, Default, PartialEq)]
struct UsageSummary {
Owner

why not just reuse the const struct from Lingua? The only difference appears to be the priced_calls counter, which seems like it could be derived, or you could make a composite that embeds the Lingua struct?

Pushing on this a bit because usage definitions are changing rapidly and getting more complicated (e.g. Anthropic recently introduced tiers of cached tokens)

Collaborator Author

yup, that makes sense, can just embed it here + add costs. done.
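A composite of the kind suggested here might look like the following sketch. The `UniversalUsage` below is a local stand-in with assumed field names (the real lingua struct is not shown in this thread and may differ), and `add_tokens` and `unpriced_calls` are illustrative helpers.

```rust
/// Stand-in for lingua's usage struct. Field names are assumptions.
#[derive(Debug, Default, Clone, PartialEq)]
struct UniversalUsage {
    input_tokens: u64,
    cached_tokens: u64,
    cache_write_tokens: u64,
    output_tokens: u64,
    reasoning_tokens: u64,
}

/// Token totals live in the embedded struct; cost bookkeeping stays alongside it.
#[derive(Debug, Default, PartialEq)]
struct UsageSummary {
    tokens: UniversalUsage,
    calls: u64,
    priced_calls: u64,
    cost_usd: f64,
    duration_ms: u64,
}

/// Adds one call's token counts into the running total.
fn add_tokens(total: &mut UniversalUsage, call: &UniversalUsage) {
    total.input_tokens += call.input_tokens;
    total.cached_tokens += call.cached_tokens;
    total.cache_write_tokens += call.cache_write_tokens;
    total.output_tokens += call.output_tokens;
    total.reasoning_tokens += call.reasoning_tokens;
}

impl UsageSummary {
    /// The unpriced count is derived from the two counters rather than stored.
    fn unpriced_calls(&self) -> u64 {
        self.calls.saturating_sub(self.priced_calls)
    }
}
```

With this shape, a new token category added upstream only requires touching the token struct and `add_tokens`, not the cost fields.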

Contributor

@Alexsun1one Alexsun1one left a comment

clean read-side feature, the priced_calls / partial-total note is a nice touch for honesty. one small UX thing inline on the cost formatting.

Comment thread crates/cli/src/tui.rs
format!("  input       : {input}"),
format!("  output      : {output}"),
format!("  cost        : ${:.4}", summary.cost_usd),
format!(
Contributor

${:.4} looks fine for the example values but rounds small costs to misleading shapes:

  • a cached call ending in $0.00004 prints cost : $0.0000, which reads as free
  • a $0.00005001 call prints $0.0001, off by ~2x in display terms

few options if it's worth caring about:

  • dynamic precision: 4 digits when >= $0.001, 6 digits below
  • swap the unit when total < $0.01 (e.g. 0.54 ¢ or 5,432 micro-USD)
  • print exact {:e} for very small totals

not blocking, just flagging because the partial-total note already shows that getting cost honest matters here.
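The rounding behaviour described in this comment, and the first option listed (dynamic precision), can be reproduced with a short sketch. Both function names are hypothetical and neither is taken from the PR.

```rust
/// The fixed four-decimal format under discussion.
fn fixed4(cost_usd: f64) -> String {
    format!("${:.4}", cost_usd)
}

/// The "dynamic precision" option: 4 digits at or above $0.001, 6 below.
fn dynamic_precision(cost_usd: f64) -> String {
    if cost_usd >= 0.001 {
        format!("${:.4}", cost_usd)
    } else {
        format!("${:.6}", cost_usd)
    }
}
```

Under fixed formatting, `fixed4(0.00004)` yields `$0.0000` and `fixed4(0.00005001)` yields `$0.0001`, matching the two cases above. With the dynamic variant, `dynamic_precision(0.00004)` yields `$0.000040`, while a larger total such as `0.0182` still prints as `$0.0182`.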

Collaborator Author

thanks, addressed

…sion

Review feedback on #29:

- UsageSummary now embeds lingua's UniversalUsage for the token totals
  instead of re-declaring the five token fields, so the field list stays
  sourced from one place as usage definitions evolve (ankrgyl). calls /
  priced_calls / cost_usd / duration_ms remain siblings — cost is ours, not
  lingua's. summarize_usage folds via a small add_tokens helper.

- format_cost replaces `${:.4}`, which rounded sub-$0.0001 totals to a
  misleading `$0.0000` (Alexsun1one). Now: 4 decimals at/above $0.0001;
  below that, extend precision just enough to show the first significant
  digit (e.g. $0.00000724 -> $0.000007). Tested against the example values.
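One way to implement the rule described for `format_cost` is sketched below. It is an illustration of the stated behaviour (four decimals at or above $0.0001, otherwise just enough decimals to reach the first significant digit), not the code from the commit. Falling back to four decimals for zero, negative, and non-finite inputs is an assumption.

```rust
/// Sketch of the described rule. The handling of zero, negative,
/// and non-finite values is an assumption for this example.
fn format_cost(cost_usd: f64) -> String {
    if !cost_usd.is_finite() || cost_usd <= 0.0 || cost_usd >= 0.0001 {
        return format!("${:.4}", cost_usd);
    }
    // For 0 < cost < 0.0001, -log10(cost) is greater than 4; rounding it up
    // gives the number of decimals needed to reach the first significant digit.
    let decimals = (-cost_usd.log10()).ceil() as usize;
    format!("${:.*}", decimals, cost_usd)
}
```

For the example in the commit message, `-log10(0.00000724)` is about 5.14, so six decimals are used and the value rounds to `$0.000007`.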
@akrentsel akrentsel requested a review from ankrgyl May 28, 2026 05:27
@akrentsel
Collaborator Author

updated to address both comments

3 participants