Attribute TokenJuice savings cost to the per-turn model, not the configured default #4122

## Summary

TokenJuice's compaction savings dashboard (`tokenjuice.savings_stats`) prices the tokens it saves using the **configured default model** (`config.default_model`), not the model the tool result is actually being compressed *for* on that turn. Attribute savings to the live per-turn model so the cost figure is accurate for sessions that use model overrides, per-agent models, or team lead/worker model splits.

## Problem / Context

The content router compacts a tool result inside the agent loop and records the savings via `savings::record(...)`, which calls `cost_saved_usd(model, ...)` with a process-global attribution model installed once at startup (`tokenjuice::savings::configure` from `config.default_model`).
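The current behavior can be pictured with a minimal sketch. It assumes the attribution model lives in something like a `std::sync::OnceLock`; the issue does not show the real storage, and `ATTRIBUTION_MODEL`, `configure_sketch`, and `attributed_model_sketch` are hypothetical names used only for illustration.

```rust
use std::sync::OnceLock;

/// Hypothetical process-global attribution model, set once at startup.
/// The real storage behind `tokenjuice::savings::configure` may differ.
static ATTRIBUTION_MODEL: OnceLock<String> = OnceLock::new();

/// Stand-in for the startup hook: only the first call takes effect,
/// later calls are silently ignored.
fn configure_sketch(default_model: &str) {
    let _ = ATTRIBUTION_MODEL.set(default_model.to_owned());
}

/// Model a saving is attributed to under the global scheme.
/// The per-turn model is not an input here, which is the gap this issue describes.
fn attributed_model_sketch() -> &'static str {
    ATTRIBUTION_MODEL
        .get()
        .map(String::as_str)
        .unwrap_or("unknown")
}
```

Under this shape, every saving recorded after startup lands on the same model id regardless of which model the turn actually ran on.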

The active per-turn model **is** available deeper in the harness (`run_turn_engine(..., model: &str, ...)`), but it is not threaded down to the compaction call sites (`agent/harness/session/agent_tool_exec.rs`, `agent/harness/engine/tools.rs`), and `AgentToolExecCtx` does not currently carry it. Threading it through both call sites and the turn engine was deferred to keep the initial savings feature small.

Impact today: cost-saved is correct when a session uses the default model, but skewed when:
- a per-turn model override is in effect,
- agents run on different models (lead vs worker, per-agent `model`),
- the result is destined for a cheaper/more-expensive tier than the default.

Token counts (the dominant metric) are unaffected; only the USD figure and the `byModel` breakdown attribution are skewed.
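To make the size of the skew concrete, here is a rough sketch of the pricing step. It assumes cost saved is simply tokens saved times a per-million input price; the real `cost_saved_usd` signature and price table are not shown in this issue, and the model ids and prices below are placeholders, not real pricing.

```rust
/// Hypothetical per-million-input-token prices in USD.
/// Model ids and values are placeholders for illustration only.
fn input_price_per_mtok_sketch(model: &str) -> Option<f64> {
    match model {
        "default-model" => Some(3.0),
        "cheap-worker-model" => Some(0.25),
        _ => None,
    }
}

/// Stand-in for the cost calculation: tokens saved, scaled to millions,
/// times the input price of the model the saving is attributed to.
fn cost_saved_usd_sketch(model: &str, tokens_saved: u64) -> Option<f64> {
    let price = input_price_per_mtok_sketch(model)?;
    Some(tokens_saved as f64 / 1_000_000.0 * price)
}
```

With these placeholder prices, 2,000,000 saved tokens come out at 6.00 USD when attributed to `default-model` but 0.50 USD when attributed to `cheap-worker-model`. The token count is identical in both cases; only the dollar figure moves.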

## Scope (optional)

In scope:
- Thread the active model id from `run_turn_engine` to the tool-execution compaction call sites (extend `AgentToolExecCtx` and the `compact_output` / `compact_tool_output` signatures with an optional model).
- Pass it into `savings::record` so `cost_saved_usd` and the `byModel` bucket use the per-turn model; fall back to the configured default when unknown.
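One possible shape for the in-scope change is sketched below. All types and functions here (`ToolExecCtxSketch`, `SavingsSketch`, `compact_output_sketch`) are hypothetical stand-ins, not the real `AgentToolExecCtx`, `savings::record`, or `compact_output`, and whitespace-separated words stand in for tokens.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the tool-execution context, extended with
/// an optional per-turn model. The real context has other fields.
struct ToolExecCtxSketch<'a> {
    turn_model: Option<&'a str>,
}

/// Hypothetical stand-in for the savings recorder.
struct SavingsSketch {
    default_model: String,
    tokens_saved_by_model: HashMap<String, u64>,
}

impl SavingsSketch {
    fn new(default_model: &str) -> Self {
        Self {
            default_model: default_model.to_owned(),
            tokens_saved_by_model: HashMap::new(),
        }
    }

    /// Prefer the per-turn model; fall back to the configured default
    /// when it is missing or blank. Never panics.
    fn attribution_model<'a>(&'a self, turn_model: Option<&'a str>) -> &'a str {
        match turn_model {
            Some(m) if !m.trim().is_empty() => m,
            _ => &self.default_model,
        }
    }

    /// Add the saving to the bucket of the resolved model.
    fn record(&mut self, turn_model: Option<&str>, tokens_saved: u64) {
        let model = self.attribution_model(turn_model).to_owned();
        *self.tokens_saved_by_model.entry(model).or_insert(0) += tokens_saved;
    }
}

/// Placeholder "compaction": keep the first `keep_words` words and record
/// the dropped words as savings against the model carried by the context.
fn compact_output_sketch(
    ctx: &ToolExecCtxSketch<'_>,
    savings: &mut SavingsSketch,
    raw: &str,
    keep_words: usize,
) -> String {
    let words: Vec<&str> = raw.split_whitespace().collect();
    let kept = words.len().min(keep_words);
    let saved = (words.len() - kept) as u64;
    if saved > 0 {
        savings.record(ctx.turn_model, saved);
    }
    words[..kept].join(" ")
}
```

Keeping the model optional on the context means call sites that cannot supply it yet keep today's behavior, so the two call sites could be migrated independently.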

Out of scope:
- Output-token cost modeling and re-send amplification (a tool result re-enters context on every subsequent turn). The current cost model counts a single input occurrence, which is the conservative estimate.

## Acceptance criteria

- [ ] **Per-turn attribution** — savings recorded during a turn are priced with that turn's model, not the global default.
- [ ] **Graceful fallback** — when the per-turn model is unavailable, attribution falls back to `config.default_model` (current behavior) without panicking.
- [ ] **byModel breakdown** — `tokenjuice.savings_stats` `byModel` reflects the real mix of models across a multi-model session.
- [ ] **Diff coverage ≥ 80%** — the implementing PR meets the changed-lines coverage gate (Vitest + cargo-llvm-cov, enforced by [.github/workflows/pr-ci.yml](../../.github/workflows/pr-ci.yml)).

## Related

- Follow-up to the TokenJuice content-router / savings work (PR in flight on branch `feat/tokenjuice-content-router`).
- Code: `src/openhuman/tokenjuice/savings.rs`, `src/openhuman/tokenjuice/compress.rs`, `src/openhuman/agent/harness/session/agent_tool_exec.rs`, `src/openhuman/agent/harness/engine/tools.rs`, `src/openhuman/agent/harness/engine/core.rs`.
