feat: respect provider context_budget_cap for cost-optimal compaction#8

Open
sadlilas wants to merge 1 commit into microsoft:main from sadlilas:feat/provider-budget-cap

Conversation

@sadlilas

Summary

Reads context_budget_cap from ProviderInfo.defaults and caps the compaction budget at that threshold. This lets providers with tiered pricing (Anthropic, Gemini) signal their cost-optimal context limit without affecting providers with flat pricing (OpenAI, Ollama).

Why

The previous approach (PR #5, fraction-based) applied a blanket budget = context_window * 0.15 to ALL providers when the context window exceeded 200k. This penalized OpenAI (flat pricing, 400k window) for an Anthropic/Gemini-specific pricing concern:

| Provider + Model | context_window | Fraction budget (PR #5) | Provider-cap budget (this PR) |
|---|---|---|---|
| Anthropic 1M | 1,000,000 | ~146k | ~196k (capped) |
| Gemini | 1,048,576 | ~153k | ~196k (capped) |
| OpenAI GPT-5 | 400,000 | ~56k (terrible) | ~364k (full window) |
| Copilot gpt-5.1 | 264,000 | ~36k (terrible) | ~228k (full window) |

What changes

In _calculate_budget(), after computing the raw budget from context_window - reserved_output - safety_margin:

```python
budget_cap = defaults.get("context_budget_cap")
if budget_cap is not None:
    budget = min(budget, budget_cap - safety_margin)
```

Applied in both code paths:

  1. get_model_info() path -- reads cap from provider.get_info().defaults
  2. get_info().defaults path -- reads cap from the same defaults dict already in scope

Providers that don't set context_budget_cap are unaffected -- the standard formula applies unchanged.

Design

This is the consumer side of the provider-driven budget cap. See amplifier-support#57 analysis for the full rationale.

  • "Follow the Data" -- pricing knowledge originates in the provider and flows to the consumer
  • "Compose Before You Create" -- uses the existing defaults dict, no new protocols or kernel fields
  • Zero kernel changes -- convention in modules, not kernel contract

Related PRs

  • provider-anthropic#27: Emits context_budget_cap: 200_000 when 1M context is active
  • provider-gemini: Emits context_budget_cap: 200_000 (all models have the pricing cliff)
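On the provider side, opting in is a single extra key in the existing defaults dict. A hypothetical sketch of what the Anthropic-style emission might look like (the key name comes from this PR; the surrounding function shape and other defaults are assumptions):

```python
# Hypothetical provider-side sketch: the only contract with the consumer
# is the "context_budget_cap" key inside the existing defaults dict.
def build_defaults(one_million_context: bool) -> dict:
    defaults = {"max_tokens": 8_192}  # assumed pre-existing defaults
    if one_million_context:
        # 1M window is active, but the pricing cliff sits at 200k tokens:
        # advertise the cost-optimal limit while keeping the full window
        # available as a safety net.
        defaults["context_budget_cap"] = 200_000
    return defaults
```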

Supersedes

  • PR #5 (fraction-based compaction budget)

Related issues

  • microsoft-amplifier/amplifier-support#57

🤖 Generated with Amplifier

Read context_budget_cap from provider defaults and cap the computed
budget. Providers with pricing cliffs (Anthropic, Gemini) set this to
keep sessions in the standard pricing zone while preserving the full
context window as a safety net. Providers without pricing cliffs
(OpenAI) don't set the key and are unaffected.

Applied in both the get_model_info() and get_info().defaults code paths.

Replaces the fraction-based approach (PR microsoft#5) which applied uniformly to
all providers regardless of their pricing model.

Related: microsoft-amplifier/amplifier-support#57

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
