feat: respect provider context_budget_cap for cost-optimal compaction #8
Open
sadlilas wants to merge 1 commit into microsoft:main from
Read context_budget_cap from provider defaults and cap the computed budget. Providers with pricing cliffs (Anthropic, Gemini) set this to keep sessions in the standard pricing zone while preserving the full context window as a safety net. Providers without pricing cliffs (OpenAI) don't set the key and are unaffected.

Applied in both the get_model_info() and get_info().defaults code paths.

Replaces the fraction-based approach (PR microsoft#5), which applied uniformly to all providers regardless of their pricing model.

Related: microsoft-amplifier/amplifier-support#57

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Summary
Reads `context_budget_cap` from `ProviderInfo.defaults` and caps the compaction budget at that threshold. This lets providers with tiered pricing (Anthropic, Gemini) signal their cost-optimal context limit without affecting providers with flat pricing (OpenAI, Ollama).

Why
The previous approach (PR #5, fraction-based) applied a blanket `budget = context_window * 0.15` to ALL providers when the context window exceeded 200k. This penalized OpenAI (flat pricing, 400k window) for an Anthropic/Gemini-specific pricing concern.

What changes
In `_calculate_budget()`, after computing the raw budget from `context_window - reserved_output - safety_margin`, the result is capped at `context_budget_cap` when the provider sets that key.

Applied in both code paths:

- `get_model_info()` path -- reads cap from `provider.get_info().defaults`
- `get_info().defaults` path -- reads cap from the same `defaults` dict already in scope

Providers that don't set `context_budget_cap` are unaffected -- the standard formula applies unchanged.

Design
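The capping step described under What changes can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: the function name `calculate_budget`, its parameters, and the numeric values in the usage lines are assumptions made for the example. Only the `context_budget_cap` key, the `defaults` dict, and the raw-budget formula come from the description above.

```python
from typing import Any, Mapping


def calculate_budget(
    context_window: int,
    reserved_output: int,
    safety_margin: int,
    defaults: Mapping[str, Any],
) -> int:
    """Hypothetical stand-in for _calculate_budget(); illustrative only."""
    # Raw budget, following the formula given in the PR description.
    raw_budget = context_window - reserved_output - safety_margin

    # Providers with a pricing cliff advertise a cap in their defaults;
    # providers with flat pricing simply omit the key.
    cap = defaults.get("context_budget_cap")
    if cap is None:
        return raw_budget

    # The cap only ever lowers the budget; it never raises it.
    return min(raw_budget, int(cap))


# Illustrative values only (reserved_output and safety_margin are made up).
capped = calculate_budget(1_000_000, 64_000, 10_000, {"context_budget_cap": 200_000})
uncapped = calculate_budget(400_000, 32_000, 8_000, {})
below_cap = calculate_budget(200_000, 64_000, 10_000, {"context_budget_cap": 200_000})
```

With these made-up numbers, the 1M-window provider is held to its 200,000-token cap, the provider without the key keeps its full raw budget, and a model whose raw budget is already below the cap is left alone.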
This is the consumer side of the provider-driven budget cap. See amplifier-support#57 analysis for the full rationale.
`defaults` dict, no new protocols or kernel fields

Related PRs
- `context_budget_cap: 200_000` when 1M context is active
- `context_budget_cap: 200_000` (all models have the pricing cliff)

Supersedes
Related issues
🤖 Generated with Amplifier