fix(ai): fetch GitHub Copilot context window limits at runtime #2527
Closed
dpearson2699 wants to merge 1 commit into
The models.dev API reports incorrect context window values for several GitHub Copilot models (e.g., claude-opus-4.6 reported as 144K instead of the actual 200K enforced by the Copilot API). The previous 1M override for Claude 4.6 models was also incorrect for the Copilot provider.

Changes:
- Remove github-copilot from the 1M context window override in generate-models.ts (only anthropic/opencode providers get 1M)
- Add fetchCopilotModelLimits() that queries the Copilot /models API for real max_context_window_tokens and max_output_tokens per model
- Call fetchCopilotModelLimits() during login and token refresh, storing limits in CopilotCredentials.modelLimits
- Apply fetched limits in modifyModels() so contextWindow/maxTokens reflect the actual Copilot-enforced values after authentication
- Regenerate models.generated.ts with models.dev defaults (no Copilot overrides); runtime fetch corrects values after OAuth login
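For illustration, the parsing step described in the commit message might look roughly like the sketch below. This is not the PR's actual code. Only the field names max_context_window_tokens and max_output_tokens (under capabilities.limits) come from the PR text. The `data` envelope, the `ModelLimits` shape, and the helper name `parseModelLimits` are assumptions made for this example.

```typescript
// Hypothetical per-model limits, as they might be stored in CopilotCredentials.modelLimits.
interface ModelLimits {
  contextWindow?: number;
  maxTokens?: number;
}

// Assumed response envelope for GET /models: { data: [{ id, capabilities: { limits } }] }.
// Only the two limit field names are taken from the PR description.
interface CopilotModelsResponse {
  data?: Array<{
    id?: string;
    capabilities?: {
      limits?: {
        max_context_window_tokens?: number;
        max_output_tokens?: number;
      };
    };
  }>;
}

// Hypothetical helper: turn the response body into a map keyed by model id.
// Models without an id or without any numeric limit are skipped.
function parseModelLimits(body: CopilotModelsResponse): Record<string, ModelLimits> {
  const result: Record<string, ModelLimits> = {};
  for (const model of body.data ?? []) {
    const limits = model.capabilities?.limits;
    if (!model.id || !limits) continue;

    const entry: ModelLimits = {};
    if (typeof limits.max_context_window_tokens === "number") {
      entry.contextWindow = limits.max_context_window_tokens;
    }
    if (typeof limits.max_output_tokens === "number") {
      entry.maxTokens = limits.max_output_tokens;
    }
    if (entry.contextWindow !== undefined || entry.maxTokens !== undefined) {
      result[model.id] = entry;
    }
  }
  return result;
}
```

Keeping the parser separate from the network call means it can be exercised against a captured response without an OAuth token.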
Contributor
Hi @dpearson2699, thanks for your interest in contributing! We ask new contributors to open an issue first before submitting a PR. This helps us discuss the approach and avoid wasted effort. Next steps:
This PR will be closed automatically. See https://github.com/badlogic/pi-mono/blob/main/CONTRIBUTING.md for more details.
FWIW: I asked pi to build an extension for this: https://github.com/hoesler/agent-stuff/blob/main/pi/extensions/copilot-model-limits/index.ts
Collaborator
Closing because the PR author is not listed in .github/APPROVED_CONTRIBUTORS.
Fixes #2526
Problem
GitHub Copilot models have incorrect contextWindow values. Two issues:
- Wrong 1M override: generate-models.ts applied a 1M context window to Claude 4.6 models for the github-copilot provider. The Copilot API enforces 200K, not 1M. Result: the coding agent never compacts, eventually hitting prompt-too-long errors.
- Stale models.dev data: The github-copilot section of models.dev/api.json reports incorrect context windows for several Claude models (e.g., claude-opus-4.6 listed as 144K, actual Copilot limit is 200K). Verified by querying GET /models on the Copilot API.

Changes
packages/ai/scripts/generate-models.ts
- Remove github-copilot from the 1M context window override (only anthropic, opencode, opencode-go get 1M for Claude 4.6 models)

packages/ai/src/utils/oauth/github-copilot.ts
- Add fetchCopilotModelLimits(token, enterpriseDomain?): queries GET /models on the Copilot API, parses capabilities.limits.max_context_window_tokens and max_output_tokens per model
- Extend the CopilotCredentials type with a modelLimits field
- Call fetchCopilotModelLimits() during loginGitHubCopilot() and refreshToken(), storing results in credentials
- Update modifyModels() to apply fetched contextWindow/maxTokens from stored limits

packages/ai/src/models.generated.ts
- Regenerate with models.dev defaults (no Copilot overrides)

Design rationale
Static hardcoded limits would need manual updates whenever GitHub adds or changes models. Fetching at runtime after OAuth login keeps the limits accurate without that maintenance burden. The fetch runs during login and token refresh (tokens expire every ~30 min), so new models or changed limits are picked up automatically. Failure to fetch is non-fatal: the models.dev defaults are conservative (lower than the real limits), so compaction triggers early but the prompt never overflows.
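The non-fatal fallback can be sketched as follows. This is a minimal illustration, not the PR's modifyModels() implementation. The `Model` and `ModelLimits` shapes and the helper name `applyModelLimits` are assumptions. The idea is that a model keeps its static default whenever no fetched limit is available for it.

```typescript
// Hypothetical shapes for this example; the real types in packages/ai may differ.
interface ModelLimits {
  contextWindow?: number;
  maxTokens?: number;
}

interface Model {
  id: string;
  contextWindow: number;
  maxTokens: number;
}

// Overlay fetched limits on the static defaults.
// If `limits` is undefined (e.g. the fetch failed) every model is returned unchanged,
// so the conservative models.dev defaults stay in effect.
function applyModelLimits(models: Model[], limits?: Record<string, ModelLimits>): Model[] {
  return models.map((model) => {
    const fetched = limits?.[model.id];
    if (!fetched) return model;
    return {
      ...model,
      contextWindow: fetched.contextWindow ?? model.contextWindow,
      maxTokens: fetched.maxTokens ?? model.maxTokens,
    };
  });
}
```

Because each field falls back independently, a response that carries only one of the two limits still improves on the static value for that field without disturbing the other.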