fix(session): filter Anthropic-only params from 3P provider requests (#248)#1533
Conversation
📝 WalkthroughWalkthroughGates Anthropic-specific betas and compaction cache-sharing to Anthropic-capable providers and strips ChangesProvider-Aware Parameter Gating
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 6 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (6 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
120b188 to
c45e415
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/services/compact/compact.ts`:
- Around line 1158-1167: compactConversation() is still reading
promptCacheSharingEnabled directly from the raw tengu_compact_cache_prefix flag;
update it to use the same provider gate as streamCompactSummary() by computing
cacheSharingAvailable = isAnthropicProvider() and then setting
promptCacheSharingEnabled = cacheSharingAvailable &&
getFeatureValue_CACHED_MAY_BE_STALE('tengu_compact_cache_prefix', true) so
telemetry reflects the new provider gate (refer to isAnthropicProvider(),
promptCacheSharingEnabled, and getFeatureValue_CACHED_MAY_BE_STALE in
compactConversation()).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 35d34dc1-13ee-477d-886c-ca332155e8aa
📒 Files selected for processing (3)
src/services/api/openaiShim.tssrc/services/compact/compact.tssrc/utils/betas.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: smoke-and-tests
🧰 Additional context used
📓 Path-based instructions (2)
**/*
⚙️ CodeRabbit configuration file
**/*: Apply the OpenClaude maintainer review rubric from AGENTS.md. Review the current diff, not stale discussion context. Separate real blockers from suggestions. Do not request changes for vague style churn. Treat approval as merge-ready from CodeRabbit's side, pending required human review and GitHub Checks. If checks are failing or unavailable, say so clearly instead of implying the PR is fully ready.
Files:
src/services/api/openaiShim.tssrc/utils/betas.tssrc/services/compact/compact.ts
{src/services/api/**,src/integrations/**,src/utils/model/**,src/utils/provider*.ts,src/commands/provider/**}
⚙️ CodeRabbit configuration file
{src/services/api/**,src/integrations/**,src/utils/model/**,src/utils/provider*.ts,src/commands/provider/**}: Review provider routing, model selection, env precedence, auth/token handling, OpenAI-compatible shims, retries, proxy behavior, and outbound HTTP behavior with high scrutiny. Block on silent default changes, hidden fallback expansion, credential reuse mistakes, hardcoded provider assumptions, or new network reach that is not intentional and documented.
Files:
src/services/api/openaiShim.ts
jatmn
left a comment
There was a problem hiding this comment.
I found one issue that needs to be addressed before this is ready.
Findings
- [P3] Complete CodeRabbit's request to gate compaction telemetry
src/services/compact/compact.ts:438
CodeRabbit's review thread is still unresolved, and the current patch still computescompactConversation()'spromptCacheSharingEnabledfrom the rawtengu_compact_cache_prefixflag. For non-Anthropic providers,streamCompactSummary()now correctly disables the cache-sharing path withisAnthropicProvider(), but the latertengu_compact*events still report cache sharing as enabled whenever the raw flag is true. That makes the rollout/fallback/failure telemetry disagree with the behavior this PR is introducing. Please complete CodeRabbit's request by applying the same provider gate to the value recorded bycompactConversation().
jatmn
left a comment
There was a problem hiding this comment.
please roll back your prior changes and rebase on main.
please do not submit the entire codebase as new.
|
@jatmn my bad for the inconvenience. The diff noise (+660k/-660k across all files) is from Windows git's core.autocrlf=true converting LF↔CRLF on checkout — every line registers as changed. Working on a clean rebase now. |
49f20ba to
83561c7
Compare
jatmn
left a comment
There was a problem hiding this comment.
I found an issue that needs to be addressed before this is ready.
Findings
- [P2] Add regression coverage for the new provider gates
src/utils/betas.ts:409
This PR fixes provider-specific request shaping in three sensitive paths, but the patch does not add tests that would fail if any of the new gates regressed. The existing suite covers generic provider detection and somethinkingstripping, but I do not see coverage forgetMergedBetas()returning no Anthropic beta headers on non-Anthropic providers,compactConversation()/streamCompactSummary()skipping the cache-sharing fork for non-Anthropic providers, or assistantredacted_thinkingblocks being removed before OpenAI-compatible replay. Since these are the exact issue #248 failure modes that can reintroduce 400s or corrupted resumed context, please add focused regression tests for the new provider-gated behavior before merging.
- betas.test.ts: add isAnthropicProvider tests for mistral, xai, minimax - compact.test.ts: verify compactConversation skips forked-agent cache-sharing for non-Anthropic providers and uses it for Anthropic These tests pin the provider gates added in PR Gitlawb#1533: - getMergedBetas() returns [] for non-Anthropic (betas.test.ts) - compactConversation()/streamCompactSummary() skip cache-sharing when isAnthropicProvider() returns false (compact.test.ts)
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/services/compact/compact.test.ts`:
- Around line 132-460: The importCompact test helper is too large and brittle
(it mocks 40+ modules), so refactor by extracting the bulk of the mock setup
into a shared fixture/helper and annotating which mocks are essential vs
defensive; create a new helper (used by importCompact) that registers the common
stubs and returns overridable hooks (e.g., runForkedAgent) so other compact
tests can reuse the same baseline, and add a short top-of-file comment in
importCompact listing which specific mocks are required for the provider gate
test versus defensive fall-through stubs to reduce coupling to compact.ts's
exact imports.
- Around line 481-501: The tests currently wrap compactConversation in try/catch
which masks post-compaction failures and only assert whether runForkedAgent was
called; update the tests for both the Anthropic and non-Anthropic cases so they
stub/mock any downstream helpers (so compactConversation can complete without
throwing) and then assert call behavior and arguments: for the gate-open test
verify runForkedAgent was called with the expected cacheSafeParams and
querySource: 'compact' (and other expected args), and for the gate-closed test
verify runForkedAgent was not called and that the streaming fallback
(queryModelWithStreaming) was invoked; reference compactConversation,
runForkedAgent, cacheSafeParams and queryModelWithStreaming when locating the
code to change.
- Line 62: Tests use blanket "as never" assertions which disable TypeScript
checking; replace them by declaring minimal fixture types that match the actual
function signatures (e.g., create a MinimalCompactOptions or
MinimalRequest/Response interface matching the parameters expected by the
compact function or CompactService methods used in these tests) and type the
test fixtures with those interfaces (or Partial<ActualType> & Pick<ActualType,
'propNeeded'>) instead of "as never"; update the fixtures referenced in the test
(the objects currently cast with "as never") to include only the required
properties and use explicit typing so the compiler can catch mismatches.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 137d19fa-30db-48ff-b546-1b63e57a4f13
📒 Files selected for processing (2)
src/services/compact/compact.test.tssrc/utils/betas.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: smoke-and-tests
🧰 Additional context used
📓 Path-based instructions (6)
**/*.test.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Add or update tests when the change affects behavior
Files:
src/utils/betas.test.tssrc/services/compact/compact.test.ts
**/*.{ts,tsx,js,jsx,py}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
**/*.{ts,tsx,js,jsx,py}: Follow the existing code style in the touched files
Keep comments useful and concise
Files:
src/utils/betas.test.tssrc/services/compact/compact.test.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Typecheck TypeScript code before submitting
Files:
src/utils/betas.test.tssrc/services/compact/compact.test.ts
**/*
⚙️ CodeRabbit configuration file
**/*: Apply the OpenClaude maintainer review rubric from AGENTS.md. Review the current diff, not stale discussion context. Separate real blockers from suggestions. Do not request changes for vague style churn. Treat approval as merge-ready from CodeRabbit's side, pending required human review and GitHub Checks. If checks are failing or unavailable, say so clearly instead of implying the PR is fully ready.
Files:
src/utils/betas.test.tssrc/services/compact/compact.test.ts
{src/**/*.test.ts,src/**/*.test.tsx,tests/**,scripts/**/*.test.ts,vscode-extension/**/*.test.js}
⚙️ CodeRabbit configuration file
{src/**/*.test.ts,src/**/*.test.tsx,tests/**,scripts/**/*.test.ts,vscode-extension/**/*.test.js}: Review tests for meaningful coverage of the changed behavior, isolation of global/env/config state, async cleanup, fake timers, provider profile leaks, and Windows-compatible assumptions. Block when risky runtime changes lack focused regression coverage or tests assert implementation details while missing the user-visible behavior.
Files:
src/utils/betas.test.tssrc/services/compact/compact.test.ts
**
⚙️ CodeRabbit configuration file
**: # Contributing to OpenClaudeThanks for contributing.
OpenClaude is a fast-moving open-source coding-agent CLI with support for multiple providers, local backends, MCP, and a terminal-first workflow. The best contributions here are focused, well-tested, and easy to review.
Before You Start
- Search existing issues and discussions before opening a new thread.
- Check open pull requests for work that overlaps with your contribution. If a PR already exists that addresses the same change, open an issue or discussion first to align on direction — duplicate PRs may be closed without review.
- Use issues for confirmed bugs and actionable feature work.
- Use discussions for setup help, ideas, and general community conversation.
- For larger changes, open an issue first so the scope is clear before implementation.
- For security reports, follow SECURITY.md.
Pull Requests
Every PR needs a reason. Your PR description must include:
- what changed and why
- the user or developer impact
- the exact checks you ran
- a linked issue when one exists, using
Fixes#123, `Closes `#123, or another clear link- screenshots when the PR touches UI, terminal presentation, or the VS Code extension
- which provider path was tested when the PR changes provider behavior
The PR author is responsible for ensuring their PR is merge-ready. PRs with merge conflicts will not be reviewed or approved until the conflicts are resolved.
Issues are the recommended starting point for anything non-trivial — opening one first helps avoid wasted effort if the change is out of scope or already being worked on. Small fixes, doc corrections, and obvious improvements can stand on their own without a linked issue, as long as the PR description explains the intent.
What Gets Closed Without Review
PRs may be closed without review...
Files:
src/utils/betas.test.tssrc/services/compact/compact.test.ts
🔇 Additional comments (1)
src/utils/betas.test.ts (1)
1-182: LGTM!
jatmn
left a comment
There was a problem hiding this comment.
I found two issues that need to be addressed before this is ready.
Findings
-
[P1] Complete CodeRabbit's request to isolate the compact test mocks
src/services/compact/compact.test.ts:132
CodeRabbit's review item about the oversizedimportCompact()mock harness is still valid, and the current patch now proves the risk:bun run checkreachestest:fulland then fails dozens of unrelated tests after this file runs. The helper registers broadmock.module()replacements for shared modules like../../utils/betas.js,../../utils/model/providers.js, and../../utils/envUtils.js; later tests then observe stale mocked provider/beta behavior even though their own env setup is correct. Please complete that review request by replacing this broad global mock harness with isolated, minimal test seams or otherwise ensuring these module mocks cannot leak outside this file. -
[P2] Complete the requested redacted-thinking regression coverage
src/services/api/openaiShim.ts:588
The prior maintainer review asked for focused coverage of all three new provider-gated behaviors, including assistantredacted_thinkingblocks being stripped before OpenAI-compatible replay. The follow-up commit adds tests for beta headers and compaction cache sharing, but it still does not add a test that would fail if thisopenaiShimfilter regressed and started serializingredacted_thinkinginto outgoing assistant content again. Please complete that review request by adding a focused OpenAI-compatible shim regression for theredacted_thinkingpath.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/services/compact/compact.test.ts (1)
145-510:⚠️ Potential issue | 🟡 MinorApprove:
registerCommonCompactStubsrefactor + mock setup
betas.tsmock covers all exported members needed at runtime (no missing exports; extra keys are harmless).envUtils.tsmock is a full match for exported members.providers.ts“complete” is slightly overstated: the only “missing” items are type-only exports (APIProvider,LegacyAPIProvider), while the runtime functions (getAPIProvider,getAPIProviderForStatsig,usesAnthropicAccountFlow, etc.) are present.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/compact/compact.test.ts` around lines 145 - 510, The registerCommonCompactStubs refactor and its mock coverage are fine; no code changes required — keep the function as-is (registerCommonCompactStubs) with the comprehensive betas.js mock (isAnthropicProvider, getMergedBetas, etc.), full envUtils.js exports, and runtime provider mocks in providers.js (getAPIProvider, getAPIProviderForStatsig, usesAnthropicAccountFlow, isFirstPartyAnthropicBaseUrl) — nothing to modify.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/services/compact/compact.test.ts`:
- Around line 145-510: The registerCommonCompactStubs refactor and its mock
coverage are fine; no code changes required — keep the function as-is
(registerCommonCompactStubs) with the comprehensive betas.js mock
(isAnthropicProvider, getMergedBetas, etc.), full envUtils.js exports, and
runtime provider mocks in providers.js (getAPIProvider,
getAPIProviderForStatsig, usesAnthropicAccountFlow,
isFirstPartyAnthropicBaseUrl) — nothing to modify.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 85f76424-c881-4449-8045-97efa907e561
📒 Files selected for processing (2)
src/services/api/openaiShim.test.tssrc/services/compact/compact.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: smoke-and-tests
🧰 Additional context used
📓 Path-based instructions (4)
**/*
⚙️ CodeRabbit configuration file
**/*: Apply the OpenClaude maintainer review rubric from AGENTS.md. Review the current diff, not stale discussion context. Separate real blockers from suggestions. Do not request changes for vague style churn. Treat approval as merge-ready from CodeRabbit's side, pending required human review and GitHub Checks. If checks are failing or unavailable, say so clearly instead of implying the PR is fully ready.
Files:
src/services/api/openaiShim.test.tssrc/services/compact/compact.test.ts
{src/services/api/**,src/integrations/**,src/utils/model/**,src/utils/provider*.ts,src/commands/provider/**}
⚙️ CodeRabbit configuration file
{src/services/api/**,src/integrations/**,src/utils/model/**,src/utils/provider*.ts,src/commands/provider/**}: Review provider routing, model selection, env precedence, auth/token handling, OpenAI-compatible shims, retries, proxy behavior, and outbound HTTP behavior with high scrutiny. Block on silent default changes, hidden fallback expansion, credential reuse mistakes, hardcoded provider assumptions, or new network reach that is not intentional and documented.
Files:
src/services/api/openaiShim.test.ts
{src/**/*.test.ts,src/**/*.test.tsx,tests/**,scripts/**/*.test.ts,vscode-extension/**/*.test.js}
⚙️ CodeRabbit configuration file
{src/**/*.test.ts,src/**/*.test.tsx,tests/**,scripts/**/*.test.ts,vscode-extension/**/*.test.js}: Review tests for meaningful coverage of the changed behavior, isolation of global/env/config state, async cleanup, fake timers, provider profile leaks, and Windows-compatible assumptions. Block when risky runtime changes lack focused regression coverage or tests assert implementation details while missing the user-visible behavior.
Files:
src/services/api/openaiShim.test.tssrc/services/compact/compact.test.ts
**
⚙️ CodeRabbit configuration file
**: # Contributing to OpenClaudeThanks for contributing.
OpenClaude is a fast-moving open-source coding-agent CLI with support for multiple providers, local backends, MCP, and a terminal-first workflow. The best contributions here are focused, well-tested, and easy to review.
Before You Start
- Search existing issues and discussions before opening a new thread.
- Check open pull requests for work that overlaps with your contribution. If a PR already exists that addresses the same change, open an issue or discussion first to align on direction — duplicate PRs may be closed without review.
- Use issues for confirmed bugs and actionable feature work.
- Use discussions for setup help, ideas, and general community conversation.
- For larger changes, open an issue first so the scope is clear before implementation.
- For security reports, follow SECURITY.md.
Pull Requests
Every PR needs a reason. Your PR description must include:
- what changed and why
- the user or developer impact
- the exact checks you ran
- a linked issue when one exists, using
Fixes#123, `Closes `#123, or another clear link- screenshots when the PR touches UI, terminal presentation, or the VS Code extension
- which provider path was tested when the PR changes provider behavior
The PR author is responsible for ensuring their PR is merge-ready. PRs with merge conflicts will not be reviewed or approved until the conflicts are resolved.
Issues are the recommended starting point for anything non-trivial — opening one first helps avoid wasted effort if the change is out of scope or already being worked on. Small fixes, doc corrections, and obvious improvements can stand on their own without a linked issue, as long as the PR description explains the intent.
What Gets Closed Without Review
PRs may be closed without review...
Files:
src/services/api/openaiShim.test.tssrc/services/compact/compact.test.ts
🧠 Learnings (1)
📚 Learning: 2026-06-04T22:10:40.834Z
Learnt from: CR
Repo: Gitlawb/openclaude PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2026-06-04T22:10:40.834Z
Learning: Verify that product, trust-model, routing-default, telemetry/network, and permission-policy changes are not hidden inside unrelated cleanup. Flag the PR if the policy decision needs explicit maintainer alignment.
Applied to files:
src/services/compact/compact.test.ts
🔇 Additional comments (3)
src/services/api/openaiShim.test.ts (1)
5099-5134: LGTM!src/services/compact/compact.test.ts (2)
555-560: ⚡ Quick winafterAll() prevents mock leakage across test files.
Excellent addition. This safety net ensures that mock.module() registrations from this file don't affect subsequent test files, which is critical for test isolation in Bun's module caching.
563-594: ⚡ Quick winSimplified tests with proper failure propagation.
The removal of try/catch wrappers (lines 574, 590) is a significant improvement over the previous version:
- Tests now properly fail if
compactConversationthrows, rather than swallowing exceptions- The comprehensive defensive mocks ensure
compactConversationcan complete without real I/O- Assertions are clearer and directly verify the gate behavior
The focused scope (call count only, not arguments or fallback path) is appropriate for this PR's regression coverage: verifying that the forked-agent cache-sharing path (which would send Anthropic-only params) is not taken for non-Anthropic providers.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/utils/betas.test.ts (1)
1-184: 🧹 Nitpick | 🔵 TrivialAdd beta coverage for NVIDIA_NIM and GitHub (non-Claude)
getMergedBetas()only allows Anthropic betas whenisAnthropicProvider()isfirstParty/bedrock/vertex/foundry; withNVIDIA_NIM=1it should return[]unlessisGithubNativeAnthropicMode(model)is true.- For GitHub, betas depend on
isGithubNativeAnthropicMode(model)(requiresCLAUDE_CODE_USE_GITHUB=1and a model string containingclaude-); add a test thatCLAUDE_CODE_USE_GITHUB=1+ a non-Claude model returns[]fromgetMergedBetas().🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/utils/betas.test.ts` around lines 1 - 184, Add tests covering NVIDIA_NIM and GitHub non-Claude behavior: update the betas.test suite to assert that getMergedBetas(MODEL) returns [] when NVIDIA_NIM= '1' (because NVIDIA_NIM should not count as an Anthropic provider) unless isGithubNativeAnthropicMode(MODEL) is true, and add a GitHub non-Claude case where process.env.CLAUDE_CODE_USE_GITHUB = '1' with a model string that does NOT contain 'claude-' to verify getMergedBetas(...) returns []; reference the existing test helpers importFreshBetas(), the getMergedBetas function, isAnthropicProvider and isGithubNativeAnthropicMode semantics, and set/clear process.env variables (NVIDIA_NIM, CLAUDE_CODE_USE_GITHUB, OPENAI_MODEL) similarly to other tests so provider detection is re-evaluated.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/services/compact/compact.test.ts`:
- Around line 160-167: The tests rely on real provider-resolution but
clearProviderEnv() only clears keys in SAVED_ENV and therefore does not remove
CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED or
CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID, allowing an ambient provider
profile to leak and make assertions nondeterministic; update clearProviderEnv()
usage in this test (and the other spots noted) to also unset
CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED and
CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID before each test (or add them to
SAVED_ENV), ensuring provider-profile env vars are cleared so the provider gate
assertions (isAnthropicProvider() path) run deterministically.
---
Outside diff comments:
In `@src/utils/betas.test.ts`:
- Around line 1-184: Add tests covering NVIDIA_NIM and GitHub non-Claude
behavior: update the betas.test suite to assert that getMergedBetas(MODEL)
returns [] when NVIDIA_NIM= '1' (because NVIDIA_NIM should not count as an
Anthropic provider) unless isGithubNativeAnthropicMode(MODEL) is true, and add a
GitHub non-Claude case where process.env.CLAUDE_CODE_USE_GITHUB = '1' with a
model string that does NOT contain 'claude-' to verify getMergedBetas(...)
returns []; reference the existing test helpers importFreshBetas(), the
getMergedBetas function, isAnthropicProvider and isGithubNativeAnthropicMode
semantics, and set/clear process.env variables (NVIDIA_NIM,
CLAUDE_CODE_USE_GITHUB, OPENAI_MODEL) similarly to other tests so provider
detection is re-evaluated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 1c330e1d-facc-4534-8e0e-bd61d540310d
📒 Files selected for processing (3)
src/services/compact/autoCompact.test.tssrc/services/compact/compact.test.tssrc/utils/betas.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: smoke-and-tests
🧰 Additional context used
📓 Path-based instructions (3)
**/*
⚙️ CodeRabbit configuration file
**/*: Apply the OpenClaude maintainer review rubric from AGENTS.md. Review the current diff, not stale discussion context. Separate real blockers from suggestions. Do not request changes for vague style churn. Treat approval as merge-ready from CodeRabbit's side, pending required human review and GitHub Checks. If checks are failing or unavailable, say so clearly instead of implying the PR is fully ready.
Files:
src/services/compact/autoCompact.test.tssrc/utils/betas.test.tssrc/services/compact/compact.test.ts
{src/**/*.test.ts,src/**/*.test.tsx,tests/**,scripts/**/*.test.ts,vscode-extension/**/*.test.js}
⚙️ CodeRabbit configuration file
{src/**/*.test.ts,src/**/*.test.tsx,tests/**,scripts/**/*.test.ts,vscode-extension/**/*.test.js}: Review tests for meaningful coverage of the changed behavior, isolation of global/env/config state, async cleanup, fake timers, provider profile leaks, and Windows-compatible assumptions. Block when risky runtime changes lack focused regression coverage or tests assert implementation details while missing the user-visible behavior.
Files:
src/services/compact/autoCompact.test.tssrc/utils/betas.test.tssrc/services/compact/compact.test.ts
**
⚙️ CodeRabbit configuration file
**: # Contributing to OpenClaudeThanks for contributing.
OpenClaude is a fast-moving open-source coding-agent CLI with support for multiple providers, local backends, MCP, and a terminal-first workflow. The best contributions here are focused, well-tested, and easy to review.
Before You Start
- Search existing issues and discussions before opening a new thread.
- Check open pull requests for work that overlaps with your contribution. If a PR already exists that addresses the same change, open an issue or discussion first to align on direction — duplicate PRs may be closed without review.
- Use issues for confirmed bugs and actionable feature work.
- Use discussions for setup help, ideas, and general community conversation.
- For larger changes, open an issue first so the scope is clear before implementation.
- For security reports, follow SECURITY.md.
Pull Requests
Every PR needs a reason. Your PR description must include:
- what changed and why
- the user or developer impact
- the exact checks you ran
- a linked issue when one exists, using
Fixes#123, `Closes `#123, or another clear link- screenshots when the PR touches UI, terminal presentation, or the VS Code extension
- which provider path was tested when the PR changes provider behavior
The PR author is responsible for ensuring their PR is merge-ready. PRs with merge conflicts will not be reviewed or approved until the conflicts are resolved.
Issues are the recommended starting point for anything non-trivial — opening one first helps avoid wasted effort if the change is out of scope or already being worked on. Small fixes, doc corrections, and obvious improvements can stand on their own without a linked issue, as long as the PR description explains the intent.
What Gets Closed Without Review
PRs may be closed without review...
Files:
src/services/compact/autoCompact.test.tssrc/utils/betas.test.tssrc/services/compact/compact.test.ts
🧠 Learnings (1)
📚 Learning: 2026-06-04T22:10:40.834Z
Learnt from: CR
Repo: Gitlawb/openclaude PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2026-06-04T22:10:40.834Z
Learning: Verify that product, trust-model, routing-default, telemetry/network, and permission-policy changes are not hidden inside unrelated cleanup. Flag the PR if the policy decision needs explicit maintainer alignment.
Applied to files:
src/services/compact/compact.test.ts
🔇 Additional comments (6)
src/utils/betas.test.ts (6)
56-68: LGTM!
70-76: LGTM!
80-90: LGTM!
94-116: LGTM!
118-128: LGTM!
132-183: LGTM!
jatmn
left a comment
There was a problem hiding this comment.
I found an issue that needs to be addressed before this is ready.
Findings
- [P1] Restore provider env isolation in the new gate tests
src/utils/betas.test.ts:14
The new provider-gate tests do not clearCLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED/CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID, andsrc/services/compact/compact.test.tshas the same gap in itsSAVED_ENVcleanup at line 80. That makes the tests' "all provider env cleared means firstParty" assumption depend on whatever provider-profile state earlier tests left inprocess.env;resolveActiveRouteIdFromEnv()explicitly consults that profile state. This is already breaking the PR smoke path: the currentsmoke-and-testsjob failstest:fullwith provider-gate assertions inbetas.test.tsand the compact Anthropic gate, and I reproduced the full-suite failure locally. Please complete CodeRabbit's still-valid request forcompact.test.tsand apply the same cleanup tobetas.test.tsso these tests snapshot, clear, and restore the provider-profile env flags before asserting first-party/non-Anthropic behavior.
- betas.test.ts: add isAnthropicProvider tests for mistral, xai, minimax - compact.test.ts: verify compactConversation skips forked-agent cache-sharing for non-Anthropic providers and uses it for Anthropic These tests pin the provider gates added in PR Gitlawb#1533: - getMergedBetas() returns [] for non-Anthropic (betas.test.ts) - compactConversation()/streamCompactSummary() skip cache-sharing when isAnthropicProvider() returns false (compact.test.ts)
d35c13c to
9e46b19
Compare
resolveEnvOnlyProviderRouteId (src/integrations/routeMetadata.ts:428) returns
'venice' or 'xiaomi-mimo' when those env vars are set, even when the test
sets CLAUDE_CODE_USE_OPENAI=1 — its 'env-only' check runs BEFORE the
USE_OPENAI branch (L658). Without clearing these, leaked values from
earlier tests in the same process cause isAnthropicProvider() to return
the wrong value, so the new provider-gate assertions fail:
getMergedBetas returns [] for the openai provider
expected: []
received: [claude-code-20250219, interleaved-thinking-2025-05-14,
context-management-2025-06-27, prompt-caching-scope-2026-01-05]
isAnthropicProvider is false for the openai/gemini/mistral/xai/minimax
expected: false received: true
compactConversation provider gate > uses forked-agent cache-sharing
for Anthropic providers
expected: runForkedAgent called received: 0 calls
The previous 'CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED[_ID]' fix was
insufficient because those vars don't gate the env-only routes directly;
VENICE_API_KEY / MIMO_API_KEY do. Add them to the per-test cleanup in
both betas.test.ts (PROVIDER_ENV_KEYS) and compact.test.ts (SAVED_ENV).
Skipped: pre-existing CI failures unrelated to this PR's provider gate
(autoCompactIfNeeded circuit breaker × 7, getEffectiveContextWindowSize
on MiniMax M2, microCompact MCP, /export direct filename, getProjectMemoryPathForSelector).
These fail on origin/main with the same pre-PR failures and are out of
scope for Gitlawb#1533.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/services/compact/compact.ts (1)
434-439: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winUpdate the stale “3P default: true” comment.
This branch is now gated by
isAnthropicProvider(), so the existing wording is no longer accurate and contradicts the new precondition immediately below. Tighten the comment here so it describes “enabled only for Anthropic-capable providers when the flag is on,” and the laterstreamCompactSummary()comment can keep pointing back to it without reintroducing the old assumption.As per coding guidelines, "Keep comments useful and concise."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/compact/compact.ts` around lines 434 - 439, Update the stale block comment to reflect that the 3P cache-sharing behavior is now enabled only for Anthropic-capable providers when the feature flag is on: remove the "3P default: true" assertion and instead state that cache-sharing is gated by isAnthropicProvider() and the GB flag (i.e., only Anthropic providers share prompt cache), keep the experiment summary and the kill-switch note, and mention that non-Anthropic providers remain incompatible; ensure streamCompactSummary() can still reference this updated comment.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/utils/betas.test.ts`:
- Around line 117-127: Add a sibling test that covers the GitHub non-Claude
regression: when process.env.CLAUDE_CODE_USE_GITHUB = '1' and ANTHROPIC_BASE_URL
/ ANTHROPIC_API_KEY are set but OPENAI_MODEL is a non-Claude model, assert
getMergedBetas(MODEL) returns an empty array; locate the existing test using
getMergedBetas, importFreshBetas, and MODEL and duplicate it with a non-Claude
model value to ensure isGithubNativeAnthropicMode() only enables betas for
Claude models.
---
Outside diff comments:
In `@src/services/compact/compact.ts`:
- Around line 434-439: Update the stale block comment to reflect that the 3P
cache-sharing behavior is now enabled only for Anthropic-capable providers when
the feature flag is on: remove the "3P default: true" assertion and instead
state that cache-sharing is gated by isAnthropicProvider() and the GB flag
(i.e., only Anthropic providers share prompt cache), keep the experiment summary
and the kill-switch note, and mention that non-Anthropic providers remain
incompatible; ensure streamCompactSummary() can still reference this updated
comment.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: b676720b-70f7-454c-a890-cd215a1ec935
📒 Files selected for processing (7)
src/services/api/openaiShim.test.tssrc/services/api/openaiShim.tssrc/services/compact/autoCompact.test.tssrc/services/compact/compact.test.tssrc/services/compact/compact.tssrc/utils/betas.test.tssrc/utils/betas.ts
💤 Files with no reviewable changes (1)
- src/services/compact/autoCompact.test.ts
📜 Review details
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx,js,jsx,py}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
**/*.{ts,tsx,js,jsx,py}: Follow the existing code style in the touched files
Keep comments useful and concise
Files:
src/services/compact/compact.tssrc/services/api/openaiShim.test.tssrc/services/api/openaiShim.tssrc/utils/betas.test.tssrc/utils/betas.tssrc/services/compact/compact.test.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Typecheck TypeScript code before submitting (use
bun run typecheck)
Files:
src/services/compact/compact.tssrc/services/api/openaiShim.test.tssrc/services/api/openaiShim.tssrc/utils/betas.test.tssrc/utils/betas.tssrc/services/compact/compact.test.ts
**/*
⚙️ CodeRabbit configuration file
**/*: Apply the OpenClaude maintainer review rubric from AGENTS.md. Review the current diff, not stale discussion context. Separate real blockers from suggestions. Do not request changes for vague style churn. Treat approval as merge-ready from CodeRabbit's side, pending required human review and GitHub Checks. If checks are failing or unavailable, say so clearly instead of implying the PR is fully ready.
Files:
src/services/compact/compact.tssrc/services/api/openaiShim.test.tssrc/services/api/openaiShim.tssrc/utils/betas.test.tssrc/utils/betas.tssrc/services/compact/compact.test.ts
**
⚙️ CodeRabbit configuration file
**: # Contributing to OpenClaudeThanks for contributing.
OpenClaude is a fast-moving open-source coding-agent CLI with support for multiple providers, local backends, MCP, and a terminal-first workflow. The best contributions here are focused, well-tested, and easy to review.
Before You Start
- Search existing issues and discussions before opening a new thread.
- Check open pull requests for work that overlaps with your contribution. If a PR already exists that addresses the same change, open an issue or discussion first to align on direction — duplicate PRs may be closed without review.
- Use issues for confirmed bugs and actionable feature work.
- Use discussions for setup help, ideas, and general community conversation.
- For larger changes, open an issue first so the scope is clear before implementation.
- For security reports, follow SECURITY.md.
Pull Requests
Every PR needs a reason. Your PR description must include:
- what changed and why
- the user or developer impact
- the exact checks you ran
- a linked issue when one exists, using
Fixes#123, `Closes `#123, or another clear link- screenshots when the PR touches UI, terminal presentation, or the VS Code extension
- which provider path was tested when the PR changes provider behavior
The PR author is responsible for ensuring their PR is merge-ready. PRs with merge conflicts will not be reviewed or approved until the conflicts are resolved.
Issues are the recommended starting point for anything non-trivial — opening one first helps avoid wasted effort if the change is out of scope or already being worked on. Small fixes, doc corrections, and obvious improvements can stand on their own without a linked issue, as long as the PR description explains the intent.
What Gets Closed Without Review
PRs may be closed without review...
Files:
src/services/compact/compact.tssrc/services/api/openaiShim.test.tssrc/services/api/openaiShim.tssrc/utils/betas.test.tssrc/utils/betas.tssrc/services/compact/compact.test.ts
{src/services/api/**,src/integrations/**,src/utils/model/**,src/utils/provider*.ts,src/commands/provider/**}
⚙️ CodeRabbit configuration file
{src/services/api/**,src/integrations/**,src/utils/model/**,src/utils/provider*.ts,src/commands/provider/**}: Review provider routing, model selection, env precedence, auth/token handling, OpenAI-compatible shims, retries, proxy behavior, and outbound HTTP behavior with high scrutiny. Block on silent default changes, hidden fallback expansion, credential reuse mistakes, hardcoded provider assumptions, or new network reach that is not intentional and documented.
Files:
src/services/api/openaiShim.test.tssrc/services/api/openaiShim.ts
{src/**/*.test.ts,src/**/*.test.tsx,tests/**,scripts/**/*.test.ts,vscode-extension/**/*.test.js}
⚙️ CodeRabbit configuration file
{src/**/*.test.ts,src/**/*.test.tsx,tests/**,scripts/**/*.test.ts,vscode-extension/**/*.test.js}: Review tests for meaningful coverage of the changed behavior, isolation of global/env/config state, async cleanup, fake timers, provider profile leaks, and Windows-compatible assumptions. Block when risky runtime changes lack focused regression coverage or tests assert implementation details while missing the user-visible behavior.
Files:
src/services/api/openaiShim.test.tssrc/utils/betas.test.tssrc/services/compact/compact.test.ts
🔇 Additional comments (3)
src/services/api/openaiShim.ts (1)
589-590: LGTM!src/services/api/openaiShim.test.ts (1)
5099-5134: LGTM!src/utils/betas.ts (1)
397-410: LGTM!
- betas.test.ts: add isAnthropicProvider tests for mistral, xai, minimax - compact.test.ts: verify compactConversation skips forked-agent cache-sharing for non-Anthropic providers and uses it for Anthropic These tests pin the provider gates added in PR Gitlawb#1533: - getMergedBetas() returns [] for non-Anthropic (betas.test.ts) - compactConversation()/streamCompactSummary() skip cache-sharing when isAnthropicProvider() returns false (compact.test.ts)
resolveEnvOnlyProviderRouteId (src/integrations/routeMetadata.ts:428) returns
'venice' or 'xiaomi-mimo' when those env vars are set, even when the test
sets CLAUDE_CODE_USE_OPENAI=1 — its 'env-only' check runs BEFORE the
USE_OPENAI branch (L658). Without clearing these, leaked values from
earlier tests in the same process cause isAnthropicProvider() to return
the wrong value, so the new provider-gate assertions fail:
getMergedBetas returns [] for the openai provider
expected: []
received: [claude-code-20250219, interleaved-thinking-2025-05-14,
context-management-2025-06-27, prompt-caching-scope-2026-01-05]
isAnthropicProvider is false for the openai/gemini/mistral/xai/minimax
expected: false received: true
compactConversation provider gate > uses forked-agent cache-sharing
for Anthropic providers
expected: runForkedAgent called received: 0 calls
The previous 'CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED[_ID]' fix was
insufficient because those vars don't gate the env-only routes directly;
VENICE_API_KEY / MIMO_API_KEY do. Add them to the per-test cleanup in
both betas.test.ts (PROVIDER_ENV_KEYS) and compact.test.ts (SAVED_ENV).
Skipped: pre-existing CI failures unrelated to this PR's provider gate
(autoCompactIfNeeded circuit breaker × 7, getEffectiveContextWindowSize
on MiniMax M2, microCompact MCP, /export direct filename, getProjectMemoryPathForSelector).
These fail on origin/main with the same pre-PR failures and are out of
scope for Gitlawb#1533.
6652f87 to
e67bba3
Compare
jatmn
left a comment
There was a problem hiding this comment.
I found a couple of issues that need to be addressed before this is ready.
Findings
-
[P1] Fix the new compact test UUIDs before typecheck can pass
src/services/compact/compact.test.ts:26
The newuserMessage()andassistantMessage()helpers returnMessageobjects, but theiruuidfields are generated astest-${Math.random()}.Message.uuidis typed as a UUID-shaped template literal, so the current branch fails the project typecheck in this changed file (TS2322at lines 26 and 38). Please use UUID-shaped fixture values or the project helper used by nearby tests so the new provider-gate coverage does not break thetypecheckjob. -
[P2] Complete CodeRabbit's GitHub non-Claude beta regression request
src/utils/betas.test.ts:136
CodeRabbit's current review item is still valid: the tests cover the GitHub native Anthropic exception when the model is Claude, but there is no sibling assertion forCLAUDE_CODE_USE_GITHUB=1with a non-Claude model returning[]. That untested branch is the risky half of this provider gate because a future broadening ofisGithubNativeAnthropicMode()would reintroduce Anthropic beta headers for GitHub/OpenAI-style models. Please complete the CodeRabbit request by adding the non-Claude GitHub regression case.
CodeRabbit round 10 nitpick: the comment at compact.ts L434-439 still claims "3P default: true" and frames the GB flag as a "kill-switch", but the code is now gated by isAnthropicProvider() so non-Anthropic 3P providers are never on this path in the first place. Update both the compactConversation() block and the streamCompactSummary() block (now points at the first block instead of a separate stale sentence) to describe the actual current behavior: cache-sharing is enabled only for Anthropic-capable providers when tengu_compact_cache_prefix is on; non-Anthropic providers remain incompatible and would send Anthropic-only params they reject. Verified: 2/2 compact.test.ts still passes; no behavioral change, only the comment text. Skipped: rebase to current origin/main (Gitlawb#1533 was rebased earlier in 4c1b4c2; no new conflicts since).
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the updates. I rechecked the changed paths and found an issue that still needs to be addressed.
Findings
- [P1] Finish isolating the provider-gate tests from leaked provider env
src/utils/betas.test.ts:20
The latest smoke job is still failing in the new provider-gate coverage, and I can reproduce it on the GitHub merge ref withbun run check: the firstParty/Bedrock/Vertex/Foundry beta assertions and the compact Anthropic cache-sharing assertion fail after earlier tests have run. The cleanup lists here and insrc/services/compact/compact.test.tsare still incomplete; for example,FIREWORKS_API_KEYis now an env-only provider signal viaresolveEnvOnlyProviderRouteId(), but neither test clears/restores it, so earlier provider-profile/client tests can leave the process classified as a non-Anthropic provider even when these tests expect a clean firstParty default. Please complete the still-valid test-isolation fix by clearing all provider-selection inputs used by the shared provider resolver, or by avoiding liveprocess.envfor these assertions, so the requiredsmoke-and-tests/bun run checkpath is green instead of only the focused files.
- betas.test.ts: add isAnthropicProvider tests for mistral, xai, minimax - compact.test.ts: verify compactConversation skips forked-agent cache-sharing for non-Anthropic providers and uses it for Anthropic These tests pin the provider gates added in PR Gitlawb#1533: - getMergedBetas() returns [] for non-Anthropic (betas.test.ts) - compactConversation()/streamCompactSummary() skip cache-sharing when isAnthropicProvider() returns false (compact.test.ts)
65ff2f7 to
5095f76
Compare
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the updates. I rechecked the changed paths and found an issue that still needs to be addressed.
Findings
- [P1] Finish isolating the provider-gate tests before relying on CI
src/utils/betas.test.ts:101
The latestsmoke-and-testsjob is still red in the new provider-gate coverage: under CI'sbun run check, the OpenAI/Gemini/GitHub non-Claude beta tests receive the Anthropic beta list instead of[], and the compact non-Anthropic cache-sharing test also fails. The current file pre-warmsimportFreshBetas()before any test sets provider env, so on the CI Bun/runtime path the later cache-busted imports are not actually giving each test a clean provider resolver; they keep seeing the default firstParty state. Also, the cleanup lists still do not cover the full provider resolver surface fromresolveEnvOnlyProviderRouteId()(for exampleFIREWORKS_API_KEY/ATLAS_CLOUD_API_KEY), so the prior provider-env isolation request remains only partially completed. Please complete that isolation fix by avoiding the pre-warmed live module state or explicitly clearing the relevant betas/provider caches, and by snapshotting/clearing all provider-selection env inputs used by the shared resolver so the requiredsmoke-and-tests/bun run checkpath is green.
CI smoke run on Gitlawb#1533 still shows the openai/gemini/bedrock tests failing in `getAPIProvider` even after the pre-warm + clearProviderEnv fix from 5095f76. Tracing through the call chain reveals the missing env var: 1. `getAPIProvider()` first checks `CLAUDE_CODE_USE_FOUNDRY` (not set, so passes through). It then calls `resolveActiveRouteIdFromEnv(process.env)`. 2. `resolveActiveRouteIdFromEnv` checks the standard `CLAUDE_CODE_USE_*` env vars (all cleared by `clearProviderEnv()`). It then calls `resolveEnvOnlyProviderRouteId(process.env)`. 3. `resolveEnvOnlyProviderRouteId` calls `hasFireworksEnvOnlyProviderIntent`, which returns true when `FIREWORKS_API_KEY` is set AND none of the conflicting API keys are set. 4. When the intent is true, `resolveEnvOnlyProviderRouteId` returns the string `'fireworks'`. The caller (`getAPIProvider`) feeds this into its switch statement, which has no `'fireworks'` case, so it falls into the `default:` branch and returns `'firstParty'`. 5. `isAnthropicProvider()` sees `'firstParty'`, returns true. 6. `getMergedBetas()` takes the anthropic-list branch and returns the betas list instead of `[]`. So a leaked `FIREWORKS_API_KEY` from a prior test (PR Gitlawb#1590 added fireworks to upstream `routeMetadata.ts` as the 6th `*_API_KEY` env var) silently defeats every test in `betas.test.ts` that expects a non-anthropic provider. The previous fix (`65ff2f76`) added `NEARAI_API_KEY`; the Fireworks equivalent was missed because the FIREWORKS route is "env-only" (no `CLAUDE_CODE_USE_FIREWORKS` toggle) and was added more recently in PR Gitlawb#1590. This commit adds `FIREWORKS_API_KEY` to both `PROVIDER_ENV_KEYS` in `betas.test.ts` and `SAVED_ENV` in `compact.test.ts`. Verified: - `bun test --max-concurrency=1 src/utils/betas.test.ts src/services/compact/compact.test.ts` — 21/21 pass deterministically. - A synthetic leak test (sets `FIREWORKS_API_KEY='leaked'` then runs betas.test.ts) confirms the openai/gemini/bedrock/etc. tests now return `[]` and `isAnthropicProvider()` returns `false` even when the key is present in the parent process's env. Without the fix, the test would return the anthropic beta list. After this commit, the CI smoke-and-tests job on Gitlawb#1533 should turn green (the env-var-leak root cause is finally fully covered — the only remaining smoke failures should be the same pre-existing ones observed on origin/main: model cap tests for deepseek/gpt-4o/MiniMax-M2.7, the `showCacheStats` registration test, etc.).
The smoke suite runs many test files in one process. Snapshotting process.env at module load time captures values leaked by earlier test files, so restoring from that snapshot re-introduces the leaks between our tests. Delete every provider/profile env var before and after each test instead; each test now starts from a known-clean state and sets only the vars it needs.
842e42f to
5964489
Compare
5964489 to
4947145
Compare
08aa9cb to
d78f113
Compare
d78f113 to
c30e4c7
Compare
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the updates. I rechecked the changed paths and found an issue that still needs to be addressed.
Findings
- [P1] Finish cleaning up the compact test mocks before smoke can pass
src/services/compact/compact.test.ts:469
The new compact test still leaves process-global mocks installed after the file finishes. In particular, it registers../../utils/projectInstructions.jswithgetProjectInstructionFilePaths: () => [], but theafterAllcleanup does not restore that module. When the requiredtest:fullorder later reachessrc/tools/AgentTool/runAgent.routing.test.ts, CLAUDE.md discovery receives an undefined project instruction path and crashes inprocessMemoryFile()before the runAgent assertions execute. I reproduced the isolation leak withbun test src\services\compact\compact.test.ts src\tools\AgentTool\runAgent.routing.test.ts --max-concurrency=1; the runAgent file passes by itself, but fails after this compact test, and the currentsmoke-and-testsjob is red on those same two runAgent provider-routing tests. Please complete the prior test-isolation request by restoring every module this fixture mocks, or by avoiding process-global mocks that can outlive the compact tests.
The importCompact() harness replaces these modules wholesale via
mock.module(), but afterAll only restored a subset. The
projectInstructions stub exported only getProjectInstructionFilePaths,
so every other export was undefined process-wide after this file ran.
In the full test:full order, runAgent.routing.test.ts then crashed in
processMemoryFile() with "path must be of type string, got undefined",
turning smoke-and-tests red.
Restore the real path, config, and projectInstructions modules in
afterAll alongside the others (all three were already pre-imported for
this purpose). Verified with:
bun test src/services/compact/compact.test.ts \
src/tools/AgentTool/runAgent.routing.test.ts --max-concurrency=1
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Fixed the failing Root cause:
Re: the two open CodeRabbit threads — the GitHub non-Claude |
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the updates. I rechecked the changed paths and found one remaining non-blocking issue.
Findings
- [P3] Preserve cache sharing for GitHub Native Anthropic compaction
src/services/compact/compact.ts:447
The new compaction gate treats everygetAPIProvider() === 'github'session as non-Anthropic becauseisAnthropicProvider()only returns true for firstParty/Bedrock/Vertex/Foundry. But this PR's beta gate already preserves Anthropic betas forisGithubNativeAnthropicMode(model), andgetAnthropicClient()routes that mode through the native Anthropic client socache_control/prompt caching works. WithCLAUDE_CODE_USE_GITHUB=1and a Claude model, compaction now always bypasses forked-agent cache sharing at lines 447 and 1174 even though that provider path can use it. That reintroduces the cold-cache compaction cost the forked path was designed to avoid. Please include GitHub Native Anthropic in the compaction compatibility predicate and add a compact test for it, rather than usingisAnthropicProvider()alone.
The compaction cache-sharing gate used isAnthropicProvider() alone, which is false for getAPIProvider() === 'github'. But GitHub Native Anthropic mode (CLAUDE_CODE_USE_GITHUB=1 with a Claude model) routes through the native Anthropic client where cache_control / prompt caching works, and the beta-header gate already preserves Anthropic betas for it. As a result compaction always took the cold-cache path for those sessions, reintroducing the cost the forked-agent flow avoids. Add isCompactionCacheSharingCompatible(model) = isAnthropicProvider() || isGithubNativeAnthropicMode(model), mirroring the betas.ts gate, and use it at both compactConversation() and streamCompactSummary() sites. Adds a compact provider-gate test for GitHub Native Anthropic mode. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Addressed the P3 in Added a shared predicate Added a compact provider-gate test asserting |
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the updates. I rechecked the previously discussed paths and do not see any remaining actionable issues from my side.
@kevincodex1 LGTM
…itlawb#248) (Gitlawb#1533) * fix(session): filter Anthropic-only params from 3P provider requests (Gitlawb#248) * test: add regression tests for provider gates (PR Gitlawb#1533) - betas.test.ts: add isAnthropicProvider tests for mistral, xai, minimax - compact.test.ts: verify compactConversation skips forked-agent cache-sharing for non-Anthropic providers and uses it for Anthropic These tests pin the provider gates added in PR Gitlawb#1533: - getMergedBetas() returns [] for non-Anthropic (betas.test.ts) - compactConversation()/streamCompactSummary() skip cache-sharing when isAnthropicProvider() returns false (compact.test.ts) * test: fix leaky mock harness and add redacted_thinking coverage - Complete all missing module exports in compact.test.ts mock harness to prevent global mock leakage breaking other test files - Add afterAll cleanup hook for safety - Add regression test for redacted_thinking blocks stripped before OpenAI-compatible replay in openaiShim * test: eliminate mock.module leaks causing 10 CI failures Replace broad mock.module() stubs for betas/providers/envUtils with env-var-based provider control. Add mock.restore() safety nets to betas.test.ts and autoCompact.test.ts. Tests now use real modules without cross-test-file mock contamination. * test: clear provider profile env vars and remove aggressive mock.restore - Snapshot and clear CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED flags so isAnthropicProvider() returns correct values regardless of provider profile state from previous tests - Remove mock.restore() from betas.test.ts and autoCompact.test.ts that was breaking legitimate mocks in other test files * test(betas): ensure provider env vars are cleared in afterEach, not just beforeEach * test(3p): also clear VENICE_API_KEY and MIMO_API_KEY per test resolveEnvOnlyProviderRouteId (src/integrations/routeMetadata.ts:428) returns 'venice' or 'xiaomi-mimo' when those env vars are set, even when the test sets CLAUDE_CODE_USE_OPENAI=1 — its 'env-only' check runs BEFORE the USE_OPENAI branch (L658). Without clearing these, leaked values from earlier tests in the same process cause isAnthropicProvider() to return the wrong value, so the new provider-gate assertions fail: getMergedBetas returns [] for the openai provider expected: [] received: [claude-code-20250219, interleaved-thinking-2025-05-14, context-management-2025-06-27, prompt-caching-scope-2026-01-05] isAnthropicProvider is false for the openai/gemini/mistral/xai/minimax expected: false received: true compactConversation provider gate > uses forked-agent cache-sharing for Anthropic providers expected: runForkedAgent called received: 0 calls The previous 'CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED[_ID]' fix was insufficient because those vars don't gate the env-only routes directly; VENICE_API_KEY / MIMO_API_KEY do. Add them to the per-test cleanup in both betas.test.ts (PROVIDER_ENV_KEYS) and compact.test.ts (SAVED_ENV). Skipped: pre-existing CI failures unrelated to this PR's provider gate (autoCompactIfNeeded circuit breaker × 7, getEffectiveContextWindowSize on MiniMax M2, microCompact MCP, /export direct filename, getProjectMemoryPathForSelector). These fail on origin/main with the same pre-PR failures and are out of scope for Gitlawb#1533. * test(3p): use crypto.randomUUID() for message uuids The Message type's uuid field is now a branded UUID type from node:crypto (strings must match the 5-segment UUID format). After rebase onto the latest origin/main, the typecheck failed at L26 and L38 where \\ est-\\\ no longer satisfies the type. crypto.randomUUID() produces a valid UUID per call, so each message gets a unique, type-correct uuid. No behavioral change for the tests themselves — they only rely on the messages being distinct. * test(3p): cast assistantMessage.message through 'as never' The Message type's assistant message field is AssistantMessageContent <BetaContentBlock>, which requires id, model, usage at the type level. The compact code paths under test don't read these fields, so the helper focused on the text content needs to bypass the type check. Pre-rebase this likely passed via a wider union; after the rebase the branded type tightened. 'as never' keeps the helper minimal without having to fabricate realistic id/model/usage values. * test(betas): add non-Claude GitHub regression test for provider gate CodeRabbit P2 review on the most recent round: the provider gate tests cover the GitHub Native Anthropic exception (Claude model) but no sibling assertion for CLAUDE_CODE_USE_GITHUB=1 with a non-Claude model. That untested branch is the risky half of the gate — a future broadening of isGithubNativeAnthropicMode() (e.g. matching on the wrong substring, or matching on 'claude' too permissively) would silently re-introduce Anthropic-only beta headers for OpenAI-style models served via GitHub. Add the inverse case: GitHub provider with OPENAI_MODEL='gpt-4o-mini' must return [] (the gate strips the headers). Verified locally: betas.test.ts runs 19/19 (with the new test) and the betas+compact pair runs 21/21 cleanly, confirming no test-isolation regression between the two files. * docs(compact): update stale "3P default: true" comments CodeRabbit round 10 nitpick: the comment at compact.ts L434-439 still claims "3P default: true" and frames the GB flag as a "kill-switch", but the code is now gated by isAnthropicProvider() so non-Anthropic 3P providers are never on this path in the first place. Update both the compactConversation() block and the streamCompactSummary() block (now points at the first block instead of a separate stale sentence) to describe the actual current behavior: cache-sharing is enabled only for Anthropic-capable providers when tengu_compact_cache_prefix is on; non-Anthropic providers remain incompatible and would send Anthropic-only params they reject. Verified: 2/2 compact.test.ts still passes; no behavioral change, only the comment text. Skipped: rebase to current origin/main (Gitlawb#1533 was rebased earlier in 4c1b4c2; no new conflicts since). * test(3p): also clear NEARAI_API_KEY per test Reproduce locally with `bun test --max-concurrency=1` and find that `providerValidation.test.ts` (which runs alongside the other 3800+ test files in the same CI process) sets NEARAI_API_KEY for one of its neearai-detect tests, and the previous cleanup list missed that env var. So when betas.test.ts and compact.test.ts later run their `isAnthropicProvider is true for firstParty` (and the other `isAnthropicProvider is true for *`) tests, hasNearaiEnvOnlyProviderIntent() returns true on the leaked value, the function classifies the provider as 'nearai' (not Anthropic), and the gate strips all betas. The test that sets no env var then sees `getMergedBetas()` return `[]` instead of the Anthropic list, which is the failure pattern on CI. resolveEnvOnlyProviderRouteId (src/integrations/routeMetadata.ts:428) returns 'nearai' (or 'xai', 'minimax', 'venice', 'xiaomi-mimo') for those env var leak paths, all of which were missed by the prior profile-env-var fix. The previous CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED fix was insufficient because that var doesn't gate the env-only routes directly — only the API keys do. NEARAI_API_KEY is the latest in the family (xai/minimax/venice/mimo all already in the list). Add NEARAI_API_KEY to PROVIDER_ENV_KEYS in betas.test.ts and SAVED_ENV in compact.test.ts, matching the xai/minimax/venice/mimo additions from 768ab65. Verified: betas.test.ts (19/19) and compact.test.ts (2/2) still pass in isolation and as a pair. The full-suite failure should now resolve since the leak path is covered. Skipped: tracing the exact leak site. The 3800+ test files can't all be enumerated; the fix is comprehensive (all known env-only intent vars are now in the clear list). * test(3p): pre-warm importFreshBetas in beforeAll to avoid 5s timeout The first call to importFreshBetas() in this file triggered a 4.3s module load (growthbook feature-flag init), which exceeded Bun's default 5s test timeout. On the agent's local Windows machine, that race is close to deterministic (the openai test, being first in source order, paid the full cost and reported `(fail) this test timed out after 5000ms`). On CI (Ubuntu) the import was usually fast enough to pass — but flaky, and completely unrelated to actual test correctness. Move the slow import to beforeAll so it happens once before any test runs and is not charged to the first test's budget. After pre-warming, every per-test import is sub-second (the 579ms / 8ms timings in the last run confirm this). All 21 tests in the betas+compact pair now pass deterministically. This is the only remaining source-order flake in this file. The actual test logic (provider-gate assertions, env-var isolation, cache-busting imports) was already correct. * test(3p): add FIREWORKS_API_KEY to provider env cleanup CI smoke run on Gitlawb#1533 still shows the openai/gemini/bedrock tests failing in `getAPIProvider` even after the pre-warm + clearProviderEnv fix from 5095f76. Tracing through the call chain reveals the missing env var: 1. `getAPIProvider()` first checks `CLAUDE_CODE_USE_FOUNDRY` (not set, so passes through). It then calls `resolveActiveRouteIdFromEnv(process.env)`. 2. `resolveActiveRouteIdFromEnv` checks the standard `CLAUDE_CODE_USE_*` env vars (all cleared by `clearProviderEnv()`). It then calls `resolveEnvOnlyProviderRouteId(process.env)`. 3. `resolveEnvOnlyProviderRouteId` calls `hasFireworksEnvOnlyProviderIntent`, which returns true when `FIREWORKS_API_KEY` is set AND none of the conflicting API keys are set. 4. When the intent is true, `resolveEnvOnlyProviderRouteId` returns the string `'fireworks'`. The caller (`getAPIProvider`) feeds this into its switch statement, which has no `'fireworks'` case, so it falls into the `default:` branch and returns `'firstParty'`. 5. `isAnthropicProvider()` sees `'firstParty'`, returns true. 6. `getMergedBetas()` takes the anthropic-list branch and returns the betas list instead of `[]`. So a leaked `FIREWORKS_API_KEY` from a prior test (PR Gitlawb#1590 added fireworks to upstream `routeMetadata.ts` as the 6th `*_API_KEY` env var) silently defeats every test in `betas.test.ts` that expects a non-anthropic provider. The previous fix (`65ff2f76`) added `NEARAI_API_KEY`; the Fireworks equivalent was missed because the FIREWORKS route is "env-only" (no `CLAUDE_CODE_USE_FIREWORKS` toggle) and was added more recently in PR Gitlawb#1590. This commit adds `FIREWORKS_API_KEY` to both `PROVIDER_ENV_KEYS` in `betas.test.ts` and `SAVED_ENV` in `compact.test.ts`. Verified: - `bun test --max-concurrency=1 src/utils/betas.test.ts src/services/compact/compact.test.ts` — 21/21 pass deterministically. - A synthetic leak test (sets `FIREWORKS_API_KEY='leaked'` then runs betas.test.ts) confirms the openai/gemini/bedrock/etc. tests now return `[]` and `isAnthropicProvider()` returns `false` even when the key is present in the parent process's env. Without the fix, the test would return the anthropic beta list. After this commit, the CI smoke-and-tests job on Gitlawb#1533 should turn green (the env-var-leak root cause is finally fully covered — the only remaining smoke failures should be the same pre-existing ones observed on origin/main: model cap tests for deepseek/gpt-4o/MiniMax-M2.7, the `showCacheStats` registration test, etc.). * test(3p): scrub provider env vars instead of restoring leaked snapshot The smoke suite runs many test files in one process. Snapshotting process.env at module load time captures values leaked by earlier test files, so restoring from that snapshot re-introduces the leaks between our tests. Delete every provider/profile env var before and after each test instead; each test now starts from a known-clean state and sets only the vars it needs. * debug(3p): add env/provider diagnostics to openai beta test * test(3p): restore real providers.js in betas/compact tests * test(3p): fix providers, diskOutput, and messages mock leaks * test(compact): restore projectInstructions/path/config mocks in afterAll The importCompact() harness replaces these modules wholesale via mock.module(), but afterAll only restored a subset. The projectInstructions stub exported only getProjectInstructionFilePaths, so every other export was undefined process-wide after this file ran. In the full test:full order, runAgent.routing.test.ts then crashed in processMemoryFile() with "path must be of type string, got undefined", turning smoke-and-tests red. Restore the real path, config, and projectInstructions modules in afterAll alongside the others (all three were already pre-imported for this purpose). Verified with: bun test src/services/compact/compact.test.ts \ src/tools/AgentTool/runAgent.routing.test.ts --max-concurrency=1 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(compact): keep cache-sharing for GitHub Native Anthropic compaction The compaction cache-sharing gate used isAnthropicProvider() alone, which is false for getAPIProvider() === 'github'. But GitHub Native Anthropic mode (CLAUDE_CODE_USE_GITHUB=1 with a Claude model) routes through the native Anthropic client where cache_control / prompt caching works, and the beta-header gate already preserves Anthropic betas for it. As a result compaction always took the cold-cache path for those sessions, reintroducing the cost the forked-agent flow avoids. Add isCompactionCacheSharingCompatible(model) = isAnthropicProvider() || isGithubNativeAnthropicMode(model), mirroring the betas.ts gate, and use it at both compactConversation() and streamCompactSummary() sites. Adds a compact provider-gate test for GitHub Native Anthropic mode. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
…itlawb#248) (Gitlawb#1533) * fix(session): filter Anthropic-only params from 3P provider requests (Gitlawb#248) * test: add regression tests for provider gates (PR Gitlawb#1533) - betas.test.ts: add isAnthropicProvider tests for mistral, xai, minimax - compact.test.ts: verify compactConversation skips forked-agent cache-sharing for non-Anthropic providers and uses it for Anthropic These tests pin the provider gates added in PR Gitlawb#1533: - getMergedBetas() returns [] for non-Anthropic (betas.test.ts) - compactConversation()/streamCompactSummary() skip cache-sharing when isAnthropicProvider() returns false (compact.test.ts) * test: fix leaky mock harness and add redacted_thinking coverage - Complete all missing module exports in compact.test.ts mock harness to prevent global mock leakage breaking other test files - Add afterAll cleanup hook for safety - Add regression test for redacted_thinking blocks stripped before OpenAI-compatible replay in openaiShim * test: eliminate mock.module leaks causing 10 CI failures Replace broad mock.module() stubs for betas/providers/envUtils with env-var-based provider control. Add mock.restore() safety nets to betas.test.ts and autoCompact.test.ts. Tests now use real modules without cross-test-file mock contamination. * test: clear provider profile env vars and remove aggressive mock.restore - Snapshot and clear CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED flags so isAnthropicProvider() returns correct values regardless of provider profile state from previous tests - Remove mock.restore() from betas.test.ts and autoCompact.test.ts that was breaking legitimate mocks in other test files * test(betas): ensure provider env vars are cleared in afterEach, not just beforeEach * test(3p): also clear VENICE_API_KEY and MIMO_API_KEY per test resolveEnvOnlyProviderRouteId (src/integrations/routeMetadata.ts:428) returns 'venice' or 'xiaomi-mimo' when those env vars are set, even when the test sets CLAUDE_CODE_USE_OPENAI=1 — its 'env-only' check runs BEFORE the USE_OPENAI branch (L658). Without clearing these, leaked values from earlier tests in the same process cause isAnthropicProvider() to return the wrong value, so the new provider-gate assertions fail: getMergedBetas returns [] for the openai provider expected: [] received: [claude-code-20250219, interleaved-thinking-2025-05-14, context-management-2025-06-27, prompt-caching-scope-2026-01-05] isAnthropicProvider is false for the openai/gemini/mistral/xai/minimax expected: false received: true compactConversation provider gate > uses forked-agent cache-sharing for Anthropic providers expected: runForkedAgent called received: 0 calls The previous 'CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED[_ID]' fix was insufficient because those vars don't gate the env-only routes directly; VENICE_API_KEY / MIMO_API_KEY do. Add them to the per-test cleanup in both betas.test.ts (PROVIDER_ENV_KEYS) and compact.test.ts (SAVED_ENV). Skipped: pre-existing CI failures unrelated to this PR's provider gate (autoCompactIfNeeded circuit breaker × 7, getEffectiveContextWindowSize on MiniMax M2, microCompact MCP, /export direct filename, getProjectMemoryPathForSelector). These fail on origin/main with the same pre-PR failures and are out of scope for Gitlawb#1533. * test(3p): use crypto.randomUUID() for message uuids The Message type's uuid field is now a branded UUID type from node:crypto (strings must match the 5-segment UUID format). After rebase onto the latest origin/main, the typecheck failed at L26 and L38 where \\ est-\\\ no longer satisfies the type. crypto.randomUUID() produces a valid UUID per call, so each message gets a unique, type-correct uuid. No behavioral change for the tests themselves — they only rely on the messages being distinct. * test(3p): cast assistantMessage.message through 'as never' The Message type's assistant message field is AssistantMessageContent <BetaContentBlock>, which requires id, model, usage at the type level. The compact code paths under test don't read these fields, so the helper focused on the text content needs to bypass the type check. Pre-rebase this likely passed via a wider union; after the rebase the branded type tightened. 'as never' keeps the helper minimal without having to fabricate realistic id/model/usage values. * test(betas): add non-Claude GitHub regression test for provider gate CodeRabbit P2 review on the most recent round: the provider gate tests cover the GitHub Native Anthropic exception (Claude model) but no sibling assertion for CLAUDE_CODE_USE_GITHUB=1 with a non-Claude model. That untested branch is the risky half of the gate — a future broadening of isGithubNativeAnthropicMode() (e.g. matching on the wrong substring, or matching on 'claude' too permissively) would silently re-introduce Anthropic-only beta headers for OpenAI-style models served via GitHub. Add the inverse case: GitHub provider with OPENAI_MODEL='gpt-4o-mini' must return [] (the gate strips the headers). Verified locally: betas.test.ts runs 19/19 (with the new test) and the betas+compact pair runs 21/21 cleanly, confirming no test-isolation regression between the two files. * docs(compact): update stale "3P default: true" comments CodeRabbit round 10 nitpick: the comment at compact.ts L434-439 still claims "3P default: true" and frames the GB flag as a "kill-switch", but the code is now gated by isAnthropicProvider() so non-Anthropic 3P providers are never on this path in the first place. Update both the compactConversation() block and the streamCompactSummary() block (now points at the first block instead of a separate stale sentence) to describe the actual current behavior: cache-sharing is enabled only for Anthropic-capable providers when tengu_compact_cache_prefix is on; non-Anthropic providers remain incompatible and would send Anthropic-only params they reject. Verified: 2/2 compact.test.ts still passes; no behavioral change, only the comment text. Skipped: rebase to current origin/main (Gitlawb#1533 was rebased earlier in 4c1b4c2; no new conflicts since). * test(3p): also clear NEARAI_API_KEY per test Reproduce locally with `bun test --max-concurrency=1` and find that `providerValidation.test.ts` (which runs alongside the other 3800+ test files in the same CI process) sets NEARAI_API_KEY for one of its neearai-detect tests, and the previous cleanup list missed that env var. So when betas.test.ts and compact.test.ts later run their `isAnthropicProvider is true for firstParty` (and the other `isAnthropicProvider is true for *`) tests, hasNearaiEnvOnlyProviderIntent() returns true on the leaked value, the function classifies the provider as 'nearai' (not Anthropic), and the gate strips all betas. The test that sets no env var then sees `getMergedBetas()` return `[]` instead of the Anthropic list, which is the failure pattern on CI. resolveEnvOnlyProviderRouteId (src/integrations/routeMetadata.ts:428) returns 'nearai' (or 'xai', 'minimax', 'venice', 'xiaomi-mimo') for those env var leak paths, all of which were missed by the prior profile-env-var fix. The previous CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED fix was insufficient because that var doesn't gate the env-only routes directly — only the API keys do. NEARAI_API_KEY is the latest in the family (xai/minimax/venice/mimo all already in the list). Add NEARAI_API_KEY to PROVIDER_ENV_KEYS in betas.test.ts and SAVED_ENV in compact.test.ts, matching the xai/minimax/venice/mimo additions from 768ab65. Verified: betas.test.ts (19/19) and compact.test.ts (2/2) still pass in isolation and as a pair. The full-suite failure should now resolve since the leak path is covered. Skipped: tracing the exact leak site. The 3800+ test files can't all be enumerated; the fix is comprehensive (all known env-only intent vars are now in the clear list). * test(3p): pre-warm importFreshBetas in beforeAll to avoid 5s timeout The first call to importFreshBetas() in this file triggered a 4.3s module load (growthbook feature-flag init), which exceeded Bun's default 5s test timeout. On the agent's local Windows machine, that race is close to deterministic (the openai test, being first in source order, paid the full cost and reported `(fail) this test timed out after 5000ms`). On CI (Ubuntu) the import was usually fast enough to pass — but flaky, and completely unrelated to actual test correctness. Move the slow import to beforeAll so it happens once before any test runs and is not charged to the first test's budget. After pre-warming, every per-test import is sub-second (the 579ms / 8ms timings in the last run confirm this). All 21 tests in the betas+compact pair now pass deterministically. This is the only remaining source-order flake in this file. The actual test logic (provider-gate assertions, env-var isolation, cache-busting imports) was already correct. * test(3p): add FIREWORKS_API_KEY to provider env cleanup CI smoke run on Gitlawb#1533 still shows the openai/gemini/bedrock tests failing in `getAPIProvider` even after the pre-warm + clearProviderEnv fix from 5095f76. Tracing through the call chain reveals the missing env var: 1. `getAPIProvider()` first checks `CLAUDE_CODE_USE_FOUNDRY` (not set, so passes through). It then calls `resolveActiveRouteIdFromEnv(process.env)`. 2. `resolveActiveRouteIdFromEnv` checks the standard `CLAUDE_CODE_USE_*` env vars (all cleared by `clearProviderEnv()`). It then calls `resolveEnvOnlyProviderRouteId(process.env)`. 3. `resolveEnvOnlyProviderRouteId` calls `hasFireworksEnvOnlyProviderIntent`, which returns true when `FIREWORKS_API_KEY` is set AND none of the conflicting API keys are set. 4. When the intent is true, `resolveEnvOnlyProviderRouteId` returns the string `'fireworks'`. The caller (`getAPIProvider`) feeds this into its switch statement, which has no `'fireworks'` case, so it falls into the `default:` branch and returns `'firstParty'`. 5. `isAnthropicProvider()` sees `'firstParty'`, returns true. 6. `getMergedBetas()` takes the anthropic-list branch and returns the betas list instead of `[]`. So a leaked `FIREWORKS_API_KEY` from a prior test (PR Gitlawb#1590 added fireworks to upstream `routeMetadata.ts` as the 6th `*_API_KEY` env var) silently defeats every test in `betas.test.ts` that expects a non-anthropic provider. The previous fix (`65ff2f76`) added `NEARAI_API_KEY`; the Fireworks equivalent was missed because the FIREWORKS route is "env-only" (no `CLAUDE_CODE_USE_FIREWORKS` toggle) and was added more recently in PR Gitlawb#1590. This commit adds `FIREWORKS_API_KEY` to both `PROVIDER_ENV_KEYS` in `betas.test.ts` and `SAVED_ENV` in `compact.test.ts`. Verified: - `bun test --max-concurrency=1 src/utils/betas.test.ts src/services/compact/compact.test.ts` — 21/21 pass deterministically. - A synthetic leak test (sets `FIREWORKS_API_KEY='leaked'` then runs betas.test.ts) confirms the openai/gemini/bedrock/etc. tests now return `[]` and `isAnthropicProvider()` returns `false` even when the key is present in the parent process's env. Without the fix, the test would return the anthropic beta list. After this commit, the CI smoke-and-tests job on Gitlawb#1533 should turn green (the env-var-leak root cause is finally fully covered — the only remaining smoke failures should be the same pre-existing ones observed on origin/main: model cap tests for deepseek/gpt-4o/MiniMax-M2.7, the `showCacheStats` registration test, etc.). * test(3p): scrub provider env vars instead of restoring leaked snapshot The smoke suite runs many test files in one process. Snapshotting process.env at module load time captures values leaked by earlier test files, so restoring from that snapshot re-introduces the leaks between our tests. Delete every provider/profile env var before and after each test instead; each test now starts from a known-clean state and sets only the vars it needs. * debug(3p): add env/provider diagnostics to openai beta test * test(3p): restore real providers.js in betas/compact tests * test(3p): fix providers, diskOutput, and messages mock leaks * test(compact): restore projectInstructions/path/config mocks in afterAll The importCompact() harness replaces these modules wholesale via mock.module(), but afterAll only restored a subset. The projectInstructions stub exported only getProjectInstructionFilePaths, so every other export was undefined process-wide after this file ran. In the full test:full order, runAgent.routing.test.ts then crashed in processMemoryFile() with "path must be of type string, got undefined", turning smoke-and-tests red. Restore the real path, config, and projectInstructions modules in afterAll alongside the others (all three were already pre-imported for this purpose). Verified with: bun test src/services/compact/compact.test.ts \ src/tools/AgentTool/runAgent.routing.test.ts --max-concurrency=1 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(compact): keep cache-sharing for GitHub Native Anthropic compaction The compaction cache-sharing gate used isAnthropicProvider() alone, which is false for getAPIProvider() === 'github'. But GitHub Native Anthropic mode (CLAUDE_CODE_USE_GITHUB=1 with a Claude model) routes through the native Anthropic client where cache_control / prompt caching works, and the beta-header gate already preserves Anthropic betas for it. As a result compaction always took the cold-cache path for those sessions, reintroducing the cost the forked-agent flow avoids. Add isCompactionCacheSharingCompatible(model) = isAnthropicProvider() || isGithubNativeAnthropicMode(model), mirroring the betas.ts gate, and use it at both compactConversation() and streamCompactSummary() sites. Adds a compact provider-gate test for GitHub Native Anthropic mode. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit db2d093)
Filter Anthropic-only params from 3P provider requests
Fixes #248. The community audit identified 5 bugs across 3 files that could
cause 400 errors or context corruption when sending requests to non-Anthropic
providers (OpenAI, OpenRouter, Ollama, etc.):
betas.ts: An emptybetasarray is now sent for non-Anthropic providers(previously the Anthropic beta header list was being sent to all providers).
openaiShim.ts:redacted_thinkingcontent blocks are now filtered frommessages before they reach OpenAI-compatible endpoints (previously
serialized verbatim, causing some providers to reject the request).
compact.ts: Cache-sharing compaction is disabled for non-Anthropicproviders (previously was unconditionally enabled, which can break providers
that don't support Anthropic's prompt cache format).
What changed
src/utils/betas.ts—isAnthropicProvider()guardsrc/utils/openaiShim.ts— filterredacted_thinkingblockssrc/utils/compact/autoCompact.ts+src/utils/compact/compact.ts— gatecache-sharing path on provider
src/utils/betas.test.ts— provider gating tests, env varisolation in
beforeEach/afterEachsrc/services/compact/autoCompact.test.tsandcompact.test.ts—provider profile env state cleanup
Testing
bun test src/utils/betas.test.ts— gating tests pass with properbeforeEach/afterEach env cleanup
bun test src/services/compact/autoCompact.test.tsandcompact.test.ts— provider profile env vars cleared in afterEach, notest bleed
bun run check— full suite green in focused test runsSummary by CodeRabbit