feat: Multi-Key API Key Rotation with Automatic Failover (KeyPool) #61
ariel42 wants to merge 3 commits into MadAppGang:main from …
…e status code handling Implement KeyPool class providing round-robin rotation across multiple comma-separated API keys with transparent failover. Integrate into all provider handlers (Gemini, OpenAI, OpenRouter, Anthropic-compat, OllamaCloud, LiteLLM, remote-provider). Rotatable status codes (401, 402, 403, 408, 429, 500, 502, 503, 504) trigger key rotation; all other error codes propagate immediately. Includes 99 unit tests covering failover logic, body drain, index advancement, provider auth patterns, and edge cases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…e status code handling Implement KeyPool class providing sticky-key rotation across multiple comma-separated API keys with transparent failover. Keys are reused until they fail — only errors trigger advancement to the next key. Integrate into all provider handlers (Gemini, OpenAI, OpenRouter, Anthropic-compat, OllamaCloud, LiteLLM, remote-provider). Rotatable status codes (401, 402, 403, 408, 429, 500, 502, 503, 504) trigger key rotation; all other error codes propagate immediately. Includes 99 unit tests covering failover logic, sticky-key behavior, body drain, index advancement, provider auth patterns, and edge cases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add isInvalidApiKeyResponse() to KeyPool that inspects response bodies for provider-specific invalid key patterns (Gemini API_KEY_INVALID, OpenAI invalid_api_key, Anthropic authentication_error, and generic message matching). This enables key rotation even when the HTTP status code alone wouldn't trigger it (e.g., Gemini returns 400 for invalid keys).
- Make isInvalidApiKeyResponse public for testability
- Expand executeWithFailover to check body after status-code check
- Add 11 mocked unit tests for all detection patterns and edge cases
- Add 5 live canary tests hitting real provider APIs with invalid keys to detect API contract changes automatically
- Sync key-pool.ts to packages/core and src/
The KeyPool core is well-designed. Round-robin with failover, body-draining between attempts, preserving the last response for error details, and the …
But the integration into handlers needs rework:
Every handler has the same …
The changes are tripled across …
In …
The Kimi OAuth fallback is inside the single-key else branch in …
Also note that PR #70 (provider refactor) rewrites most of the same files. If that merges first, this PR will need a full rewrite against the new architecture. Might be worth waiting to see how #70 lands.
The feature itself is valuable though. Happy to help figure out the right integration approach.
Putting this on hold until PR #70 (provider refactor) lands. That PR rewrites every handler file this touches, so integrating KeyPool against the current architecture would just be throwaway work.
The KeyPool class itself is solid and we'll use it. Once #70 gives us the new 3-layer architecture, the integration will be cleaner since there's one handler ( …
I'll ping you when #70 is in and we're ready to wire KeyPool into the new transport layer.
Closes #60
Overview
Introduces KeyPool — a transparent multi-key rotation and failover system. Users supply multiple comma-separated API keys per provider, and Claudish automatically distributes requests across them with intelligent failover on rate limits, authentication errors, and transient failures.
Zero configuration — works with any existing single-key setup, no CLI changes needed.
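As a rough illustration of why a single-key setup keeps working, here is a minimal sketch of splitting a comma-separated key string. The helper name parseApiKeys, the whitespace trimming, and the dropping of empty segments are assumptions for illustration, not taken from the PR's key-pool.ts.

```typescript
// Hypothetical helper (not from the PR): turn a comma-separated key string
// into a list of keys. A value with no commas yields a one-element list,
// so an existing single-key configuration needs no changes.
function parseApiKeys(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw
    .split(",")
    .map((key) => key.trim()) // assumption: surrounding whitespace is ignored
    .filter((key) => key.length > 0); // assumption: empty segments are dropped
}
```

With this shape, a pool built from one key and a pool built from several go through the same code path; the single-key case simply has nothing to rotate to.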
Key Design Decisions
Failover Strategy
executeWithFailover(fetchFn) wraps each handler's fetch call:
… API_KEY_INVALID, not normally retryable, but the body reveals it's a key issue)
Response Body-Based Invalid Key Detection
isInvalidApiKeyResponse() clones the response and checks for known patterns:
- Gemini: error.details[].reason === "API_KEY_INVALID"
- OpenAI: error.code === "invalid_api_key"
- Anthropic: error.type === "authentication_error"
- Generic: message matching ("API key not valid", "invalid credentials", etc.)
Non-JSON bodies gracefully return false, so an unparseable body never triggers a false positive.
Resource Safety
Previous response bodies are drained between attempts to prevent connection and memory leaks. The last response body is preserved for the caller to read error details (quota info, retry-after hints, etc.).
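To make the failover, body-detection, and draining behavior described above concrete, here is a simplified sketch. It is not the PR's KeyPool: the names SketchKeyPool and looksLikeInvalidKey are hypothetical, and the JSON shapes are assumed from the patterns listed in this description rather than verified against each provider.

```typescript
// Status codes the PR lists as rotatable; any other error propagates immediately.
const ROTATABLE_STATUS = new Set([401, 402, 403, 408, 429, 500, 502, 503, 504]);

// Body-based check on a clone, so the original body stays readable by the caller.
async function looksLikeInvalidKey(response: Response): Promise<boolean> {
  try {
    const body = await response.clone().json();
    const error = body?.error;
    if (!error) return false;
    const details: Array<{ reason?: string }> = Array.isArray(error.details) ? error.details : [];
    if (details.some((d) => d?.reason === "API_KEY_INVALID")) return true; // Gemini
    if (error.code === "invalid_api_key") return true; // OpenAI
    if (error.type === "authentication_error") return true; // Anthropic
    const message = typeof error.message === "string" ? error.message.toLowerCase() : "";
    return message.includes("api key not valid") || message.includes("invalid credentials");
  } catch {
    return false; // non-JSON body: never treated as an invalid-key response
  }
}

class SketchKeyPool {
  private keys: string[];
  private index = 0; // sticky: only advances when the current key fails

  constructor(keys: string[]) {
    if (keys.length === 0) throw new Error("at least one API key is required");
    this.keys = keys;
  }

  async executeWithFailover(fetchFn: (apiKey: string) => Promise<Response>): Promise<Response> {
    let last: Response | undefined;
    for (let attempt = 0; attempt < this.keys.length; attempt++) {
      // Drain the previous failed response before issuing the next attempt.
      if (last) await last.arrayBuffer().catch(() => undefined);

      const response = await fetchFn(this.keys[this.index]);
      if (response.ok) return response;

      const keyProblem =
        ROTATABLE_STATUS.has(response.status) || (await looksLikeInvalidKey(response));
      if (!keyProblem) return response; // non-rotatable error: hand it back untouched

      last = response;
      this.index = (this.index + 1) % this.keys.length; // move on to the next key
    }
    // Every key failed; the final body is left unread so the caller can inspect it.
    return last as Response;
  }
}
```

In this sketch each call tries every key at most once, and only the final failed response reaches the caller with its body intact; earlier failures are drained before the next attempt.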
Handler Changes
All 7 provider handlers updated to use KeyPool:
- base-gemini-handler.ts / gemini-handler.ts
- openai-handler.ts
- anthropic-compat-handler.ts
- openrouter-handler.ts
- ollamacloud-handler.ts
- litellm-handler.ts
- remote-provider-handler.ts
Each handler injects the API key per-attempt via the failover callback rather than once at construction time.
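The per-attempt injection could look roughly like the following. This is a sketch under assumptions: the FailoverPool interface, the sendChatRequest function, the injectable doFetch parameter, and the Bearer-style Authorization header are all hypothetical, and real handlers may authenticate differently per provider.

```typescript
// Minimal shape of the pool a handler depends on (hypothetical interface).
interface FailoverPool {
  executeWithFailover(fetchFn: (apiKey: string) => Promise<Response>): Promise<Response>;
}

// Hypothetical handler function: the headers are built inside the callback,
// so every attempt carries whichever key the pool hands out for that attempt.
async function sendChatRequest(
  pool: FailoverPool,
  url: string,
  payload: unknown,
  doFetch: typeof fetch = fetch, // injectable for testing
): Promise<Response> {
  return pool.executeWithFailover((apiKey) =>
    doFetch(url, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`, // assumption: bearer-style auth
      },
      body: JSON.stringify(payload),
    }),
  );
}
```

Because nothing outside the callback holds a key, a retry with a different key needs no changes to the request-building code.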
Test Coverage (116 tests, 1,990 lines)
… isInvalidApiKeyResponse() to catch provider API contract changes
Stats
37 files changed, 4,337 insertions, 321 deletions