feat: Multi-Key API Key Rotation with Automatic Failover (KeyPool) #60
Description
Summary
This feature introduces KeyPool, a transparent multi-key rotation and failover system for all API providers. Users can supply multiple API keys per provider via comma-separated environment variables, and Claudish will automatically distribute requests across keys with intelligent failover on errors.
# Before: single key, single point of failure
export GEMINI_API_KEY="key1"
# After: multiple keys, automatic rotation on errors
export GEMINI_API_KEY="key1,key2,key3"
No CLI changes, no configuration flags: just add more keys to the same environment variable.
Motivation
API rate limits (HTTP 429) are a common friction point when using LLM providers heavily. When a key is rate-limited, the user must wait or manually switch keys. Similarly, expired or revoked keys cause hard failures with no recovery path. Multi-key rotation addresses this:
- Rate limit resilience: When one key hits 429, the next key is tried immediately
- Key lifecycle management: Expired or invalid keys are skipped automatically
- Zero-downtime key rotation: Add a new key to the list, remove old ones later
- Team and CI use cases: Distribute load across multiple API keys to stay within per-key quotas
Architecture
KeyPool Class
KeyPool is the core abstraction, located at handlers/shared/key-pool.ts:
- Initialization: Parses a comma-separated key string, trims whitespace, filters empty entries
- Round-robin rotation: Keys are tried in order; the index sticks on success and advances on failure
- executeWithFailover(fetchFn): The main entry point; wraps a fetch function with automatic key rotation:
  - On rotatable HTTP errors (401, 402, 403, 408, 429, 500–504): rotates to the next key and retries
  - On body-detected invalid keys (non-rotatable status codes like 400): inspects the response body for provider-specific invalid key patterns before giving up
  - On network errors (thrown exceptions): rotates and retries
  - On non-retryable errors (e.g., 400 validation errors): returns immediately without wasting remaining keys
  - On all keys exhausted: returns the last failed response (preserving body for error details) or re-throws the last network error
- Resource safety: Drains previous response bodies to prevent connection and memory leaks, while preserving the last response for the caller
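The behaviour listed above can be illustrated with a minimal sketch. This is not the code in handlers/shared/key-pool.ts: the class name KeyPoolSketch, the FetchWithKey type, and the size getter are hypothetical, body-based invalid-key detection is left out, and the choice to prefer the last failed response over a later network error once all keys are exhausted is an assumption.

```typescript
// Hypothetical sketch of the rotation logic; NOT the actual key-pool.ts.
// Status codes are the ones listed in the description: 401, 402, 403, 408, 429, 500-504.
const ROTATABLE_STATUS_CODES = new Set([401, 402, 403, 408, 429, 500, 501, 502, 503, 504]);

// Assumed callback shape: the pool hands the current key to the caller's fetch.
type FetchWithKey = (apiKey: string) => Promise<Response>;

class KeyPoolSketch {
  private readonly keys: string[];
  private index = 0; // sticks on success, advances on failure

  constructor(rawKeys: string) {
    // Split on commas, trim whitespace, drop empty entries.
    this.keys = rawKeys
      .split(",")
      .map((k) => k.trim())
      .filter((k) => k.length > 0);
  }

  get size(): number {
    return this.keys.length;
  }

  async executeWithFailover(fetchFn: FetchWithKey): Promise<Response> {
    if (this.keys.length === 0) {
      throw new Error("KeyPoolSketch: no API keys configured");
    }
    let lastResponse: Response | undefined;
    let lastError: unknown;

    // Try each key at most once per call, starting from the current index.
    for (let attempt = 0; attempt < this.keys.length; attempt++) {
      const key = this.keys[this.index];
      try {
        const response = await fetchFn(key);
        if (!ROTATABLE_STATUS_CODES.has(response.status)) {
          // Success or a non-rotatable status: return without advancing the index.
          return response;
        }
        // Drain the previously kept failed response before replacing it.
        await lastResponse?.body?.cancel().catch(() => {});
        lastResponse = response;
      } catch (err) {
        // Network error (thrown exception): remember it and rotate.
        lastError = err;
      }
      this.index = (this.index + 1) % this.keys.length;
    }

    // All keys exhausted. Assumption: a failed response, if any, wins over an error.
    if (lastResponse) return lastResponse;
    throw lastError;
  }
}
```

In this sketch a caller would write something like `pool.executeWithFailover((key) => fetch(url, { headers: { Authorization: "Bearer " + key } }))`; the header format is illustrative only and differs between providers.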
Response Body-Based Invalid Key Detection
Some providers return non-standard status codes for invalid API keys. For example, Gemini returns HTTP 400 (not normally retryable) with reason: "API_KEY_INVALID" in the response body. The isInvalidApiKeyResponse() method clones the response and inspects the body for known patterns:
| Provider | Status | Detection Pattern |
|---|---|---|
| Gemini | 400 | error.details[].reason === "API_KEY_INVALID" |
| OpenAI | 401 | error.code === "invalid_api_key" |
| Anthropic | 401 | error.type === "authentication_error" |
| Generic | varies | Message contains "API key not valid", "invalid api key", "invalid credentials", "incorrect api key" |
This provides defense-in-depth: even if a provider changes their error status code, the body-level check catches it. Non-JSON responses gracefully return false (no false positives).
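As a rough illustration of the patterns in the table, a body check could look like the sketch below. The function name isInvalidApiKeyResponseSketch and the ProviderErrorBody type are hypothetical, and case-insensitive matching of the generic messages is an assumption; the actual isInvalidApiKeyResponse() may differ in these details.

```typescript
// Hypothetical sketch of body-based detection; NOT the actual isInvalidApiKeyResponse().
// Generic message fragments from the table, lower-cased for comparison (assumption).
const INVALID_KEY_MESSAGES = [
  "api key not valid",
  "invalid api key",
  "invalid credentials",
  "incorrect api key",
];

// Assumed loose shape of a provider error payload.
interface ProviderErrorBody {
  error?: {
    code?: unknown;
    type?: unknown;
    message?: unknown;
    details?: unknown;
  };
}

async function isInvalidApiKeyResponseSketch(response: Response): Promise<boolean> {
  let payload: ProviderErrorBody | null;
  try {
    // Clone first so the caller can still read the original body afterwards.
    payload = (await response.clone().json()) as ProviderErrorBody | null;
  } catch {
    // Non-JSON body: report "not an invalid-key response".
    return false;
  }

  const error = payload?.error;
  if (typeof error !== "object" || error === null) return false;

  // Gemini: error.details[].reason === "API_KEY_INVALID"
  if (Array.isArray(error.details) && error.details.some((d) => d?.reason === "API_KEY_INVALID")) {
    return true;
  }
  // OpenAI: error.code === "invalid_api_key"
  if (error.code === "invalid_api_key") return true;
  // Anthropic: error.type === "authentication_error"
  if (error.type === "authentication_error") return true;

  // Generic: message contains one of the known fragments.
  if (typeof error.message === "string") {
    const message = error.message.toLowerCase();
    return INVALID_KEY_MESSAGES.some((fragment) => message.includes(fragment));
  }
  return false;
}
```

Because the check reads a clone, the original response body is still available to whoever receives the response afterwards.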
Handler Integration
All 7 provider handlers have been updated to use KeyPool:
| Handler | Provider | Key Env Var |
|---|---|---|
| base-gemini-handler.ts | Gemini | GEMINI_API_KEY |
| openai-handler.ts | OpenAI | OPENAI_API_KEY |
| anthropic-compat-handler.ts | Anthropic, MiniMax, Kimi | Provider-specific |
| openrouter-handler.ts | OpenRouter | OPENROUTER_API_KEY |
| ollamacloud-handler.ts | OllamaCloud | OLLAMA_API_KEY |
| litellm-handler.ts | LiteLLM | LITELLM_API_KEY |
| remote-provider-handler.ts | Generic remote providers | Registry-defined |
Each handler wraps its fetch() call inside keyPool.executeWithFailover(), with the API key injected per-attempt rather than once at construction time.
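The per-attempt injection described above might look roughly like this. The KeyPoolLike interface, the sendWithKeyPool helper, the FetchLike type, and the Bearer-style Authorization header are all hypothetical and chosen for illustration; each real handler builds its own request and uses its provider's authentication scheme.

```typescript
// Hypothetical sketch of handler integration; NOT taken from any of the handlers.
// Minimal view of the pool: only the method the handler needs.
interface KeyPoolLike {
  executeWithFailover(fetchFn: (apiKey: string) => Promise<Response>): Promise<Response>;
}

// Assumed request shape, kept small so the example stays self-contained.
interface JsonPostInit {
  method: string;
  headers: Record<string, string>;
  body: string;
}
type FetchLike = (url: string, init: JsonPostInit) => Promise<Response>;

async function sendWithKeyPool(
  pool: KeyPoolLike,
  url: string,
  payload: unknown,
  doFetch: FetchLike = fetch, // injectable for tests; defaults to the global fetch
): Promise<Response> {
  // The request is built inside the callback, so each attempt uses the key the
  // pool hands in, rather than a key captured once at construction time.
  return pool.executeWithFailover((apiKey) =>
    doFetch(url, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`, // illustrative; header differs per provider
      },
      body: JSON.stringify(payload),
    }),
  );
}
```

Building the headers inside the callback is the important part: a key captured outside it would be reused unchanged on every retry.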
Backward Compatibility
- Single key: KeyPool works transparently with a single key (no rotation, no extra logging)
- No key: Handlers that don't require keys continue to work unchanged
- Logging: Multi-key rotation is only logged when more than one key is configured, keeping single-key output clean
Test Coverage (116 tests)
Unit Tests (tests/key-pool.test.ts — 1,990 lines)
- Core KeyPool: Initialization (empty, single, multi-key, whitespace handling), round-robin rotation, index advancement, reset
- executeWithFailover: Success on first try, failover through keys, all-keys-exhausted (both HTTP and network errors), non-retryable status passthrough, response body draining, mixed error types
- Status code handling: Every code in ROTATABLE_STATUS_CODES verified, non-rotatable codes confirmed as passthrough
- Invalid key detection: Every detection pattern (Gemini, OpenAI, Anthropic, generic messages), negative cases (normal 400, non-JSON body), body preservation via clone, all-keys-exhausted with body detection
- Concurrency resilience: Concurrent executeWithFailover calls, large key pools (10+ keys), key exhaustion ordering
Live Provider Canary Tests (5 tests)
Real HTTP requests are sent to Gemini, OpenAI, Anthropic, OpenRouter, and OllamaCloud with deliberately invalid API keys. Each response is passed through isInvalidApiKeyResponse() to verify that the detection logic matches what the provider actually returns. If a provider changes its error format, the corresponding canary test fails, alerting maintainers to update the detection logic. This reduces the risk of silent regressions caused by provider API contract changes.
Files Changed (37)
- packages/cli/src/handlers/shared/key-pool.ts: new KeyPool class
- packages/cli/src/handlers/shared/gemini-retry.ts: Gemini-specific retry integration
- packages/cli/src/handlers/shared/remote-provider-handler.ts: generic remote provider KeyPool integration
- packages/cli/src/handlers/*.ts: all 7 handlers updated to use KeyPool
- tests/key-pool.test.ts: 116 tests (1,990 lines)
- tests/gemini-retry.test.ts: Gemini retry tests
- Synced to packages/core/ and src/