feat: Multi-Key API Key Rotation with Automatic Failover (KeyPool) #60

@ariel42

Summary

This feature introduces KeyPool, a transparent multi-key rotation and failover system for all API providers. Users can supply multiple API keys per provider as a comma-separated list in the existing environment variable, and Claudish will automatically rotate to the next key when a request fails.

# Before: single key, single point of failure
export GEMINI_API_KEY="key1"

# After: multiple keys, automatic rotation on errors
export GEMINI_API_KEY="key1,key2,key3"

No CLI changes, no configuration flags — just add more keys to the same environment variable.

Motivation

API rate limits (HTTP 429) are a common friction point when using LLM providers heavily. When a key is rate-limited, the user must wait or manually switch keys. Similarly, expired or revoked keys cause hard failures with no recovery path. Multi-key rotation addresses this:

  • Rate limit resilience: When one key hits 429, the next key is tried immediately
  • Key lifecycle management: Expired or invalid keys are skipped automatically
  • Zero-downtime key rotation: Add a new key to the list, remove old ones later
  • Team and CI use cases: Distribute load across multiple API keys to stay within per-key quotas

Architecture

KeyPool Class

KeyPool is the core abstraction, located at handlers/shared/key-pool.ts:

  • Initialization: Parses a comma-separated key string, trims whitespace, filters empty entries
  • Round-robin rotation: Keys are tried in list order, wrapping around at the end; the index stays on the current key after a success and advances to the next key after a failure
  • executeWithFailover(fetchFn): The main entry point — wraps a fetch function with automatic key rotation:
    • On rotatable HTTP errors (401, 402, 403, 408, 429, 500–504): rotates to the next key and retries
    • On error statuses outside the rotatable set (e.g., 400): inspects the response body for provider-specific invalid-key patterns; on a match, rotates to the next key and retries
    • On network errors (thrown exceptions): rotates and retries
    • On non-retryable errors (e.g., a 400 validation error with no invalid-key pattern in the body): returns the response immediately without consuming the remaining keys
    • On all keys exhausted: returns the last failed response (preserving body for error details) or re-throws the last network error
  • Resource safety: Drains previous response bodies to prevent connection and memory leaks, while preserving the last response for the caller
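
The failover rules above can be sketched roughly as follows. This is a minimal illustration, not the code in key-pool.ts: KeyPool, executeWithFailover and ROTATABLE_STATUS_CODES are names taken from this description, while the FetchFn, BodyCheck and Outcome types, the size getter, the constructor's second parameter, and the choice to throw on an empty pool are assumptions made for the example.

```typescript
// Sketch only. Names other than KeyPool, executeWithFailover and
// ROTATABLE_STATUS_CODES are hypothetical.
const ROTATABLE_STATUS_CODES = new Set([401, 402, 403, 408, 429, 500, 501, 502, 503, 504]);

type FetchFn = (apiKey: string) => Promise<Response>;
type BodyCheck = (res: Response) => Promise<boolean>;
type Outcome = { kind: "response"; res: Response } | { kind: "error"; err: unknown };

class KeyPool {
  private readonly keys: string[];
  private readonly isInvalidKeyBody: BodyCheck;
  private index = 0;

  // isInvalidKeyBody is a hypothetical hook for the body-based check.
  constructor(rawKeys: string, isInvalidKeyBody: BodyCheck = async () => false) {
    // Split on commas, trim whitespace, drop empty entries.
    this.keys = rawKeys.split(",").map((k) => k.trim()).filter((k) => k.length > 0);
    this.isInvalidKeyBody = isInvalidKeyBody;
  }

  get size(): number {
    return this.keys.length;
  }

  async executeWithFailover(fetchFn: FetchFn): Promise<Response> {
    // Assumption: this sketch rejects an empty pool; the real class may not.
    if (this.keys.length === 0) throw new Error("KeyPool: no API keys configured");
    let last: Outcome | undefined;

    // At most one attempt per configured key.
    for (let attempt = 0; attempt < this.keys.length; attempt++) {
      let current: Outcome;
      try {
        const res = await fetchFn(this.keys[this.index]);
        const rotate =
          ROTATABLE_STATUS_CODES.has(res.status) ||
          (!res.ok && (await this.isInvalidKeyBody(res)));
        // Success or non-retryable error: return now, index stays on this key.
        if (!rotate) return res;
        current = { kind: "response", res };
      } catch (err) {
        current = { kind: "error", err }; // network error: rotate and retry
      }
      // Drain the superseded response so its connection can be released.
      if (last?.kind === "response") await last.res.body?.cancel().catch(() => {});
      last = current;
      this.index = (this.index + 1) % this.keys.length;
    }

    // All keys exhausted: surface whatever the final attempt produced.
    if (last?.kind === "response") return last.res;
    throw last?.err;
  }
}
```

Note that the shared index in this sketch is updated without any coordination between concurrent calls, so overlapping requests may skip or repeat keys; how the real class handles that case is not shown here.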

Response Body-Based Invalid Key Detection

Some providers signal an invalid API key with a status code outside the rotatable set. For example, Gemini returns HTTP 400 (not normally retryable) with reason: "API_KEY_INVALID" in the response body. The isInvalidApiKeyResponse() method clones the response, so the caller can still read the original body, and inspects the clone for known patterns:

| Provider | Status | Detection Pattern |
| --- | --- | --- |
| Gemini | 400 | error.details[].reason === "API_KEY_INVALID" |
| OpenAI | 401 | error.code === "invalid_api_key" |
| Anthropic | 401 | error.type === "authentication_error" |
| Generic | varies | Message contains "API key not valid", "invalid api key", "invalid credentials", "incorrect api key" |

This provides defense-in-depth: even if a provider changes the status code it uses for an invalid key, the body-level check still catches it. For a non-JSON response the method returns false, so an unparseable body never triggers rotation.
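
A body check along these lines might look like the sketch below. The function name and the four detection patterns come from the table above; reading the generic message from error.message, matching it case-insensitively, and the INVALID_KEY_MESSAGES constant are assumptions, and the real method may be structured differently.

```typescript
// Sketch only: assumes the generic message lives at error.message and is
// matched case-insensitively. INVALID_KEY_MESSAGES is a hypothetical name.
const INVALID_KEY_MESSAGES = [
  "api key not valid",
  "invalid api key",
  "invalid credentials",
  "incorrect api key",
];

async function isInvalidApiKeyResponse(res: Response): Promise<boolean> {
  let body: unknown;
  try {
    // Read a clone so the caller can still consume the original body.
    body = await res.clone().json();
  } catch {
    return false; // non-JSON or unreadable body: never treated as an invalid key
  }

  const error = (body as { error?: unknown } | null)?.error;
  if (typeof error !== "object" || error === null) return false;
  const e = error as { code?: unknown; type?: unknown; message?: unknown; details?: unknown };

  // Gemini: error.details[].reason === "API_KEY_INVALID"
  if (Array.isArray(e.details) && e.details.some((d) => d?.reason === "API_KEY_INVALID")) {
    return true;
  }
  // OpenAI: error.code === "invalid_api_key"
  if (e.code === "invalid_api_key") return true;
  // Anthropic: error.type === "authentication_error"
  if (e.type === "authentication_error") return true;
  // Generic: known phrases in the error message
  if (typeof e.message === "string") {
    const message = e.message.toLowerCase();
    return INVALID_KEY_MESSAGES.some((phrase) => message.includes(phrase));
  }
  return false;
}
```

Because only the clone is read, a caller that receives false can still pass the original response, body intact, back to the user.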

Handler Integration

All 7 provider handlers have been updated to use KeyPool:

| Handler | Provider | Key Env Var |
| --- | --- | --- |
| base-gemini-handler.ts | Gemini | GEMINI_API_KEY |
| openai-handler.ts | OpenAI | OPENAI_API_KEY |
| anthropic-compat-handler.ts | Anthropic, MiniMax, Kimi | Provider-specific |
| openrouter-handler.ts | OpenRouter | OPENROUTER_API_KEY |
| ollamacloud-handler.ts | OllamaCloud | OLLAMA_API_KEY |
| litellm-handler.ts | LiteLLM | LITELLM_API_KEY |
| remote-provider-handler.ts | Generic remote providers | Registry-defined |

Each handler wraps its fetch() call inside keyPool.executeWithFailover(), with the API key injected per-attempt rather than once at construction time.
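
As a rough illustration of per-attempt key injection, a handler's request path could look like the sketch below. The FailoverPool interface, the FetchLike type, the sendWithKeyPool function and the Bearer authorization header are all assumptions for the example; each provider uses its own header or query parameter, and the real handlers are not shown in this description.

```typescript
// Sketch only: FailoverPool, FetchLike and sendWithKeyPool are hypothetical
// names; only executeWithFailover comes from the description.
interface FailoverPool {
  executeWithFailover(fetchFn: (apiKey: string) => Promise<Response>): Promise<Response>;
}

type FetchLike = (url: string, init: RequestInit) => Promise<Response>;

async function sendWithKeyPool(
  pool: FailoverPool,
  url: string,
  payload: unknown,
  fetchImpl: FetchLike = fetch,
): Promise<Response> {
  // Serialize once; only the credential changes between attempts.
  const body = JSON.stringify(payload);

  return pool.executeWithFailover((apiKey) =>
    fetchImpl(url, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: a Bearer token header. Built inside the callback so
        // every retry carries the key chosen for that attempt.
        Authorization: `Bearer ${apiKey}`,
      },
      body,
    }),
  );
}
```

Building the headers inside the callback, rather than once when the handler is constructed, is what lets a retry go out with a different key.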

Backward Compatibility

  • Single key: KeyPool works transparently with a single key (no rotation, no extra logging)
  • No key: Handlers that don't require keys continue to work unchanged
  • Logging: Multi-key rotation is only logged when more than one key is configured, keeping single-key output clean

Test Coverage (116 tests)

Unit Tests (tests/key-pool.test.ts — 1,990 lines)

  • Core KeyPool: Initialization (empty, single, multi-key, whitespace handling), round-robin rotation, index advancement, reset
  • executeWithFailover: Success on first try, failover through keys, all-keys-exhausted (both HTTP and network errors), non-retryable status passthrough, response body draining, mixed error types
  • Status code handling: Every code in ROTATABLE_STATUS_CODES verified, non-rotatable codes confirmed as passthrough
  • Invalid key detection: Every detection pattern (Gemini, OpenAI, Anthropic, generic messages), negative cases (normal 400, non-JSON body), body preservation via clone, all-keys-exhausted with body detection
  • Concurrency resilience: Concurrent executeWithFailover calls, large key pools (10+ keys), key exhaustion ordering

Live Provider Canary Tests (5 tests)

Real HTTP requests to Gemini, OpenAI, Anthropic, OpenRouter, and OllamaCloud with deliberately invalid API keys. Each response is piped through isInvalidApiKeyResponse() to verify the detection logic matches what the provider actually returns. If any provider changes its error format, the corresponding canary test fails, alerting maintainers to update the detection logic. This reduces the risk of silent regressions caused by provider API contract changes.
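
The canary idea can be sketched as a small runner that takes one request function per provider and a detector. The Canary and Detector types and the runCanaries function are hypothetical names; the request functions are injected so that real endpoints and invalid keys, which are not given in this description, stay out of the sketch.

```typescript
// Sketch only: Canary, Detector and runCanaries are hypothetical names.
interface Canary {
  provider: string;
  // Sends a real request using a deliberately invalid key.
  sendWithInvalidKey: () => Promise<Response>;
}

type Detector = (res: Response) => Promise<boolean>;

// Returns one message per provider whose invalid-key response was not
// recognized; an empty array means every canary passed.
async function runCanaries(canaries: Canary[], detect: Detector): Promise<string[]> {
  const failures: string[] = [];
  for (const canary of canaries) {
    try {
      const res = await canary.sendWithInvalidKey();
      if (!(await detect(res))) {
        failures.push(`${canary.provider}: HTTP ${res.status} not recognized as an invalid key`);
      }
    } catch (err) {
      // A network failure is reported separately from a format change.
      failures.push(`${canary.provider}: request failed (${String(err)})`);
    }
  }
  return failures;
}
```

In a test suite, the detector passed in would be isInvalidApiKeyResponse and the test would assert that the returned list is empty.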

Files Changed (37)

  • packages/cli/src/handlers/shared/key-pool.ts — new KeyPool class
  • packages/cli/src/handlers/shared/gemini-retry.ts — Gemini-specific retry integration
  • packages/cli/src/handlers/shared/remote-provider-handler.ts — generic remote provider KeyPool integration
  • packages/cli/src/handlers/*.ts — all 7 handlers updated to use KeyPool
  • tests/key-pool.test.ts — 116 tests (1,990 lines)
  • tests/gemini-retry.test.ts — Gemini retry tests
  • Synced to packages/core/ and src/
