Add multi-provider support via LiteLLM, removing hard dependency on OpenRouter #165

Open

getmykhan wants to merge 1 commit into karpathy:master from getmykhan:feat/multi-provider-litellm

Conversation

@getmykhan

Add multi-provider support via LiteLLM, removing hard dependency on OpenRouter

Previously, all LLM calls were routed exclusively through OpenRouter using
raw httpx requests. This change introduces LiteLLM as a unified LLM client,
allowing users to call provider APIs directly (OpenAI, Anthropic, Google
Gemini, xAI, DeepSeek, etc.) by setting the corresponding API key in .env.
OpenRouter is preserved as an automatic fallback for any model that lacks a
direct provider key.

Why this matters:

  • Lower latency: direct API calls skip the OpenRouter proxy
  • Lower cost: no OpenRouter markup on per-token pricing
  • Flexibility: users can mix and match — e.g. use OpenAI directly but
    route xAI through OpenRouter
  • No breaking change: setting only OPENROUTER_API_KEY still works exactly
    as before

How routing works (see the sketch after this list):

  1. resolve_model() checks if a direct API key exists for the model's
    provider (e.g. OPENAI_API_KEY for openai/* models)
  2. If yes, the model is passed to litellm as-is (native API call)
  3. If no, the model is prefixed with "openrouter/" and the provider
    prefix is translated (e.g. gemini/ -> google/, xai/ -> x-ai/) to
    match OpenRouter's naming convention
  4. If neither a direct key nor OPENROUTER_API_KEY is set, a clear
    ValueError is raised telling the user which env var to set
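
A minimal sketch of that resolution logic, assuming illustrative contents for
PROVIDER_KEY_MAP and LITELLM_TO_OPENROUTER_PROVIDER (both names come from this
PR's config.py, but the exact tables and signature here may differ from the
actual code):

```python
import os

# Illustrative contents -- the PR's actual tables in config.py may differ.
PROVIDER_KEY_MAP = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "xai": "XAI_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
}

# LiteLLM provider prefix -> OpenRouter provider prefix (step 3).
LITELLM_TO_OPENROUTER_PROVIDER = {"gemini": "google", "xai": "x-ai"}

def resolve_model(model: str) -> str:
    """Map a litellm-style model id to the id that is actually called."""
    provider, _, rest = model.partition("/")

    # Steps 1-2: direct provider key present -> native API call, model as-is.
    key_var = PROVIDER_KEY_MAP.get(provider)
    if key_var and os.environ.get(key_var):
        return model

    # Step 3: fall back to OpenRouter, translating the provider prefix.
    if os.environ.get("OPENROUTER_API_KEY"):
        or_provider = LITELLM_TO_OPENROUTER_PROVIDER.get(provider, provider)
        return f"openrouter/{or_provider}/{rest}"

    # Step 4: no usable key -> tell the user which env var to set.
    raise ValueError(
        f"No API key found for model '{model}'. "
        f"Set {key_var or 'the provider key'} or OPENROUTER_API_KEY in .env."
    )
```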

Changes:

  • backend/llm_client.py: New unified LLM client using litellm.acompletion(),
    replacing the httpx-based openrouter.py. Same query_model() and
    query_models_parallel() interface for zero-touch migration (sketched after
    this list).
  • backend/config.py: Added PROVIDER_KEY_MAP, LITELLM_TO_OPENROUTER_PROVIDER,
    resolve_model(), and print_routing_info(). Model identifiers now use the
    litellm naming convention (e.g. "gemini/" instead of "google/"). Added
    UTILITY_MODEL config for lightweight tasks (title generation).
  • backend/council.py: Updated imports from openrouter -> llm_client. Title
    generation now uses configurable UTILITY_MODEL instead of hardcoded
    OpenRouter model name.
  • backend/main.py: Migrated from deprecated @app.on_event("startup") to the
    FastAPI lifespan context manager (sketched after this list). Prints the
    provider routing table on startup.
  • pyproject.toml: Added litellm>=1.60.0 dependency.
  • .env.example: New file documenting all supported provider API keys with
    usage instructions.
  • CLAUDE.md: Updated architecture docs to reflect multi-provider support.
  • backend/openrouter.py: Kept for reference but no longer imported.
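
For concreteness, a hedged sketch of the llm_client.py interface described
above; the real signatures, parameters, and return types in the PR may differ:

```python
import asyncio
import litellm

from config import resolve_model  # routing logic, sketched earlier


async def query_model(model: str, messages: list[dict], **kwargs) -> str:
    """One async completion via litellm, after key-based routing."""
    response = await litellm.acompletion(
        model=resolve_model(model), messages=messages, **kwargs
    )
    return response.choices[0].message.content


async def query_models_parallel(
    models: list[str], messages: list[dict]
) -> list[str]:
    """Send the same messages to several models concurrently."""
    return await asyncio.gather(*(query_model(m, messages) for m in models))
```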

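And a sketch of the lifespan migration in main.py. print_routing_info() is the
PR's own name; its import path and wiring here are assumptions:

```python
from contextlib import asynccontextmanager

from fastapi import FastAPI

from config import print_routing_info  # assumed import path


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Runs once at startup, replacing the deprecated @app.on_event("startup").
    print_routing_info()  # prints the provider routing table
    yield  # shutdown logic, if any, would go after the yield


app = FastAPI(lifespan=lifespan)
```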