inference-labs-mcp

Model Context Protocol (MCP) server for Inference Labs. Adds vendor-neutral LLM routing, model comparison, and live LLM pricing to Claude Desktop, Cursor, Windsurf, Continue, and any other MCP-compatible client.

License: Apache 2.0 MCP Python

What it does

After installing, Claude (or whatever MCP client you use) gets four new tools:

| Tool | Auth required | What it does |
| --- | --- | --- |
| get_pricing | none | Returns current per-token pricing for every major 2026 LLM (GPT-5, Claude, Gemini, Bedrock, Llama). Data source: /api/prices.json, CC BY 4.0. |
| recommend_model | none | Given a task and a priority (cost / quality / balanced / long-context), returns the top three models with monthly cost estimates. |
| route_request | INFERENCE_LABS_API_KEY | Routes one prompt through Inference Labs and returns the response, the model that was chosen, and the cost. |
| compare_models | INFERENCE_LABS_API_KEY | Runs the same prompt across N models and returns every response side-by-side. |

The first two work without an account, so you can try the server before signing up at inference-labs.com for the routing tools.
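The monthly cost estimates that recommend_model returns come down to simple per-token arithmetic. The sketch below shows one way such an estimate could be computed and ranked; it is not the server's implementation. The ModelPrice type, its field names, the model names, and every price are hypothetical placeholders, not values from /api/prices.json.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical price record; field names are assumptions, not the real schema."""
    name: str
    input_usd_per_million: float   # USD per 1M input tokens
    output_usd_per_million: float  # USD per 1M output tokens


def monthly_cost_usd(price, requests_per_month, input_tokens, output_tokens):
    """Estimate monthly spend given average token counts per request."""
    per_request_scaled = (
        input_tokens * price.input_usd_per_million
        + output_tokens * price.output_usd_per_million
    )
    # Prices are quoted per million tokens, so divide once at the end.
    return requests_per_month * per_request_scaled / 1_000_000


def cheapest(prices, requests_per_month, input_tokens, output_tokens, top_n=3):
    """Return the top_n models ordered by estimated monthly cost, lowest first."""
    ranked = sorted(
        prices,
        key=lambda p: monthly_cost_usd(
            p, requests_per_month, input_tokens, output_tokens
        ),
    )
    return ranked[:top_n]


# Placeholder prices for illustration only.
EXAMPLE_PRICES = [
    ModelPrice("model-a", 1.0, 4.0),
    ModelPrice("model-b", 0.25, 1.0),
    ModelPrice("model-c", 3.0, 15.0),
    ModelPrice("model-d", 0.5, 2.0),
]

if __name__ == "__main__":
    for p in cheapest(EXAMPLE_PRICES, 1_000_000, 500, 100):
        cost = monthly_cost_usd(p, 1_000_000, 500, 100)
        print(f"{p.name}: ${cost:,.2f}/month")
```

A real recommendation would also weigh quality and context length for the other priority settings; this sketch covers only the cost priority.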

Install

# Recommended: uv (no install, runs on demand)
uvx inference-labs-mcp

# Or pip
pip install inference-labs-mcp
inference-labs-mcp

Claude Desktop

Add this to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "inference-labs": {
      "command": "uvx",
      "args": ["inference-labs-mcp"],
      "env": {
        "INFERENCE_LABS_API_KEY": "il_live_..."
      }
    }
  }
}

Restart Claude Desktop. Type "ask Inference Labs for current LLM pricing" and Claude will call the MCP server.

The INFERENCE_LABS_API_KEY line is optional — without it, get_pricing and recommend_model still work; route_request and compare_models will return a friendly "set INFERENCE_LABS_API_KEY" message.
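If you already have other MCP servers configured, you may prefer to merge the entry into the existing file rather than paste over it. The following is a minimal sketch of that merge, assuming the config is plain JSON with a top-level mcpServers object as shown above. The helper names build_server_entry and merge_into_config are hypothetical, and the sketch assumes the env block can be left out entirely when no key is set.

```python
import json


def build_server_entry(api_key=None):
    """Build the inference-labs entry; the env block is added only if a key is given."""
    entry = {"command": "uvx", "args": ["inference-labs-mcp"]}
    if api_key:
        entry["env"] = {"INFERENCE_LABS_API_KEY": api_key}
    return entry


def merge_into_config(config, api_key=None):
    """Return a copy of an existing config dict with the inference-labs entry added.

    Other servers already listed under mcpServers are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["inference-labs"] = build_server_entry(api_key)
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # "other-server" stands in for whatever is already configured.
    existing = {"mcpServers": {"other-server": {"command": "example"}}}
    print(json.dumps(merge_into_config(existing), indent=2))
```

To apply it, load the config file with json.load, pass the result through merge_into_config, and write it back with json.dump. Keep a backup of the original file first.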

Cursor / Windsurf / Continue

Same config shape — these editors all read mcpServers from their respective settings files. See the MCP docs for client-specific paths.

Example prompts

Try these in any MCP-enabled client after installing:

  • "Use inference-labs to show me the cheapest 1M-context LLM."
  • "Recommend the best model for summarizing 10M support tickets a month, optimizing for cost."
  • "Use the inference-labs router with cost-first strategy to classify this email: ..."
  • "Run this prompt through GPT-5 and Claude Sonnet 4.5 and tell me which response is better."

Local dev

git clone https://github.com/bosslesss/inference-labs-mcp
cd inference-labs-mcp
pip install -e .
INFERENCE_LABS_API_KEY=il_live_... python -m inference_labs_mcp
# Speaks JSON-RPC over stdio. Use mcp-cli or your MCP client to interact.
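For a sense of what "JSON-RPC over stdio" means in practice, the sketch below builds the messages an MCP client would typically write to the server's stdin: an initialize request, the initialized notification, a tools/list request, and a tools/call for get_pricing. The method names and the one-message-per-line framing follow the MCP specification as I understand it. The protocol version string, the client name, and the empty arguments object for get_pricing are assumptions; check the tool's schema from tools/list for its actual parameters.

```python
import json


def request(msg_id, method, params=None):
    """Build a JSON-RPC 2.0 request (expects a response with the same id)."""
    msg = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def notification(method, params=None):
    """Build a JSON-RPC 2.0 notification (no id, so no response is expected)."""
    msg = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def frame(msg):
    """Serialize one message as a single newline-terminated line for stdio."""
    return json.dumps(msg, separators=(",", ":")) + "\n"


def session_messages():
    """Messages for a minimal session: handshake, list tools, call get_pricing."""
    return [
        request(1, "initialize", {
            "protocolVersion": "2024-11-05",  # assumed; use what your client supports
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        }),
        notification("notifications/initialized"),
        request(2, "tools/list"),
        # Empty arguments are an assumption about get_pricing's input schema.
        request(3, "tools/call", {"name": "get_pricing", "arguments": {}}),
    ]


if __name__ == "__main__":
    for message in session_messages():
        print(frame(message), end="")
```

A real client also has to read each response from the server's stdout before sending the next request, so in practice an MCP client library or mcp-cli is the easier route.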

License

Apache-2.0. See LICENSE.
