Skip to content

Latest commit

 

History

History
1110 lines (689 loc) · 70.6 KB

File metadata and controls

1110 lines (689 loc) · 70.6 KB

Changelog

[Unreleased]


[2.9.5] — 2026-03-22

Sprint: New OpenCode providers, embedding credentials fix, CLI masked key bug, CACHE_TAG_PATTERN fix.

🐛 Bug Fixes

  • CLI tools save masked API key to config filesclaude-settings, cline-settings, and openclaw-settings POST routes now accept a keyId param and resolve the real API key from DB before writing to disk. ClaudeToolCard updated to send keyId instead of the masked display string. Fixes #523, #526.
  • Custom embedding providers: No credentials error/v1/embeddings now tracks credentialsProviderId separately from the routing prefix, so credentials are fetched from the matching provider node ID rather than the public prefix string. Fixes a regression where google/gemini-embedding-001 and similar custom-provider models would always fail with a credentials error. Fixes #532-related. (PR #528 by @jacob2826)
  • Context cache protection regex misses \n prefixCACHE_TAG_PATTERN in comboAgentMiddleware.ts updated to match both literal \n (backslash-n) and actual newline U+000A that combo.ts streaming injects around the <omniModel> tag after fix #515. Fixes #531.

✨ New Providers

  • OpenCode Zen — Free tier gateway at opencode.ai/zen/v1 with 3 models: minimax-m2.5-free, big-pickle, gpt-5-nano
  • OpenCode Go — Subscription service at opencode.ai/zen/go/v1 with 4 models: glm-5, kimi-k2.5, minimax-m2.7 (Claude format), minimax-m2.5 (Claude format)
  • Both providers use the new OpencodeExecutor which routes dynamically to /chat/completions, /messages, /responses, or /models/{model}:generateContent based on the requested model. (PR #530 by @kang-heewon)

[2.9.4] — 2026-03-21

Sprint: Bug fixes — preserve Codex prompt cache key, fix tagContent JSON escaping, sync expired token status to DB.

🐛 Bug Fixes

  • fix(translator): Preserve prompt_cache_key in Responses API → Chat Completions translation (#517) — The field is a cache-affinity signal used by Codex; stripping it was preventing prompt cache hits. Fixed in openai-responses.ts and responsesApiHelper.ts.

  • fix(combo): Escape \n in tagContent so injected JSON string is valid (#515) — Template literal newlines (U+000A) are not allowed unescaped inside JSON string values. Replaced with \\n literal sequences in open-sse/services/combo.ts.

  • fix(usage): Sync expired token status back to DB on live auth failure (#491) — When the Limits & Quotas live check returns 401/403, the connection testStatus is now updated to "expired" in the database so the Providers page reflects the same degraded state. Fixed in src/app/api/usage/[connectionId]/route.ts.


[2.9.3] — 2026-03-21

Sprint: Add 5 new free AI providers — LongCat, Pollinations, Cloudflare AI, Scaleway, AI/ML API.

✨ New Providers

  • feat(providers/longcat): Add LongCat AI (lc/) — 50M tokens/day free (Flash-Lite) + 500K/day (Chat/Thinking) during public beta. OpenAI-compatible, standard Bearer auth.
  • feat(providers/pollinations): Add Pollinations AI (pol/) — no API key required. Proxies GPT-5, Claude, Gemini, DeepSeek V3, Llama 4 (1 req/15s free). Custom executor handles optional auth.
  • feat(providers/cloudflare-ai): Add Cloudflare Workers AI (cf/) — 10K Neurons/day free (~150 LLM responses or 500s Whisper audio). 50+ models on global edge. Custom executor builds dynamic URL with accountId from credentials.
  • feat(providers/scaleway): Add Scaleway Generative APIs (scw/) — 1M free tokens for new accounts. EU/GDPR compliant (Paris). Qwen3 235B, Llama 3.1 70B, Mistral Small 3.2.
  • feat(providers/aimlapi): Add AI/ML API (aiml/) — $0.025/day free credit, 200+ models (GPT-4o, Claude, Gemini, Llama) via single aggregator endpoint.

🔄 Provider Updates

  • feat(providers/together): Add hasFree: true + 3 permanently free model IDs: Llama-3.3-70B-Instruct-Turbo-Free, Llama-Vision-Free, DeepSeek-R1-Distill-Llama-70B-Free
  • feat(providers/gemini): Add hasFree: true + freeNote (1,500 req/day, no credit card needed, aistudio.google.com)
  • chore(providers/gemini): Rename display name to Gemini (Google AI Studio) for clarity

⚙️ Infrastructure

  • feat(executors/pollinations): New PollinationsExecutor — omits Authorization header when no API key provided
  • feat(executors/cloudflare-ai): New CloudflareAIExecutor — dynamic URL construction requires accountId in provider credentials
  • feat(executors): Register pollinations, pol, cloudflare-ai, cf executor mappings

📝 Documentation

  • docs(readme): Expanded free combo stack to 11 providers ($0 forever)
  • docs(readme): Added 4 new free provider sections (LongCat, Pollinations, Cloudflare AI, Scaleway) with model tables
  • docs(readme): Updated pricing table with 4 new free tier rows
  • docs(i18n/pt-BR): Updated pricing table + added LongCat/Pollinations/Cloudflare AI/Scaleway sections in Portuguese
  • docs(new-features/ai): 10 task spec files + master implementation plan in docs/new-features/ai/

🧪 Tests

  • Test suite: 821 tests, 0 failures (unchanged)

[2.9.2] — 2026-03-21

Sprint: Fix media transcription (Deepgram/HuggingFace Content-Type, language detection) and TTS error display.

🐛 Bug Fixes

  • fix(transcription): Deepgram and HuggingFace audio transcription now correctly map video/mp4audio/mp4 and other media MIME types via new resolveAudioContentType() helper. Previously, uploading .mp4 files consistently returned "No speech detected" because Deepgram was receiving Content-Type: video/mp4.
  • fix(transcription): Added detect_language=true to Deepgram requests — auto-detects audio language (Portuguese, Spanish, etc.) instead of defaulting to English. Fixes non-English transcriptions returning empty or garbage results.
  • fix(transcription): Added punctuate=true to Deepgram requests for higher-quality transcription output with correct punctuation.
  • fix(tts): [object Object] error display in Text-to-Speech responses fixed in both audioSpeech.ts and audioTranscription.ts. The upstreamErrorResponse() function now correctly extracts nested string messages from providers like ElevenLabs that return { error: { message: "...", status_code: 401 } } instead of a flat error string.

🧪 Tests

  • Test suite: 821 tests, 0 failures (unchanged)

Triaged Issues

  • #508 — Tool call format regression: requested proxy logs and provider chain info (needs-info)
  • #510 — Windows CLI healthcheck path: requested shell/Node version info (needs-info)
  • #485 — Kiro MCP tool calls: closed as external Kiro issue (not OmniRoute)
  • #442 — Baseten /models endpoint: closed (documented manual workaround)
  • #464 — Key provisioning API: acknowledged as roadmap item

[2.9.1] — 2026-03-21

Sprint: Fix SSE omniModel data loss, merge per-protocol model compatibility.

Bug Fixes

  • #511 — Critical: <omniModel> tag was sent after finish_reason:stop in SSE streams, causing data loss. Tag is now injected into the first non-empty content chunk, guaranteeing delivery before SDKs close the connection.

Merged PRs

  • PR #512 (@zhangqiang8vip): Per-protocol model compatibility — normalizeToolCallId and preserveOpenAIDeveloperRole can now be configured per client protocol (OpenAI, Claude, Responses API). New compatByProtocol field in model config with Zod validation.

Triaged Issues

  • #510 — Windows CLI healthcheck_failed: requested PATH/version info
  • #509 — Turbopack Electron regression: upstream Next.js bug, documented workarounds
  • #508 — macOS black screen: suggested --disable-gpu workaround

[2.9.0] — 2026-03-20

Sprint: Cross-platform machineId fix, per-API-key rate limits, streaming context cache, Alibaba DashScope, search analytics, ZWS v5, and 8 issues closed.

✨ New Features

  • feat(search): Search Analytics tab in /dashboard/analytics — provider breakdown, cache hit rate, cost tracking. New API: GET /api/v1/search/analytics (#feat/search-provider-routing)
  • feat(provider): Alibaba Cloud DashScope added with custom endpoint path validation — configurable chatPath and modelsPath per node (#feat/custom-endpoint-paths)
  • feat(api): Per-API-key request-count limits — max_requests_per_day and max_requests_per_minute columns with in-memory sliding-window enforcement returning HTTP 429 (#452)
  • feat(dev): ZWS v5 — HMR leak fix (485 DB connections → 1), memory 2.4GB → 195MB, globalThis singletons, Edge Runtime warning fix (@zhangqiang8vip)

🐛 Bug Fixes

  • fix(#506): Cross-platform machineIdgetMachineIdRaw() rewritten with try/catch waterfall (Windows REG.exe → macOS ioreg → Linux file read → hostname → os.hostname()). Eliminates process.platform branching that Next.js bundler dead-code-eliminated, fixing 'head' is not recognized on Windows. Also fixes #466.
  • fix(#493): Custom provider model naming — removed incorrect prefix stripping in DefaultExecutor.transformRequest() that mangled org-scoped model IDs like zai-org/GLM-5-FP8.
  • fix(#490): Streaming + context cache protection — TransformStream intercepts SSE to inject <omniModel> tag before [DONE] marker, enabling context cache protection for streaming responses.
  • fix(#458): Combo schema validation — system_message, tool_filter_regex, context_cache_protection fields now pass Zod validation on save.
  • fix(#487): KIRO MITM card cleanup — removed ZWS_README, generified AntigravityToolCard to use dynamic tool metadata.

🧪 Tests

  • Added Anthropic-format tools filter unit tests (PR #397) — 8 regression tests for tool.name without .function wrapper
  • Test suite: 821 tests, 0 failures (up from 813)

📋 Issues Closed (8)

  • #506 — Windows machineId head not recognized (fixed)
  • #493 — Custom provider model naming (fixed)
  • #490 — Streaming context cache (fixed)
  • #452 — Per-API-key request limits (implemented)
  • #466 — Windows login failure (same root cause as #506)
  • #504 — MITM inactive (expected behavior)
  • #462 — Gemini CLI PSA (resolved)
  • #434 — Electron app crash (duplicate of #402)

[2.8.9] — 2026-03-20

Sprint: Merge community PRs, fix KIRO MITM card, dependency updates.

Merged PRs

  • PR #498 (@Sajid11194): Fix Windows machine ID crash (undefined\REG.exe). Replaces node-machine-id with native OS registry queries. Closes #486.
  • PR #497 (@zhangqiang8vip): Fix dev-mode HMR resource leaks — 485 leaked DB connections → 1, memory 2.4GB → 195MB. globalThis singletons, Edge Runtime warning fix, Windows test stability. (+1168/-338 across 22 files)
  • PRs #499-503 (Dependabot): GitHub Actions updates — docker/build-push-action@7, actions/checkout@6, peter-evans/dockerhub-description@5, docker/setup-qemu-action@4, docker/login-action@4.

Bug Fixes

  • #505 — KIRO MITM card now displays tool-specific instructions (api.anthropic.com) instead of Antigravity-specific text.
  • #504 — Responded with UX clarification (MITM "Inactive" is expected behavior when proxy is not running).

[2.8.8] — 2026-03-20

Sprint: Fix OAuth batch test crash, add "Test All" button to individual provider pages.

Bug Fixes

  • OAuth batch test crash (ERR_CONNECTION_REFUSED): Replaced sequential for-loop with 5-connection concurrency limit + 30s per-connection timeout via Promise.race() + Promise.allSettled(). Prevents server crash when testing large OAuth provider groups (~30+ connections).

Features

  • "Test All" button on provider pages: Individual provider pages (e.g., /providers/codex) now show a "Test All" button in the Connections header when there are 2+ connections. Uses POST /api/providers/test-batch with {mode: "provider", providerId}. Results displayed in a modal with pass/fail summary and per-connection diagnosis.

[2.8.7] — 2026-03-20

Sprint: Merge PR #495 (Bottleneck 429 drop), fix #496 (custom embedding providers), triage features.

Bug Fixes

  • Bottleneck 429 infinite wait (PR #495 by @xandr0s): On 429, limiter.stop({ dropWaitingJobs: true }) immediately fails all queued requests so upstream callers can trigger fallback. Limiter is deleted from Map so next request creates a fresh instance.
  • Custom embedding models unresolvable (#496): POST /v1/embeddings now resolves custom embedding models from ALL provider_nodes (not just localhost). Enables models like google/gemini-embedding-001 added via dashboard.

Issues Responded

  • #452 — Per-API-key request-count limits (acknowledged, on roadmap)
  • #464 — Auto-issue API keys with provider/account limits (needs more detail)
  • #488 — Auto-update model lists (acknowledged, on roadmap)
  • #496 — Custom embedding provider resolution (fixed)

[2.8.6] — 2026-03-20

Sprint: Merge PR #494 (MiniMax role fix), fix KIRO MITM dashboard, triage 8 issues.

Features

  • MiniMax developer→system role fix (PR #494 by @zhangqiang8vip): Per-model preserveDeveloperRole toggle. Adds "Compatibility" UI in providers page. Fixes 422 "role param error" for MiniMax and similar gateways.
  • roleNormalizer: normalizeDeveloperRole() now accepts preserveDeveloperRole parameter with tri-state behavior (undefined=keep, true=keep, false=convert).
  • DB: New getModelPreserveOpenAIDeveloperRole() and mergeModelCompatOverride() in models.ts.

Bug Fixes

  • KIRO MITM dashboard (#481/#487): CLIToolsPageClient now routes any configType: "mitm" tool to AntigravityToolCard (MITM Start/Stop controls). Previously only Antigravity was hardcoded.
  • AntigravityToolCard generic: Uses tool.image, tool.description, tool.id instead of hardcoded Antigravity values. Guards against missing defaultModels.

Cleanup

  • Removed ZWS_README_V2.md (development-only docs from PR #494).

Issues Triaged (8)

  • #487 — Closed (KIRO MITM fixed in this release)
  • #486 — needs-info (Windows REG.exe PATH issue)
  • #489 — needs-info (Antigravity projectId missing, OAuth reconnect needed)
  • #492 — needs-info (missing app/server.js on mise-managed Node)
  • #490 — Acknowledged (streaming + context cache blocking, fix planned)
  • #491 — Acknowledged (Codex auth state inconsistency)
  • #493 — Acknowledged (Modal provider model name prefix, workaround provided)
  • #488 — Feature request backlog (auto-update model lists)

[2.8.5] — 2026-03-19

Sprint: Fix zombie SSE streams, context cache first-turn, KIRO MITM, and triage 5 external issues.

Bug Fixes

  • Zombie SSE Streams (#473): Reduce STREAM_IDLE_TIMEOUT_MS from 300s → 120s for faster combo fallback when providers hang mid-stream. Configurable via env var.
  • Context Cache Tag (#474): Fix injectModelTag() to handle first-turn requests (no assistant messages) — context cache protection now works from the very first response.
  • KIRO MITM (#481): Change KIRO configType from guidemitm so the dashboard renders MITM Start/Stop controls.
  • E2E Test (CI): Fix providers-bailian-coding-plan.spec.ts — dismiss pre-existing modal overlay before clicking Add API Key button.

Closed Issues

  • #473 — Zombie SSE streams bypass combo fallback
  • #474 — Context cache <omniModel> tag missing on first turn
  • #481 — MITM for KIRO not activatable from dashboard
  • #468 — Gemini CLI remote server (superseded by #462 deprecation)
  • #438 — Claude unable to write files (external CLI issue)
  • #439 — AppImage doesn't work (documented libfuse2 workaround)
  • #402 — ARM64 DMG "damaged" (documented xattr -cr workaround)
  • #460 — CLI not runnable on Windows (documented PATH fix)

[2.8.4] — 2026-03-19

Sprint: Gemini CLI deprecation, VM guide i18n fix, dependabot security fix, provider schema expansion.

Features

  • Gemini CLI Deprecation (#462): Mark gemini-cli provider as deprecated with warning — Google restricts third-party OAuth usage from March 2026
  • Provider Schema (#462): Expand Zod validation with deprecated, deprecationReason, hasFree, freeNote, authHint, apiHint optional fields

Bug Fixes

  • VM Guide i18n (#471): Add VM_DEPLOYMENT_GUIDE.md to i18n translation pipeline, regenerate all 30 locale translations from English source (were stuck in Portuguese)

Security

  • deps: Bump flatted 3.3.3 → 3.4.2 — fixes CWE-1321 prototype pollution (#484, @dependabot)

Closed Issues

  • #472 — Model Aliases regression (fixed in v2.8.2)
  • #471 — VM guide translations broken
  • #483 — Trailing data: null after [DONE] (fixed in v2.8.3)

Merged PRs

  • #484 — deps: bump flatted from 3.3.3 to 3.4.2 (@dependabot)

[2.8.3] — 2026-03-19

Sprint: Czech i18n, SSE protocol fix, VM guide translation.

Features

  • Czech Language (#482): Full Czech (cs) i18n — 22 docs, 2606 UI strings, language switcher updates (@zen0bit)
  • VM Deployment Guide: Translated from Portuguese to English as the source document (@zen0bit)

Bug Fixes

  • SSE Protocol (#483): Stop sending trailing data: null after [DONE] signal — fixes AI_TypeValidationError in strict AI SDK clients (Zod-based validators)

Merged PRs

  • #482 — Add Czech language + Fix VM_DEPLOYMENT_GUIDE.md English source (@zen0bit)

[2.8.2] — 2026-03-19

Sprint: 2 merged PRs, model aliases routing fix, log export, and issue triage.

Features

  • Log Export: New Export button on /dashboard/logs with time range dropdown (1h, 6h, 12h, 24h). Downloads JSON of request/proxy/call logs via /api/logs/export API (#user-request)

Bug Fixes

  • Model Aliases Routing (#472): Settings → Model Aliases now correctly affect provider routing, not just format detection. Previously resolveModelAlias() output was only used for getModelTargetFormat() but the original model ID was sent to the provider
  • Stream Flush Usage (#480): Usage data from the last SSE event in the buffer is now correctly extracted during stream flush (merged from @prakersh)

Merged PRs

  • #480 — Extract usage from remaining buffer in flush handler (@prakersh)
  • #479 — Add missing Codex 5.3/5.4 and Anthropic model ID pricing entries (@prakersh)

[2.8.1] — 2026-03-19

Sprint: Five community PRs — streaming call log fixes, Kiro compatibility, cache token analytics, Chinese translation, and configurable tool call IDs.

✨ Features

  • feat(logs): Call log response content now correctly accumulated from raw provider chunks (OpenAI/Claude/Gemini) before translation, fixing empty response payloads in streaming mode (#470, @zhangqiang8vip)
  • feat(providers): Per-model configurable 9-char tool call ID normalization (Mistral-style) — only models with the option enabled get truncated IDs (#470)
  • feat(api): Key PATCH API expanded to support allowedConnections, name, autoResolve, isActive, and accessSchedule fields (#470)
  • feat(dashboard): Response-first layout in request log detail UI (#470)
  • feat(i18n): Improved Chinese (zh-CN) translation — complete retranslation (#475, @only4copilot)

🐛 Bug Fixes

  • fix(kiro): Strip injected model field from request body — Kiro API rejects unknown top-level fields (#478, @prakersh)
  • fix(usage): Include cache read + cache creation tokens in usage history input totals for accurate analytics (#477, @prakersh)
  • fix(callLogs): Support Claude format usage fields (input_tokens/output_tokens) alongside OpenAI format, include all cache token variants (#476, @prakersh)

[2.8.0] — 2026-03-19

Sprint: Bailian Coding Plan provider with editable base URLs, plus community contributions for Alibaba Cloud and Kimi Coding.

✨ Features

  • feat(providers): Added Bailian Coding Plan (bailian-coding-plan) — Alibaba Model Studio with Anthropic-compatible API. Static catalog of 8 models including Qwen3.5 Plus, Qwen3 Coder, MiniMax M2.5, GLM 5, and Kimi K2.5. Includes custom auth validation (400=valid, 401/403=invalid) (#467, @Mind-Dragon)
  • feat(admin): Editable default URL in Provider Admin create/edit flows — users can configure custom base URLs per connection. Persisted in providerSpecificData.baseUrl with Zod schema validation rejecting non-http(s) schemes (#467)

🧪 Tests

  • Added 30+ unit tests and 2 e2e scenarios for Bailian Coding Plan provider covering auth validation, schema hardening, route-level behavior, and cross-layer integration

[2.7.10] — 2026-03-19

Sprint: Two new community-contributed providers (Alibaba Cloud Coding, Kimi Coding API-key) and Docker pino fix.

✨ Features

  • feat(providers): Added Alibaba Cloud Coding Plan support with two OpenAI-compatible endpoints — alicode (China) and alicode-intl (International), each with 8 models (#465, @dtk1985)
  • feat(providers): Added dedicated kimi-coding-apikey provider path — API-key-based Kimi Coding access is no longer forced through OAuth-only kimi-coding route. Includes registry, constants, models API, config, and validation test (#463, @Mind-Dragon)

🐛 Bug Fixes

  • fix(docker): Added missing split2 dependency to Docker image — pino-abstract-transport requires it at runtime but it was not being copied into the standalone container, causing Cannot find module 'split2' crashes (#459)

[2.7.9] — 2026-03-18

Sprint: Codex responses subpath passthrough natively supported, Windows MITM crash fixed, and Combos agent schemas adjusted.

✨ Features

  • feat(codex): Native responses subpath passthrough for Codex — natively routes POST /v1/responses/compact to Codex upstream, maintaining Claude Code compatibility without stripping the /compact suffix (#457)

🐛 Bug Fixes

  • fix(combos): Zod schemas (updateComboSchema and createComboSchema) now include system_message, tool_filter_regex, and context_cache_protection. Fixes bug where agent-specific settings created via the dashboard were silently discarded by the backend validation layer (#458)
  • fix(mitm): Kiro MITM profile crash on Windows fixed — node-machine-id failed due to missing REG.exe env, and the fallback threw a fatal crypto is not defined error. Fallback now safely and correctly imports crypto (#456)

[2.7.8] — 2026-03-18

Sprint: Budget save bug + combo agent features UI + omniModel tag security fix.

🐛 Bug Fixes

  • fix(budget): "Save Limits" no longer returns 422 — warningThreshold is now correctly sent as fraction (0–1) instead of percentage (0–100) (#451)
  • fix(combos): <omniModel> internal cache tag is now stripped before forwarding requests to providers, preventing cache session breaks (#454)

✨ Features

  • feat(combos): Agent Features section added to combo create/edit modal — expose system_message override, tool_filter_regex, and context_cache_protection directly from the dashboard (#454)

[2.7.7] — 2026-03-18

Sprint: Docker pino crash, Codex CLI responses worker fix, package-lock sync.

🐛 Bug Fixes

  • fix(docker): pino-abstract-transport and pino-pretty now explicitly copied in Docker runner stage — Next.js standalone trace misses these peer deps, causing Cannot find module pino-abstract-transport crash on startup (#449)
  • fix(responses): Remove initTranslators() from /v1/responses route — was crashing Next.js worker with the worker has exited uncaughtException on Codex CLI requests (#450)

🔧 Maintenance

  • chore(deps): package-lock.json now committed on every version bump to ensure Docker npm ci uses exact dependency versions

[2.7.5] — 2026-03-18

Sprint: UX improvements and Windows CLI healthcheck fix.

🐛 Bug Fixes

  • fix(ux): Show default password hint on login page — new users now see "Default password: 123456" below the password input (#437)
  • fix(cli): Claude CLI and other npm-installed tools now correctly detected as runnable on Windows — spawn uses shell:true to resolve .cmd wrappers via PATHEXT (#447)

[2.7.4] — 2026-03-18

Sprint: Search Tools dashboard, i18n fixes, Copilot limits, Serper validation fix.

🚀 Features

  • feat(search): Add Search Playground (10th endpoint), Search Tools page with Compare Providers/Rerank Pipeline/Search History, local rerank routing, auth guards on search API (#443 by @Regis-RCR)
    • New route: /dashboard/search-tools
    • Sidebar entry under Debug section
    • GET /api/search/providers and GET /api/search/stats with auth guards
    • Local provider_nodes routing for /v1/rerank
    • 30+ i18n keys in search namespace

🐛 Bug Fixes

  • fix(search): Fix Brave news normalizer (was returning 0 results), enforce max_results truncation post-normalization, fix Endpoints page fetch URL (#443 by @Regis-RCR)
  • fix(analytics): Localize analytics day/date labels — replace hardcoded Portuguese strings with Intl.DateTimeFormat(locale) (#444 by @hijak)
  • fix(copilot): Correct GitHub Copilot account type display, filter misleading unlimited quota rows from limits dashboard (#445 by @hijak)
  • fix(providers): Stop rejecting valid Serper API keys — treat non-4xx responses as valid authentication (#446 by @hijak)

[2.7.3] — 2026-03-18

Sprint: Codex direct API quota fallback fix.

🐛 Bug Fixes

  • fix(codex): Block weekly-exhausted accounts in direct API fallback (#440)
    • resolveQuotaWindow() prefix matching: "weekly" now matches "weekly (7d)" cache keys
    • applyCodexWindowPolicy() enforces useWeekly/use5h toggles correctly
    • 4 new regression tests (766 total)

[2.7.2] — 2026-03-18

Sprint: Light mode UI contrast fixes.

🐛 Bug Fixes

  • fix(logs): Fix light mode contrast in request logs filter buttons and combo badge (#378)
    • Error/Success/Combo filter buttons now readable in light mode
    • Combo row badge uses stronger violet in light mode

[2.7.1] — 2026-03-17

Sprint: Unified web search routing (POST /v1/search) with 5 providers + Next.js 16.1.7 security fixes (6 CVEs).

✨ New Features

  • feat(search): Unified web search routing — POST /v1/search with 5 providers (Serper, Brave, Perplexity, Exa, Tavily)
    • Auto-failover across providers, 6,500+ free searches/month
    • In-memory cache with request coalescing (configurable TTL)
    • Dashboard: Search Analytics tab in /dashboard/analytics with provider breakdown, cache hit rate, cost tracking
    • New API: GET /api/v1/search/analytics for search request statistics
    • DB migration: request_type column on call_logs for non-chat request tracking
    • Zod validation (v1SearchSchema), auth-gated, cost recorded via recordCost()

🔒 Security

  • deps: Next.js 16.1.6 → 16.1.7 — fixes 6 CVEs:
    • Critical: CVE-2026-29057 (HTTP request smuggling via http-proxy)
    • High: CVE-2026-27977, CVE-2026-27978 (WebSocket + Server Actions)
    • Medium: CVE-2026-27979, CVE-2026-27980, CVE-2026-jcc7

📁 New Files

File Purpose
open-sse/handlers/search.ts Search handler with 5-provider routing
open-sse/config/searchRegistry.ts Provider registry (auth, cost, quota, TTL)
open-sse/services/searchCache.ts In-memory cache with request coalescing
src/app/api/v1/search/route.ts Next.js route (POST + GET)
src/app/api/v1/search/analytics/route.ts Search stats API
src/app/(dashboard)/dashboard/analytics/SearchAnalyticsTab.tsx Analytics dashboard tab
src/lib/db/migrations/007_search_request_type.sql DB migration
tests/unit/search-registry.test.mjs 277 lines of unit tests

[2.7.0] — 2026-03-17

Sprint: ClawRouter-inspired features — toolCalling flag, multilingual intent detection, benchmark-driven fallback, request deduplication, pluggable RouterStrategy, Grok-4 Fast + GLM-5 + MiniMax M2.5 + Kimi K2.5 pricing.

✨ New Models & Pricing

  • feat(pricing): xAI Grok-4 Fast — $0.20/$0.50 per 1M tokens, 1143ms p50 latency, tool calling supported
  • feat(pricing): xAI Grok-4 (standard) — $0.20/$1.50 per 1M tokens, reasoning flagship
  • feat(pricing): GLM-5 via Z.AI — $0.5/1M, 128K output context
  • feat(pricing): MiniMax M2.5 — $0.30/1M input, reasoning + agentic tasks
  • feat(pricing): DeepSeek V3.2 — updated pricing $0.27/$1.10 per 1M
  • feat(pricing): Kimi K2.5 via Moonshot API — direct Moonshot API access
  • feat(providers): Z.AI provider added (zai alias) — GLM-5 family with 128K output

🧠 Routing Intelligence

  • feat(registry): toolCalling flag per model in provider registry — combos can now prefer/require tool-calling capable models
  • feat(scoring): Multilingual intent detection for AutoCombo scoring — PT/ZH/ES/AR script/language patterns influence model selection per request context
  • feat(fallback): Benchmark-driven fallback chains — real latency data (p50 from comboMetrics) used to re-order fallback priority dynamically
  • feat(dedup): Request deduplication via content-hash — 5-second idempotency window prevents duplicate provider calls from retrying clients
  • feat(router): Pluggable RouterStrategy interface in autoCombo/routerStrategy.ts — custom routing logic can be injected without modifying core

🔧 MCP Server Improvements

  • feat(mcp): 2 new advanced tool schemas: omniroute_get_provider_metrics (p50/p95/p99 per provider) and omniroute_explain_route (routing decision explanation)
  • feat(mcp): MCP tool auth scopes updated — metrics:read scope added for provider metrics tools
  • feat(mcp): omniroute_best_combo_for_task now accepts languageHint parameter for multilingual routing

📊 Observability

  • feat(metrics): comboMetrics.ts extended with real-time latency percentile tracking per provider/account
  • feat(health): Health API (/api/monitoring/health) now returns per-provider p50Latency and errorRate fields
  • feat(usage): Usage history migration for per-model latency tracking

🗄️ DB Migrations

  • feat(migrations): New column latency_p50 in combo_metrics table — zero-breaking, safe for existing users

🐛 Bug Fixes / Closures

  • close(#411): better-sqlite3 hashed module resolution on Windows — fixed in v2.6.10 (f02c5b5)
  • close(#409): GitHub Copilot chat completions fail with Claude models when files attached — fixed in v2.6.9 (838f1d6)
  • close(#405): Duplicate of #411 — resolved

[2.6.10] — 2026-03-17

Windows fix: better-sqlite3 prebuilt download without node-gyp/Python/MSVC (#426).

🐛 Bug Fixes

  • fix(install/#426): On Windows, npm install -g omniroute used to fail with better_sqlite3.node is not a valid Win32 application because the bundled native binary was compiled for Linux. Adds Strategy 1.5 to scripts/postinstall.mjs: uses @mapbox/node-pre-gyp install --fallback-to-build=false (bundled within better-sqlite3) to download the correct prebuilt binary for the current OS/arch without requiring any build tools (no node-gyp, no Python, no MSVC). Falls back to npm rebuild only if the download fails. Adds platform-specific error messages with clear manual fix instructions.

[2.6.9] — 2026-03-17

CI fixes (t11 any-budget), bug fix #409 (file attachments via Copilot+Claude), release workflow correction.

🐛 Bug Fixes

  • fix(ci): Remove word "any" from comments in openai-responses.ts and chatCore.ts that were failing the t11 \bany\b budget check (false positive from regex counting comments)
  • fix(chatCore): Normalize unsupported content part types before forwarding to providers (#409 — Cursor sends {type:"file"} when .md files are attached; Copilot and other OpenAI-compat providers reject with "type has to be either 'image_url' or 'text'"; fix converts file/document blocks to text and drops unknown types)

🔧 Workflow

  • chore(generate-release): Add ATOMIC COMMIT RULE — version bump (npm version patch) MUST happen before committing feature files to ensure tag always points to a commit containing all version changes together

[2.6.8] — 2026-03-17

Sprint: Combo as Agent (system prompt + tool filter), Context Caching Protection, Auto-Update, Detailed Logs, MITM Kiro IDE.

🗄️ DB Migrations (zero-breaking — safe for existing users)

  • 005_combo_agent_fields.sql: ALTER TABLE combos ADD COLUMN system_message TEXT DEFAULT NULL, tool_filter_regex TEXT DEFAULT NULL, context_cache_protection INTEGER DEFAULT 0
  • 006_detailed_request_logs.sql: New request_detail_logs table with 500-entry ring-buffer trigger, opt-in via settings toggle

✨ Features

  • feat(combo): System Message Override per Combo (#399 — system_message field replaces or injects system prompt before forwarding to provider)
  • feat(combo): Tool Filter Regex per Combo (#399 — tool_filter_regex keeps only tools matching pattern; supports OpenAI + Anthropic formats)
  • feat(combo): Context Caching Protection (#401 — context_cache_protection tags responses with <omniModel>provider/model</omniModel> and pins model for session continuity)
  • feat(settings): Auto-Update via Settings (#320 — GET /api/system/version + POST /api/system/update — checks npm registry and updates in background with pm2 restart)
  • feat(logs): Detailed Request Logs (#378 — captures full pipeline bodies at 4 stages: client request, translated request, provider response, client response — opt-in toggle, 64KB trim, 500-entry ring-buffer)
  • feat(mitm): MITM Kiro IDE profile (#336 — src/mitm/targets/kiro.ts targets api.anthropic.com, reuses existing MITM infrastructure)

[2.6.7] — 2026-03-17

Sprint: SSE improvements, local provider_nodes extensions, proxy registry, Claude passthrough fixes.

✨ Features

  • feat(health): Background health check for local provider_nodes with exponential backoff (30s→300s) and Promise.allSettled to avoid blocking (#423, @Regis-RCR)
  • feat(embeddings): Route /v1/embeddings to local provider_nodesbuildDynamicEmbeddingProvider() with hostname validation (#422, @Regis-RCR)
  • feat(audio): Route TTS/STT to local provider_nodesbuildDynamicAudioProvider() with SSRF protection (#416, @Regis-RCR)
  • feat(proxy): Proxy registry, management APIs, and quota-limit generalization (#429, @Regis-RCR)

🐛 Bug Fixes

  • fix(sse): Strip Claude-specific fields (metadata, anthropic_version) when target is OpenAI-compat (#421, @prakersh)
  • fix(sse): Extract Claude SSE usage (input_tokens, output_tokens, cache tokens) in passthrough stream mode (#420, @prakersh)
  • fix(sse): Generate fallback call_id for tool calls with missing/empty IDs (#419, @prakersh)
  • fix(sse): Claude-to-Claude passthrough — forward body completely untouched, no re-translation (#418, @prakersh)
  • fix(sse): Filter orphaned tool_result items after Claude Code context compaction to avoid 400 errors (#417, @prakersh)
  • fix(sse): Skip empty-name tool calls in Responses API translator to prevent placeholder_tool infinite loops (#415, @prakersh)
  • fix(sse): Strip empty text content blocks before translation (#427, @prakersh)
  • fix(api): Add refreshable: true to Claude OAuth test config (#428, @prakersh)

📦 Dependencies

  • Bump vitest, @vitest/* and related devDependencies (#414, @dependabot)

[2.6.6] — 2026-03-17

Hotfix: Turbopack/Docker compatibility — remove node: protocol from all src/ imports.

🐛 Bug Fixes

  • fix(build): Removed node: protocol prefix from import statements in 17 files under src/. The node:fs, node:path, node:url, node:os etc. imports caused Ecmascript file had an error on Turbopack builds (Next.js 15 Docker) and on upgrades from older npm global installs. Affected files: migrationRunner.ts, core.ts, backup.ts, prompts.ts, dataPaths.ts, and 12 others in src/app/api/ and src/lib/.
  • chore(workflow): Updated generate-release.md to make Docker Hub sync and dual-VPS deploy mandatory steps in every release.

[2.6.5] — 2026-03-17

Sprint: reasoning model param filtering, local provider 404 fix, Kilo Gateway provider, dependency bumps.

✨ New Features

  • feat(api): Added Kilo Gateway (api.kilo.ai) as a new API Key provider (alias kg) — 335+ models, 6 free models, 3 auto-routing models (kilo-auto/frontier, kilo-auto/balanced, kilo-auto/free). Passthrough models supported via /api/gateway/models endpoint. (PR #408 by @Regis-RCR)

🐛 Bug Fixes

  • fix(sse): Strip unsupported parameters for reasoning models (o1, o1-mini, o1-pro, o3, o3-mini). Models in the o1/o3 family reject temperature, top_p, frequency_penalty, presence_penalty, logprobs, top_logprobs, and n with HTTP 400. Parameters are now stripped at the chatCore layer before forwarding. Uses a declarative unsupportedParams field per model and a precomputed O(1) Map for lookup. (PR #412 by @Regis-RCR)
  • fix(sse): Local provider 404 now results in a model-only lockout (5 seconds) instead of a connection-level lockout (2 minutes). When a local inference backend (Ollama, LM Studio, oMLX) returns 404 for an unknown model, the connection remains active and other models continue working immediately. Also fixes a pre-existing bug where model was not passed to markAccountUnavailable(). Local providers detected via hostname (localhost, 127.0.0.1, ::1, extensible via LOCAL_HOSTNAMES env var). (PR #410 by @Regis-RCR)

📦 Dependencies

  • better-sqlite3 12.6.2 → 12.8.0
  • undici 7.24.2 → 7.24.4
  • https-proxy-agent 7 → 8
  • agent-base 7 → 8

[2.6.4] — 2026-03-17

🐛 Bug Fixes

  • fix(providers): Removed non-existent model names across 5 providers:
    • gemini / gemini-cli: removed gemini-3.1-pro/flash and gemini-3-*-preview (don't exist in Google API v1beta); replaced with gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash, gemini-1.5-pro/flash
    • antigravity: removed gemini-3.1-pro-high/low and gemini-3-flash (invalid internal aliases); replaced with real 2.x models
    • github (Copilot): removed gemini-3-flash-preview and gemini-3-pro-preview; replaced with gemini-2.5-flash
    • nvidia: corrected nvidia/llama-3.3-70b-instructmeta/llama-3.3-70b-instruct (NVIDIA NIM uses meta/ namespace for Meta models); added nvidia/llama-3.1-70b-instruct and nvidia/llama-3.1-405b-instruct
  • fix(db/combo): Updated free-stack combo on remote DB: removed qw/qwen3-coder-plus (expired refresh token), corrected nvidia/llama-3.3-70b-instructnvidia/meta/llama-3.3-70b-instruct, corrected gemini/gemini-3.1-flashgemini/gemini-2.5-flash, added if/deepseek-v3.2

[2.6.3] — 2026-03-16

Sprint: zod/pino hash-strip baked into build pipeline, Synthetic provider added, VPS PM2 path corrected.

🐛 Bug Fixes

  • fix(build): Turbopack hash-strip now runs at compile time for ALL packages — not just better-sqlite3. Step 5.6 in prepublish.mjs walks every .js in app/.next/server/ and strips the 16-char hex suffix from any hashed require(). Fixes zod-dcb22c..., pino-..., etc. MODULE_NOT_FOUND on global npm installs. Closes #398
  • fix(deploy): PM2 on both VPS was pointing to stale git-clone directories. Reconfigured to app/server.js in the npm global package. Updated /deploy-vps workflow to use npm pack + scp (npm registry rejects 299MB packages).

✨ Features

  • feat(provider): Synthetic (synthetic.new) — privacy-focused OpenAI-compatible inference. passthroughModels: true for dynamic HuggingFace model catalog. Initial models: Kimi K2.5, MiniMax M2.5, GLM 4.7, DeepSeek V3.2. (PR #404 by @Regis-RCR)

📋 Issues Closed

  • close #398: npm hash regression — fixed by compile-time hash-strip in prepublish
  • triage #324: Bug screenshot without steps — requested reproduction details

[2.6.2] — 2026-03-16

Sprint: module hashing fully fixed, 2 PRs merged (Anthropic tools filter + custom endpoint paths), Alibaba Cloud DashScope provider added, 3 stale issues closed.

🐛 Bug Fixes

  • fix(build): Extended webpack externals hash-strip to cover ALL serverExternalPackages, not just better-sqlite3. Next.js 16 Turbopack hashes zod, pino, and every other server-external package into names like zod-dcb22c6336e0bc69 that don't exist in node_modules at runtime. A HASH_PATTERN regex catch-all now strips the 16-char suffix and falls back to the base package name. Also added NEXT_PRIVATE_BUILD_WORKER=0 in prepublish.mjs to reinforce webpack mode, plus a post-build scan that reports any remaining hashed refs. (#396, #398, PR #403)
  • fix(chat): Anthropic-format tool names (tool.name without .function wrapper) were silently dropped by the empty-name filter introduced in #346. LiteLLM proxies requests with anthropic/ prefix in Anthropic Messages API format, causing all tools to be filtered and Anthropic to return 400: tool_choice.any may only be specified while providing tools. Fixed by falling back to tool.name when tool.function.name is absent. Added 8 regression unit tests. (PR #397)

✨ Features

  • feat(api): Custom endpoint paths for OpenAI-compatible provider nodes — configure chatPath and modelsPath per node (e.g. /v4/chat/completions) in the provider connection UI. Includes a DB migration (003_provider_node_custom_paths.sql) and URL path sanitization (no .. traversal, must start with /). (PR #400)
  • feat(provider): Alibaba Cloud DashScope added as OpenAI-compatible provider. International endpoint: dashscope-intl.aliyuncs.com/compatible-mode/v1. 12 models: qwen-max, qwen-plus, qwen-turbo, qwen3-coder-plus/flash, qwq-plus, qwq-32b, qwen3-32b, qwen3-235b-a22b. Auth: Bearer API key.

📋 Issues Closed

  • close #323: Cline connection error [object Object] — fixed in v2.3.7; instructed user to upgrade from v2.2.9
  • close #337: Kiro credit tracking — implemented in v2.5.5 (#381); pointed user to Dashboard → Usage
  • triage #402: ARM64 macOS DMG damaged — requested macOS version, exact error, and advised xattr -d com.apple.quarantine workaround

[2.6.1] — 2026-03-15

Critical startup fix: v2.6.0 global npm installs crashed with a 500 error due to a Turbopack/webpack module-name hashing bug in the Next.js 16 instrumentation hook.

🐛 Bug Fixes

  • fix(build): Force better-sqlite3 to always be required by its exact package name in the webpack server bundle. Next.js 16 compiled the instrumentation hook into a separate chunk and emitted require('better-sqlite3-<hash>') — a hashed module name that doesn't exist in node_modules — even though the package was listed in serverExternalPackages. Added an explicit externals function to the server webpack config so the bundler always emits require('better-sqlite3'), resolving the startup 500 Internal Server Error on clean global installs. (#394, PR #395)

🔧 CI

  • ci: Added workflow_dispatch to npm-publish.yml with version sync safeguard for manual triggers (#392)
  • ci: Added workflow_dispatch to docker-publish.yml, updated GitHub Actions to latest versions (#392)

[2.6.0] - 2026-03-15

Issue resolution sprint: 4 bugs fixed, logs UX improved, Kiro credit tracking added.

🐛 Bug Fixes

  • fix(media): ComfyUI and SD WebUI no longer appear in the Media page provider list when unconfigured — fetches /api/providers on mount and hides local providers with no connections (#390)
  • fix(auth): Round-robin no longer re-selects rate-limited accounts immediately after cooldown — backoffLevel is now used as primary sort key in the LRU rotation (#340)
  • fix(oauth): iFlow (and other providers that redirect to their own UI) no longer leave the OAuth modal stuck at "Waiting for Authorization" — popup-closed detector auto-transitions to manual URL input mode (#344)
  • fix(logs): Request log table is now readable in light mode — status badges, token counts, and combo tags use adaptive dark: color classes (#378)

✨ Features

  • feat(kiro): Kiro credit tracking added to usage fetcher — queries getUserCredits from AWS CodeWhisperer endpoint (#337)

🛠 Chores

  • chore(tests): Aligned test:plan3, test:fixes, test:security to use same tsx/esm loader as npm test — eliminates module resolution false negatives in targeted runs (PR #386)

[2.5.9] - 2026-03-15

Codex native passthrough fix + route body validation hardening.

🐛 Bug Fixes

  • fix(codex): Preserve native Responses API passthrough for Codex clients — avoids unnecessary translation mutations (PR #387)
  • fix(api): Validate request bodies on pricing/sync and task-routing routes — prevents crashes from malformed inputs (PR #388)
  • fix(auth): JWT secrets persist across restarts via src/lib/db/secrets.ts — eliminates 401 errors after pm2 restart (PR #388)

[2.5.8] - 2026-03-15

Build fix: restore VPS connectivity broken by v2.5.7 incomplete publish.

🐛 Bug Fixes

  • fix(build): scripts/prepublish.mjs still used deprecated --webpack flag causing Next.js standalone build to fail silently — npm publish completed without app/server.js, breaking VPS deployment

[2.5.7] - 2026-03-15

Media playground error handling fixes.

🐛 Bug Fixes

  • fix(media): Transcription "API Key Required" false positive when audio contains no speech (music, silence) — now shows "No speech detected" instead
  • fix(media): upstreamErrorResponse in audioTranscription.ts and audioSpeech.ts now returns proper JSON ({error:{message}}), enabling correct 401/403 credential error detection in the MediaPageClient
  • fix(media): parseApiError now handles Deepgram's err_msg field and detects "api key" in error messages for accurate credential error classification

[2.5.6] - 2026-03-15

Critical security/auth fixes: Antigravity OAuth broken + JWT sessions lost after restart.

🐛 Bug Fixes

  • fix(oauth) #384: Antigravity Google OAuth now correctly sends client_secret to the token endpoint. The fallback for ANTIGRAVITY_OAUTH_CLIENT_SECRET was an empty string, which is falsy — so client_secret was never included in the request, causing "client_secret is missing" errors for all users without a custom env var. Closes #383.
  • fix(auth) #385: JWT_SECRET is now persisted to SQLite (namespace='secrets') on first generation and reloaded on subsequent starts. Previously, a new random secret was generated each process startup, invalidating all existing cookies/sessions after any restart or upgrade. Affects both JWT_SECRET and API_KEY_SECRET. Closes #382.

[2.5.5] - 2026-03-15

Model list dedup fix, Electron standalone build hardening, and Kiro credit tracking.

🐛 Bug Fixes

  • fix(models) #380: GET /api/models now includes provider aliases when building the active-provider filter — models for claude (alias cc) and github (alias gh) were always shown regardless of whether a connection was configured, because PROVIDER_MODELS keys are aliases but DB connections are stored under provider IDs. Fixed by expanding each active provider ID to also include its alias via PROVIDER_ID_TO_ALIAS. Closes #353.
  • fix(electron) #379: New scripts/prepare-electron-standalone.mjs stages a dedicated /.next/electron-standalone bundle before Electron packaging. Aborts with a clear error if node_modules is a symlink (electron-builder would ship a runtime dependency on the build machine). Cross-platform path sanitization via path.basename. By @kfiramar.

✨ New Features

  • feat(kiro) #381: Kiro credit balance tracking — usage endpoint now returns credit data for Kiro accounts by calling codewhisperer.us-east-1.amazonaws.com/getUserCredits (same endpoint Kiro IDE uses internally). Returns remaining credits, total allowance, renewal date, and subscription tier. Closes #337.

[2.5.4] - 2026-03-15

Logger startup fix, login bootstrap security fix, and dev HMR reliability improvement. CI infrastructure hardened.

🐛 Bug Fixes (PRs #374, #375, #376 by @kfiramar)

  • fix(logger) #376: Restore pino transport logger path — formatters.level combined with transport.targets is rejected by pino. Transport-backed configs now strip the level formatter via getTransportCompatibleConfig(). Also corrects numeric level mapping in /api/logs/console: 30→info, 40→warn, 50→error (was shifted by one).
  • fix(login) #375: Login page now bootstraps from the public /api/settings/require-login endpoint instead of the protected /api/settings. In password-protected setups, the pre-auth page was receiving a 401 and falling back to safe defaults unnecessarily. The public route now returns all bootstrap metadata (requireLogin, hasPassword, setupComplete) with a conservative 200 fallback on error.
  • fix(dev) #374: Add localhost and 127.0.0.1 to allowedDevOrigins in next.config.mjs — HMR websocket was blocked when accessing the app via loopback address, producing repeated cross-origin warnings.

🔧 CI & Infrastructure

  • ESLint OOM fix: eslint.config.mjs now ignores vscode-extension/**, electron/**, docs/**, app/.next/**, and clipr/** — ESLint was crashing with a JS heap OOM by scanning VS Code binary blobs and compiled chunks.
  • Unit test fix: Removed stale ALTER TABLE provider_connections ADD COLUMN "group" from 2 test files — column is now part of the base schema (added in #373), causing SQLITE_ERROR: duplicate column name on every CI run.
  • Pre-commit hook: Added npm run test:unit to .husky/pre-commit — unit tests now block broken commits before they reach CI.

[2.5.3] - 2026-03-14

Critical bugfixes: DB schema migration, startup env loading, provider error state clearing, and i18n tooltip fix. Code quality improvements on top of each PR.

🐛 Bug Fixes (PRs #369, #371, #372, #373 by @kfiramar)

  • fix(db) #373: Add provider_connections.group column to base schema + backfill migration for existing databases — column was used in all queries but missing from schema definition
  • fix(i18n) #371: Replace non-existent t("deleteConnection") key with existing providers.delete key — fixes MISSING_MESSAGE: providers.deleteConnection runtime error on provider detail page
  • fix(auth) #372: Clear stale error metadata (errorCode, lastErrorType, lastErrorSource) from provider accounts after genuine recovery — previously, recovered accounts kept appearing as failed
  • fix(startup) #369: Unify env loading across npm run start, run-standalone.mjs, and Electron to respect DATA_DIR/.env → ~/.omniroute/.env → ./.env priority — prevents generating a new STORAGE_ENCRYPTION_KEY over an existing encrypted database

🔧 Code Quality

  • Documented result.success vs response?.ok patterns in auth.ts (both intentional, now explained)
  • Normalized overridePath?.trim() in electron/main.js to match bootstrap-env.mjs
  • Added preferredEnv merge order comment in Electron startup

Codex account quota policy with auto-rotation, fast tier toggle, gpt-5.4 model, and analytics label fix.

✨ New Features (PRs #366, #367, #368)

  • Codex Quota Policy (PR #366): Per-account 5h/weekly quota window toggles in Provider dashboard. Accounts are automatically skipped when enabled windows reach 90% threshold and re-admitted after resetAt. Includes quotaCache.ts with side-effect free status getter.
  • Codex Fast Tier Toggle (PR #367): Dashboard → Settings → Codex Service Tier. Default-off toggle injects service_tier: "flex" only for Codex requests, reducing cost ~80%. Full stack: UI tab + API endpoint + executor + translator + startup restore.
  • gpt-5.4 Model (PR #368): Adds cx/gpt-5.4 and codex/gpt-5.4 to the Codex model registry. Regression test included.

🐛 Bug Fixes

  • fix #356: Analytics charts (Top Provider, By Account, Provider Breakdown) now display human-readable provider names/labels instead of raw internal IDs for OpenAI-compatible providers.

Major release: strict-random routing strategy, API key access controls, connection groups, external pricing sync, and critical bug fixes for thinking models, combo testing, and tool name validation.

✨ New Features (PRs #363 & #365)

  • Strict-Random Routing Strategy: Fisher-Yates shuffle deck with anti-repeat guarantee and mutex serialization for concurrent requests. Independent decks per combo and per provider.
  • API Key Access Controls: allowedConnections (restrict which connections a key can use), is_active (enable/disable key with 403), accessSchedule (time-based access control), autoResolve toggle, rename keys via PATCH.
  • Connection Groups: Group provider connections by environment. Accordion view in Limits page with localStorage persistence and smart auto-switch.
  • External Pricing Sync (LiteLLM): 3-tier pricing resolution (user overrides → synced → defaults). Opt-in via PRICING_SYNC_ENABLED=true. MCP tool omniroute_sync_pricing. 23 new tests.
  • i18n: 30 languages updated with strict-random strategy, API key management strings. pt-BR fully translated.

🐛 Bug Fixes

  • fix #355: Stream idle timeout increased from 60s to 300s — prevents aborting extended-thinking models (claude-opus-4-6, o3, etc.) during long reasoning phases. Configurable via STREAM_IDLE_TIMEOUT_MS.
  • fix #350: Combo test now bypasses REQUIRE_API_KEY=true using internal header, and uses OpenAI-compatible format universally. Timeout extended from 15s to 20s.
  • fix #346: Tools with empty function.name (forwarded by Claude Code) are now filtered before upstream providers receive them, preventing "Invalid input[N].name: empty string" errors.

🗑️ Closed Issues

  • #341: Debug section removed — replacement is /dashboard/logs and /dashboard/health.

API Key Round-Robin support for multi-key provider setups, and confirmation of wildcard routing and quota window rolling already in place.

✨ New Features

  • API Key Round-Robin (T07): Provider connections can now hold multiple API keys (Edit Connection → Extra API Keys). Requests rotate round-robin between primary + extra keys via providerSpecificData.extraApiKeys[]. Keys are held in-memory indexed per connection — no DB schema changes required.

📝 Already Implemented (confirmed in audit)

  • Wildcard Model Routing (T13): wildcardRouter.ts with glob-style wildcard matching (gpt*, claude-?-sonnet, etc.) is already integrated into model.ts with specificity ranking.
  • Quota Window Rolling (T08): accountFallback.ts:isModelLocked() already auto-advances the window — if Date.now() > entry.until, lock is deleted immediately (no stale blocking).

UI polish, routing strategy additions, and graceful error handling for usage limits.

✨ New Features

  • Fill-First & P2C Routing Strategies: Added fill-first (drain quota before moving on) and p2c (Power-of-Two-Choices low-latency selection) to combo strategy picker, with full guidance panels and color-coded badges.
  • Free Stack Preset Models: Creating a combo with the Free Stack template now auto-fills 7 best-in-class free provider models (Gemini CLI, Kiro, iFlow×2, Qwen, NVIDIA NIM, Groq). Users just activate the providers and get a $0/month combo out-of-the-box.
  • Wider Combo Modal: Create/Edit combo modal now uses max-w-4xl for comfortable editing of large combos.

🐛 Bug Fixes

  • Limits page HTTP 500 for Codex & GitHub: getCodexUsage() and getGitHubUsage() now return a user-friendly message when the provider returns 401/403 (expired token), instead of throwing and causing a 500 error on the Limits page.
  • MaintenanceBanner false-positive: Banner no longer shows "Server is unreachable" spuriously on page load. Fixed by calling checkHealth() immediately on mount and removing stale show-state closure.
  • Provider icon tooltips: Edit (pencil) and delete icon buttons in the provider connection row now have native HTML tooltips — all 6 action icons are now self-documented.

Multiple improvements from community issue analysis, new provider support, bug fixes for token tracking, model routing, and streaming reliability.

✨ New Features

  • Task-Aware Smart Routing (T05): Automatic model selection based on request content type — coding → deepseek-chat, analysis → gemini-2.5-pro, vision → gpt-4o, summarization → gemini-2.5-flash. Configurable via Settings. New GET/PUT/POST /api/settings/task-routing API.
  • HuggingFace Provider: Added HuggingFace Router as an OpenAI-compatible provider with Llama 3.1 70B/8B, Qwen 2.5 72B, Mistral 7B, Phi-3.5 Mini.
  • Vertex AI Provider: Added Vertex AI (Google Cloud) provider with Gemini 2.5 Pro/Flash, Gemma 2 27B, Claude via Vertex.
  • Playground File Uploads: Audio upload for transcription, image upload for vision models (auto-detect by model name), inline image rendering for image generation results.
  • Model Select Visual Feedback: Already-added models in combo picker now show ✓ green badge — prevents duplicate confusion.
  • Qwen Compatibility (PR #352): Updated User-Agent and CLI fingerprint settings for Qwen provider compatibility.
  • Round-Robin State Management (PR #349): Enhanced round-robin logic to handle excluded accounts and maintain rotation state correctly.
  • Clipboard UX (PR #360): Hardened clipboard operations with fallback for non-secure contexts; Claude tool normalization improvements.

🐛 Bug Fixes

  • Fix #302 — OpenAI SDK stream=False drops tool_calls: T01 Accept header negotiation no longer forces streaming when body.stream is explicitly false. Was causing tool_calls to be silently dropped when using the OpenAI Python SDK in non-streaming mode.
  • Fix #73 — Claude Haiku routed to OpenAI without provider prefix: claude-* models sent without a provider prefix now correctly route to the antigravity (Anthropic) provider. Added gemini-*/gemma-*gemini heuristic as well.
  • Fix #74 — Token counts always 0 for Antigravity/Claude streaming: The message_start SSE event which carries input_tokens was not being parsed by extractUsage(), causing all input token counts to drop. Input/output token tracking now works correctly for streaming responses.
  • Fix #180 — Model import duplicates with no feedback: ModelSelectModal now shows ✓ green highlight for models already in the combo, making it obvious they're already added.
  • Media page generation errors: Image results now render as <img> tags instead of raw JSON. Transcription results shown as readable text. Credential errors show an amber banner instead of silent failure.
  • Token refresh button on provider page: Manual token refresh UI added for OAuth providers.

🔧 Improvements

  • Provider Registry: HuggingFace and Vertex AI added to providerRegistry.ts and providers.ts (frontend).
  • Read Cache: New src/lib/db/readCache.ts for efficient DB read caching.
  • Quota Cache: Improved quota cache with TTL-based eviction.

📦 Dependencies

  • dompurify → 3.3.3 (PR #347)
  • undici → 7.24.2 (PR #348, #361)
  • docker/setup-qemu-action → v4 (PR #342)
  • docker/setup-buildx-action → v4 (PR #343)

📁 New Files

File Purpose
open-sse/services/taskAwareRouter.ts Task-aware routing logic (7 task types)
src/app/api/settings/task-routing/route.ts Task routing config API
src/app/api/providers/[id]/refresh/route.ts Manual OAuth token refresh
src/lib/db/readCache.ts Efficient DB read cache
src/shared/utils/clipboard.ts Hardened clipboard with fallback

[2.4.1] - 2026-03-13

🐛 Fix

  • Combos modal: Free Stack visible and prominent — Free Stack template was hidden (4th in 3-column grid). Fixed: moved to position 1, switched to 2x2 grid so all 4 templates are visible, green border + FREE badge highlight.

[2.4.0] - 2026-03-13

Major release — Free Stack ecosystem, transcription playground overhaul, 44+ providers, comprehensive free tier documentation, and UI improvements across the board.

✨ Features

  • Combos: Free Stack template — New 4th template "Free Stack ($0)" using round-robin across Kiro + iFlow + Qwen + Gemini CLI. Suggests the pre-built zero-cost combo on first use.
  • Media/Transcription: Deepgram as default — Deepgram (Nova 3, $200 free) is now the default transcription provider. AssemblyAI ($50 free) and Groq Whisper (free forever) shown with free credit badges.
  • README: "Start Free" section — New early-README 5-step table showing how to set up zero-cost AI in minutes.
  • README: Free Transcription Combo — New section with Deepgram/AssemblyAI/Groq combo suggestion and per-provider free credit details.
  • providers.ts: hasFree flag — NVIDIA NIM, Cerebras, and Groq marked with hasFree badge and freeNote for the providers UI.
  • i18n: templateFreeStack keys — Free Stack combo template translated and synced to all 30 languages.

[2.3.16] - 2026-03-13

📖 Documentation

  • README: 44+ Providers — Updated all 3 occurrences of "36+ providers" to "44+" reflecting the actual codebase count (44 providers in providers.ts)
  • README: New Section "🆓 Free Models — What You Actually Get" — Added 7-provider table with per-model rate limits for: Kiro (Claude unlimited via AWS Builder ID), iFlow (5 models unlimited), Qwen (4 models unlimited), Gemini CLI (180K/mo), NVIDIA NIM (~40 RPM dev-forever), Cerebras (1M tok/day / 60K TPM), Groq (30 RPM / 14.4K RPD). Includes the /usr/bin/bash Ultimate Free Stack combo recommendation.
  • README: Pricing Table Updated — Added Cerebras to API KEY tier, fixed NVIDIA from "1000 credits" to "dev-forever free", updated iFlow/Qwen model counts and names
  • README: iFlow 8→5 models (named: kimi-k2-thinking, qwen3-coder-plus, deepseek-r1, minimax-m2, kimi-k2)
  • README: Qwen 3→4 models (named: qwen3-coder-plus, qwen3-coder-flash, qwen3-coder-next, vision-model)

[2.3.15] - 2026-03-13

✨ Features

  • Auto-Combo Dashboard (Tier Priority): Added 🏷️ Tier as the 7th scoring factor label in the /dashboard/auto-combo factor breakdown display — all 7 Auto-Combo scoring factors are now visible.
  • i18n — autoCombo section: Added 20 new translation keys for the Auto-Combo dashboard (title, status, modePack, providerScores, factorTierPriority, etc.) to all 30 language files.

[2.3.14] - 2026-03-13

🐛 Bug Fixes

  • iFlow OAuth (#339): Restored the valid default clientSecret — was previously an empty string, causing "Bad client credentials" on every connect attempt. The public credential is now the default fallback (overridable via IFLOW_OAUTH_CLIENT_SECRET env var).
  • MITM server not found (#335): prepublish.mjs now compiles src/mitm/*.ts to JavaScript using tsc before copying to the npm bundle. Previously only raw .ts files were copied — meaning server.js never existed in npm/Volta global installs.
  • GeminiCLI missing projectId (#338): Instead of throwing a hard 500 error when projectId is missing from stored credentials (e.g. after Docker restart), OmniRoute now logs a warning and attempts the request — returning a meaningful provider-side error instead of an OmniRoute crash.
  • Electron version mismatch (#323): Synced electron/package.json version to 2.3.13 (was 2.0.13) so the desktop binary version matches the npm package.

✨ New Models (#334)

  • Kiro: claude-sonnet-4, claude-opus-4.6, deepseek-v3.2, minimax-m2.1, qwen3-coder-next, auto
  • Codex: gpt5.4

🔧 Improvements

  • Tier Scoring (API + Validation): Added tierPriority (weight 0.05) to the ScoringWeights Zod schema and the combos/auto API route — the 7th scoring factor is now fully accepted by the REST API and validated on input. stability weight adjusted from 0.10 to 0.05 to keep total sum = 1.0.

✨ New Features

  • Tiered Quota Scoring (Auto-Combo): Added tierPriority as a 7th scoring factor — accounts with Ultra/Pro tiers are now preferred over Free tiers when other factors are equal. New optional fields accountTier and quotaResetIntervalSecs on ProviderCandidate. All 4 mode packs updated (ship-fast, cost-saver, quality-first, offline-friendly).
  • Intra-Family Model Fallback (T5): When a model is unavailable (404/400/403), OmniRoute now automatically falls back to sibling models from the same family before returning an error (modelFamilyFallback.ts).
  • Configurable API Bridge Timeout: API_BRIDGE_PROXY_TIMEOUT_MS env var lets operators tune the proxy timeout (default 30s). Fixes 504 errors on slow upstream responses. (#332)
  • Star History: Replaced star-history.com widget with starchart.cc (?variant=adaptive) in all 30 READMEs — adapts to light/dark theme, real-time updates.

🐛 Bug Fixes

  • Auth — First-time password: INITIAL_PASSWORD env var is now accepted when setting the first dashboard password. Uses timingSafeEqual for constant-time comparison, preventing timing attacks. (#333)
  • README Truncation: Fixed a missing </details> closing tag in the Troubleshooting section that caused GitHub to stop rendering everything below it (Tech Stack, Docs, Roadmap, Contributors).
  • pnpm install: Removed redundant @swc/helpers override from package.json that conflicted with the direct dependency, causing EOVERRIDE errors on pnpm. Added pnpm.onlyBuiltDependencies config.
  • CLI Path Injection (T12): Added isSafePath() validator in cliRuntime.ts to block path traversal and shell metacharacters in CLI_*_BIN env vars.
  • CI: Regenerated package-lock.json after override removal to fix npm ci failures on GitHub Actions.

🔧 Improvements

  • Response Format (T1): response_format (json_schema/json_object) now injected as a system prompt for Claude, enabling structured output compatibility.
  • 429 Retry (T2): Intra-URL retry for 429 responses (2× attempts with 2s delay) before falling back to next URL.
  • Gemini CLI Headers (T3): Added User-Agent and X-Goog-Api-Client fingerprint headers for Gemini CLI compatibility.
  • Pricing Catalog (T9): Added deepseek-3.1, deepseek-3.2, and qwen3-coder-next pricing entries.

📁 New Files

File Purpose
open-sse/services/modelFamilyFallback.ts Model family definitions and intra-family fallback logic

Fixed

  • KiloCode: kilocode healthcheck timeout already fixed in v2.3.11
  • OpenCode: Add opencode to cliRuntime registry with 15s healthcheck timeout
  • OpenClaw / Cursor: Increase healthcheck timeout to 15s for slow-start variants
  • VPS: Install droid and openclaw npm packages; activate CLI_EXTRA_PATHS for kiro-cli
  • cliRuntime: Add opencode tool registration and increase timeout for continue

[2.3.11] - 2026-03-12

Fixed

  • KiloCode healthcheck: Increase healthcheckTimeoutMs from 4000ms to 15000ms — kilocode renders an ASCII logo banner on startup causing false healthcheck_failed on slow/cold-start environments

[2.3.10] - 2026-03-12

Fixed

  • Lint: Fix check:any-budget:t11 failure — replace as any with as Record<string, unknown> in OAuthModal.tsx (3 occurrences)

Docs

  • CLI-TOOLS.md: Complete guide for all 11 CLI tools (claude, codex, gemini, opencode, cline, kilocode, continue, kiro-cli, cursor, droid, openclaw)
  • i18n: CLI-TOOLS.md synced to 30 languages with translated title + intro

[2.3.8] - 2026-03-12

[2.3.9] - 2026-03-12

Added

  • /v1/completions: New legacy OpenAI completions endpoint — accepts both prompt string and messages array, normalizes to chat format automatically
  • EndpointPage: Now shows all 3 OpenAI-compatible endpoint types: Chat Completions, Responses API, and Legacy Completions
  • i18n: Added completionsLegacy/completionsLegacyDesc to 30 language files

Fixed

  • OAuthModal: Fix [object Object] displayed on all OAuth connection errors — properly extract .message from error response objects in all 3 throw new Error(data.error) calls (exchange, device-code, authorize)
  • Affects Cline, Codex, GitHub, Qwen, Kiro, and all other OAuth providers

[2.3.7] - 2026-03-12

Fixed

  • Cline OAuth: Add decodeURIComponent before base64 decode so URL-encoded auth codes from the callback URL are parsed correctly, fixing "invalid or expired authorization code" errors on remote (LAN IP) setups
  • Cline OAuth: mapTokens now populates name = firstName + lastName || email so Cline accounts show real user names instead of "Account #ID"
  • OAuth account names: All OAuth exchange flows (exchange, poll, poll-callback) now normalize name = email when name is missing, so every OAuth account shows its email as the display label in the Providers dashboard
  • OAuth account names: Removed sequential "Account N" fallback in db/providers.ts — accounts with no email/name now use a stable ID-based label via getAccountDisplayName() instead of a sequential number that changes when accounts are deleted

[2.3.6] - 2026-03-12

Fixed

  • Provider test batch: Fixed Zod schema to accept providerId: null (frontend sends null for non-provider modes); was incorrectly returning "Invalid request" for all batch tests
  • Provider test modal: Fixed [object Object] display by normalizing API error objects to strings before rendering in setTestResults and ProviderTestResultsView
  • i18n: Added missing keys cliTools.toolDescriptions.opencode, cliTools.toolDescriptions.kiro, cliTools.guides.opencode, cliTools.guides.kiro to en.json
  • i18n: Synchronized 1111 missing keys across all 29 non-English language files using English values as fallbacks

[2.3.5] - 2026-03-11

Fixed

  • @swc/helpers: Added permanent postinstall fix to copy @swc/helpers into the standalone app's node_modules — prevents MODULE_NOT_FOUND crash on global npm installs

[2.3.4] - 2026-03-10

Added

  • Multiple provider integrations and dashboard improvements