Sprint: New OpenCode providers, embedding credentials fix, CLI masked key bug, CACHE_TAG_PATTERN fix.
- CLI tools save masked API key to config files — `claude-settings`, `cline-settings`, and `openclaw-settings` POST routes now accept a `keyId` param and resolve the real API key from DB before writing to disk. `ClaudeToolCard` updated to send `keyId` instead of the masked display string. Fixes #523, #526.
- Custom embedding providers: `No credentials` error — `/v1/embeddings` now tracks `credentialsProviderId` separately from the routing prefix, so credentials are fetched from the matching provider node ID rather than the public prefix string. Fixes a regression where `google/gemini-embedding-001` and similar custom-provider models would always fail with a credentials error. Fixes #532-related. (PR #528 by @jacob2826)
- Context cache protection regex misses `\n` prefix — `CACHE_TAG_PATTERN` in `comboAgentMiddleware.ts` updated to match both literal `\n` (backslash-n) and actual newline U+000A that `combo.ts` streaming injects around the `<omniModel>` tag after fix #515. Fixes #531.
- OpenCode Zen — Free tier gateway at `opencode.ai/zen/v1` with 3 models: `minimax-m2.5-free`, `big-pickle`, `gpt-5-nano`
- OpenCode Go — Subscription service at `opencode.ai/zen/go/v1` with 4 models: `glm-5`, `kimi-k2.5`, `minimax-m2.7` (Claude format), `minimax-m2.5` (Claude format)
- Both providers use the new `OpencodeExecutor` which routes dynamically to `/chat/completions`, `/messages`, `/responses`, or `/models/{model}:generateContent` based on the requested model. (PR #530 by @kang-heewon)
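The `CACHE_TAG_PATTERN` fix above can be pictured with a small sketch. The pattern below is hypothetical (it is not copied from `comboAgentMiddleware.ts`) and only illustrates matching both a literal backslash-n pair and a real U+000A newline around the `<omniModel>` tag.

```typescript
// Hypothetical pattern for illustration; the real CACHE_TAG_PATTERN may differ.
// (?:\\n|\n)* accepts any run of literal "\n" pairs or actual newline characters.
const CACHE_TAG_SKETCH = /(?:\\n|\n)*<omniModel>[^<]*<\/omniModel>(?:\\n|\n)*/g;

// Removes the tag together with the newline decoration that surrounds it.
function stripCacheTag(text: string): string {
  return text.replace(CACHE_TAG_SKETCH, "");
}

// One sample with real newlines, one with the two-character escape sequence.
const withRealNewlines = "Hello\n<omniModel>openai/gpt-4o</omniModel>\n";
const withLiteralEscapes = "Hello\\n<omniModel>openai/gpt-4o</omniModel>\\n";

const strippedReal = stripCacheTag(withRealNewlines);
const strippedLiteral = stripCacheTag(withLiteralEscapes);
```

Under these assumptions both samples reduce to the same visible text, which is the behavior the fix describes.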
Sprint: Bug fixes — preserve Codex prompt cache key, fix tagContent JSON escaping, sync expired token status to DB.
- fix(translator): Preserve `prompt_cache_key` in Responses API → Chat Completions translation (#517) — The field is a cache-affinity signal used by Codex; stripping it was preventing prompt cache hits. Fixed in `openai-responses.ts` and `responsesApiHelper.ts`.
- fix(combo): Escape `\n` in `tagContent` so injected JSON string is valid (#515) — Template literal newlines (U+000A) are not allowed unescaped inside JSON string values. Replaced with `\\n` literal sequences in `open-sse/services/combo.ts`.
- fix(usage): Sync expired token status back to DB on live auth failure (#491) — When the Limits & Quotas live check returns 401/403, the connection `testStatus` is now updated to `"expired"` in the database so the Providers page reflects the same degraded state. Fixed in `src/app/api/usage/[connectionId]/route.ts`.
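The JSON escaping issue behind #515 can be reproduced in a few lines. This is a minimal sketch: the variable names and the hand-built JSON string are assumptions for illustration, not the code in `open-sse/services/combo.ts`.

```typescript
const tag = "<omniModel>openai/gpt-4o</omniModel>";

// A template literal with \n produces real U+000A characters.
const rawContent = `\n${tag}\n`;

// Splicing the raw value into hand-built JSON yields an invalid document,
// because control characters may not appear unescaped inside a JSON string.
const broken = `{"content":"${rawContent}"}`;

// Replacing each newline with the two-character sequence \n keeps the JSON valid.
const escapedContent = rawContent.replace(/\n/g, "\\n");
const valid = `{"content":"${escapedContent}"}`;

// Small helper that reports whether a string parses as JSON.
function parses(json: string): boolean {
  try {
    JSON.parse(json);
    return true;
  } catch {
    return false;
  }
}
```

Parsing `valid` restores the original newlines in `content`, so the receiving client still sees the tag on its own line.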
Sprint: Add 5 new free AI providers — LongCat, Pollinations, Cloudflare AI, Scaleway, AI/ML API.
- feat(providers/longcat): Add LongCat AI (`lc/`) — 50M tokens/day free (Flash-Lite) + 500K/day (Chat/Thinking) during public beta. OpenAI-compatible, standard Bearer auth.
- feat(providers/pollinations): Add Pollinations AI (`pol/`) — no API key required. Proxies GPT-5, Claude, Gemini, DeepSeek V3, Llama 4 (1 req/15s free). Custom executor handles optional auth.
- feat(providers/cloudflare-ai): Add Cloudflare Workers AI (`cf/`) — 10K Neurons/day free (~150 LLM responses or 500s Whisper audio). 50+ models on global edge. Custom executor builds dynamic URL with `accountId` from credentials.
- feat(providers/scaleway): Add Scaleway Generative APIs (`scw/`) — 1M free tokens for new accounts. EU/GDPR compliant (Paris). Qwen3 235B, Llama 3.1 70B, Mistral Small 3.2.
- feat(providers/aimlapi): Add AI/ML API (`aiml/`) — $0.025/day free credit, 200+ models (GPT-4o, Claude, Gemini, Llama) via single aggregator endpoint.
- feat(providers/together): Add `hasFree: true` + 3 permanently free model IDs: `Llama-3.3-70B-Instruct-Turbo-Free`, `Llama-Vision-Free`, `DeepSeek-R1-Distill-Llama-70B-Free`
- feat(providers/gemini): Add `hasFree: true` + `freeNote` (1,500 req/day, no credit card needed, aistudio.google.com)
- chore(providers/gemini): Rename display name to `Gemini (Google AI Studio)` for clarity
- feat(executors/pollinations): New `PollinationsExecutor` — omits `Authorization` header when no API key provided
- feat(executors/cloudflare-ai): New `CloudflareAIExecutor` — dynamic URL construction requires `accountId` in provider credentials
- feat(executors): Register `pollinations`, `pol`, `cloudflare-ai`, `cf` executor mappings
- docs(readme): Expanded free combo stack to 11 providers ($0 forever)
- docs(readme): Added 4 new free provider sections (LongCat, Pollinations, Cloudflare AI, Scaleway) with model tables
- docs(readme): Updated pricing table with 4 new free tier rows
- docs(i18n/pt-BR): Updated pricing table + added LongCat/Pollinations/Cloudflare AI/Scaleway sections in Portuguese
- docs(new-features/ai): 10 task spec files + master implementation plan in `docs/new-features/ai/`
- Test suite: 821 tests, 0 failures (unchanged)
Sprint: Fix media transcription (Deepgram/HuggingFace Content-Type, language detection) and TTS error display.
- fix(transcription): Deepgram and HuggingFace audio transcription now correctly map `video/mp4` → `audio/mp4` and other media MIME types via new `resolveAudioContentType()` helper. Previously, uploading `.mp4` files consistently returned "No speech detected" because Deepgram was receiving `Content-Type: video/mp4`.
- fix(transcription): Added `detect_language=true` to Deepgram requests — auto-detects audio language (Portuguese, Spanish, etc.) instead of defaulting to English. Fixes non-English transcriptions returning empty or garbage results.
- fix(transcription): Added `punctuate=true` to Deepgram requests for higher-quality transcription output with correct punctuation.
- fix(tts): `[object Object]` error display in Text-to-Speech responses fixed in both `audioSpeech.ts` and `audioTranscription.ts`. The `upstreamErrorResponse()` function now correctly extracts nested string messages from providers like ElevenLabs that return `{ error: { message: "...", status_code: 401 } }` instead of a flat error string.
- Test suite: 821 tests, 0 failures (unchanged)
- #508 — Tool call format regression: requested proxy logs and provider chain info (`needs-info`)
- #510 — Windows CLI healthcheck path: requested shell/Node version info (`needs-info`)
- #485 — Kiro MCP tool calls: closed as external Kiro issue (not OmniRoute)
- #442 — Baseten /models endpoint: closed (documented manual workaround)
- #464 — Key provisioning API: acknowledged as roadmap item
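The two media fixes in this sprint (MIME remapping and nested error extraction) might look roughly like the sketch below. Both functions are hypothetical stand-ins: the `video/webm` entry and the fallback message are assumptions, and the real `resolveAudioContentType()` and `upstreamErrorResponse()` may cover more cases.

```typescript
// Assumed mapping table; only video/mp4 → audio/mp4 is stated in the changelog.
const VIDEO_TO_AUDIO: Record<string, string> = {
  "video/mp4": "audio/mp4",
  "video/webm": "audio/webm", // assumption for illustration
};

// Maps container MIME types to their audio equivalent, passing others through.
function resolveAudioContentTypeSketch(mime: string): string {
  return VIDEO_TO_AUDIO[mime.toLowerCase()] ?? mime;
}

// Pulls a readable string out of either a flat or a nested error payload,
// so the UI never has to stringify an object into "[object Object]".
function extractErrorMessage(body: unknown): string {
  if (typeof body === "string") return body;
  if (body && typeof body === "object") {
    const err = (body as { error?: unknown }).error;
    if (typeof err === "string") return err;
    if (err && typeof err === "object") {
      const msg = (err as { message?: unknown }).message;
      if (typeof msg === "string") return msg;
    }
  }
  return "Unknown upstream error"; // hypothetical fallback text
}
```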
Sprint: Fix SSE omniModel data loss, merge per-protocol model compatibility.
- #511 — Critical: `<omniModel>` tag was sent after `finish_reason:stop` in SSE streams, causing data loss. Tag is now injected into the first non-empty content chunk, guaranteeing delivery before SDKs close the connection.
- PR #512 (@zhangqiang8vip): Per-protocol model compatibility — `normalizeToolCallId` and `preserveOpenAIDeveloperRole` can now be configured per client protocol (OpenAI, Claude, Responses API). New `compatByProtocol` field in model config with Zod validation.
- #510 — Windows CLI healthcheck_failed: requested PATH/version info
- #509 — Turbopack Electron regression: upstream Next.js bug, documented workarounds
- #508 — macOS black screen: suggested `--disable-gpu` workaround
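The #511 fix, injecting the tag into the first non-empty content chunk, can be sketched over a simplified chunk list. The `DeltaChunk` shape and the choice to prepend (rather than append) the tag are assumptions made for this illustration.

```typescript
// Simplified stand-in for a parsed SSE delta; not the project's real type.
interface DeltaChunk {
  content: string;
  finishReason?: string;
}

// Adds the tag to the first chunk that carries content, so it is delivered
// before any chunk that signals finish_reason: stop.
function injectTagIntoFirstContent(chunks: DeltaChunk[], tag: string): DeltaChunk[] {
  let injected = false;
  return chunks.map((chunk) => {
    if (!injected && chunk.content.length > 0) {
      injected = true;
      return { ...chunk, content: tag + chunk.content };
    }
    return chunk;
  });
}

const sampleTag = "<omniModel>openai/gpt-4o</omniModel>\n";
const sampleStream: DeltaChunk[] = [
  { content: "" }, // role-only opening chunk
  { content: "Hi" },
  { content: " there", finishReason: "stop" },
];
const taggedStream = injectTagIntoFirstContent(sampleStream, sampleTag);
```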
Sprint: Cross-platform machineId fix, per-API-key rate limits, streaming context cache, Alibaba DashScope, search analytics, ZWS v5, and 8 issues closed.
- feat(search): Search Analytics tab in `/dashboard/analytics` — provider breakdown, cache hit rate, cost tracking. New API: `GET /api/v1/search/analytics` (#feat/search-provider-routing)
- feat(provider): Alibaba Cloud DashScope added with custom endpoint path validation — configurable `chatPath` and `modelsPath` per node (#feat/custom-endpoint-paths)
- feat(api): Per-API-key request-count limits — `max_requests_per_day` and `max_requests_per_minute` columns with in-memory sliding-window enforcement returning HTTP 429 (#452)
- feat(dev): ZWS v5 — HMR leak fix (485 DB connections → 1), memory 2.4GB → 195MB, `globalThis` singletons, Edge Runtime warning fix (@zhangqiang8vip)
- fix(#506): Cross-platform `machineId` — `getMachineIdRaw()` rewritten with try/catch waterfall (Windows REG.exe → macOS ioreg → Linux file read → hostname → `os.hostname()`). Eliminates `process.platform` branching that Next.js bundler dead-code-eliminated, fixing `'head' is not recognized` on Windows. Also fixes #466.
- fix(#493): Custom provider model naming — removed incorrect prefix stripping in `DefaultExecutor.transformRequest()` that mangled org-scoped model IDs like `zai-org/GLM-5-FP8`.
- fix(#490): Streaming + context cache protection — `TransformStream` intercepts SSE to inject `<omniModel>` tag before `[DONE]` marker, enabling context cache protection for streaming responses.
- fix(#458): Combo schema validation — `system_message`, `tool_filter_regex`, `context_cache_protection` fields now pass Zod validation on save.
- fix(#487): KIRO MITM card cleanup — removed ZWS_README, generified `AntigravityToolCard` to use dynamic tool metadata.
- Added Anthropic-format tools filter unit tests (PR #397) — 8 regression tests for `tool.name` without `.function` wrapper
- Test suite: 821 tests, 0 failures (up from 813)
- #506 — Windows machineId `head` not recognized (fixed)
- #493 — Custom provider model naming (fixed)
- #490 — Streaming context cache (fixed)
- #452 — Per-API-key request limits (implemented)
- #466 — Windows login failure (same root cause as #506)
- #504 — MITM inactive (expected behavior)
- #462 — Gemini CLI PSA (resolved)
- #434 — Electron app crash (duplicate of #402)
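The per-API-key limits (#452) are described as in-memory sliding-window enforcement. A minimal sketch of that technique follows; the class name, the injectable `now` argument, and the storage layout are assumptions, not the project's implementation.

```typescript
// Keeps recent request timestamps per key and rejects once the window is full.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private maxRequests: number;
  private windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true when the request is allowed; false means the caller
  // would answer with HTTP 429.
  allow(keyId: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(keyId) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.maxRequests) {
      this.hits.set(keyId, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(keyId, recent);
    return true;
  }
}

// Example: a limit equivalent to max_requests_per_minute = 2.
const perMinute = new SlidingWindowLimiter(2, 60_000);
```

Passing `now` explicitly keeps the sketch deterministic in tests; a production limiter would normally rely on the default clock.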
Sprint: Merge community PRs, fix KIRO MITM card, dependency updates.
- PR #498 (@Sajid11194): Fix Windows machine ID crash (`undefined\REG.exe`). Replaces `node-machine-id` with native OS registry queries. Closes #486.
- PR #497 (@zhangqiang8vip): Fix dev-mode HMR resource leaks — 485 leaked DB connections → 1, memory 2.4GB → 195MB. `globalThis` singletons, Edge Runtime warning fix, Windows test stability. (+1168/-338 across 22 files)
- PRs #499-503 (Dependabot): GitHub Actions updates — `docker/build-push-action@7`, `actions/checkout@6`, `peter-evans/dockerhub-description@5`, `docker/setup-qemu-action@4`, `docker/login-action@4`.
- #505 — KIRO MITM card now displays tool-specific instructions (`api.anthropic.com`) instead of Antigravity-specific text.
- #504 — Responded with UX clarification (MITM "Inactive" is expected behavior when proxy is not running).
Sprint: Fix OAuth batch test crash, add "Test All" button to individual provider pages.
- OAuth batch test crash (ERR_CONNECTION_REFUSED): Replaced sequential for-loop with 5-connection concurrency limit + 30s per-connection timeout via `Promise.race()` + `Promise.allSettled()`. Prevents server crash when testing large OAuth provider groups (~30+ connections).
- "Test All" button on provider pages: Individual provider pages (e.g., `/providers/codex`) now show a "Test All" button in the Connections header when there are 2+ connections. Uses `POST /api/providers/test-batch` with `{mode: "provider", providerId}`. Results displayed in a modal with pass/fail summary and per-connection diagnosis.
Sprint: Merge PR #495 (Bottleneck 429 drop), fix #496 (custom embedding providers), triage features.
- Bottleneck 429 infinite wait (PR #495 by @xandr0s): On 429, `limiter.stop({ dropWaitingJobs: true })` immediately fails all queued requests so upstream callers can trigger fallback. Limiter is deleted from Map so next request creates a fresh instance.
- Custom embedding models unresolvable (#496): `POST /v1/embeddings` now resolves custom embedding models from ALL provider_nodes (not just localhost). Enables models like `google/gemini-embedding-001` added via dashboard.
- #452 — Per-API-key request-count limits (acknowledged, on roadmap)
- #464 — Auto-issue API keys with provider/account limits (needs more detail)
- #488 — Auto-update model lists (acknowledged, on roadmap)
- #496 — Custom embedding provider resolution (fixed)
Sprint: Merge PR #494 (MiniMax role fix), fix KIRO MITM dashboard, triage 8 issues.
- MiniMax developer→system role fix (PR #494 by @zhangqiang8vip): Per-model `preserveDeveloperRole` toggle. Adds "Compatibility" UI in providers page. Fixes 422 "role param error" for MiniMax and similar gateways.
- roleNormalizer: `normalizeDeveloperRole()` now accepts `preserveDeveloperRole` parameter with tri-state behavior (undefined=keep, true=keep, false=convert).
- DB: New `getModelPreserveOpenAIDeveloperRole()` and `mergeModelCompatOverride()` in `models.ts`.
- KIRO MITM dashboard (#481/#487): `CLIToolsPageClient` now routes any `configType: "mitm"` tool to `AntigravityToolCard` (MITM Start/Stop controls). Previously only Antigravity was hardcoded.
- AntigravityToolCard generic: Uses `tool.image`, `tool.description`, `tool.id` instead of hardcoded Antigravity values. Guards against missing `defaultModels`.
- Removed `ZWS_README_V2.md` (development-only docs from PR #494).
- #487 — Closed (KIRO MITM fixed in this release)
- #486 — needs-info (Windows REG.exe PATH issue)
- #489 — needs-info (Antigravity projectId missing, OAuth reconnect needed)
- #492 — needs-info (missing app/server.js on mise-managed Node)
- #490 — Acknowledged (streaming + context cache blocking, fix planned)
- #491 — Acknowledged (Codex auth state inconsistency)
- #493 — Acknowledged (Modal provider model name prefix, workaround provided)
- #488 — Feature request backlog (auto-update model lists)
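The tri-state behavior of `preserveDeveloperRole` (undefined=keep, true=keep, false=convert) can be sketched as below. The message type and function body are assumptions for illustration; only the parameter name and the three states come from the notes above.

```typescript
// Minimal message shape for the sketch; real requests carry more fields.
interface ChatMessage {
  role: string;
  content: string;
}

// Converts "developer" to "system" only when the flag is explicitly false.
// Both undefined and true leave the messages untouched.
function normalizeDeveloperRoleSketch(
  messages: ChatMessage[],
  preserveDeveloperRole?: boolean
): ChatMessage[] {
  if (preserveDeveloperRole !== false) return messages;
  return messages.map((m) =>
    m.role === "developer" ? { ...m, role: "system" } : m
  );
}

const sampleMessages: ChatMessage[] = [
  { role: "developer", content: "Be concise." },
  { role: "user", content: "Hello" },
];
```

A gateway such as MiniMax that rejects the `developer` role would be configured with `false`, while the default leaves OpenAI-style requests unchanged.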
Sprint: Fix zombie SSE streams, context cache first-turn, KIRO MITM, and triage 5 external issues.
- Zombie SSE Streams (#473): Reduce `STREAM_IDLE_TIMEOUT_MS` from 300s → 120s for faster combo fallback when providers hang mid-stream. Configurable via env var.
- Context Cache Tag (#474): Fix `injectModelTag()` to handle first-turn requests (no assistant messages) — context cache protection now works from the very first response.
- KIRO MITM (#481): Change KIRO `configType` from `guide` → `mitm` so the dashboard renders MITM Start/Stop controls.
- E2E Test (CI): Fix `providers-bailian-coding-plan.spec.ts` — dismiss pre-existing modal overlay before clicking Add API Key button.
- #473 — Zombie SSE streams bypass combo fallback
- #474 — Context cache `<omniModel>` tag missing on first turn
- #481 — MITM for KIRO not activatable from dashboard
- #468 — Gemini CLI remote server (superseded by #462 deprecation)
- #438 — Claude unable to write files (external CLI issue)
- #439 — AppImage doesn't work (documented libfuse2 workaround)
- #402 — ARM64 DMG "damaged" (documented xattr -cr workaround)
- #460 — CLI not runnable on Windows (documented PATH fix)
Sprint: Gemini CLI deprecation, VM guide i18n fix, dependabot security fix, provider schema expansion.
- Gemini CLI Deprecation (#462): Mark `gemini-cli` provider as deprecated with warning — Google restricts third-party OAuth usage from March 2026
- Provider Schema (#462): Expand Zod validation with `deprecated`, `deprecationReason`, `hasFree`, `freeNote`, `authHint`, `apiHint` optional fields
- VM Guide i18n (#471): Add `VM_DEPLOYMENT_GUIDE.md` to i18n translation pipeline, regenerate all 30 locale translations from English source (were stuck in Portuguese)
- deps: Bump `flatted` 3.3.3 → 3.4.2 — fixes CWE-1321 prototype pollution (#484, @dependabot)
- #472 — Model Aliases regression (fixed in v2.8.2)
- #471 — VM guide translations broken
- #483 — Trailing `data: null` after `[DONE]` (fixed in v2.8.3)
- #484 — deps: bump flatted from 3.3.3 to 3.4.2 (@dependabot)
Sprint: Czech i18n, SSE protocol fix, VM guide translation.
- Czech Language (#482): Full Czech (cs) i18n — 22 docs, 2606 UI strings, language switcher updates (@zen0bit)
- VM Deployment Guide: Translated from Portuguese to English as the source document (@zen0bit)
- SSE Protocol (#483): Stop sending trailing `data: null` after `[DONE]` signal — fixes `AI_TypeValidationError` in strict AI SDK clients (Zod-based validators)
- #482 — Add Czech language + Fix VM_DEPLOYMENT_GUIDE.md English source (@zen0bit)
Sprint: 2 merged PRs, model aliases routing fix, log export, and issue triage.
- Log Export: New Export button on `/dashboard/logs` with time range dropdown (1h, 6h, 12h, 24h). Downloads JSON of request/proxy/call logs via `/api/logs/export` API (#user-request)
- Model Aliases Routing (#472): Settings → Model Aliases now correctly affect provider routing, not just format detection. Previously `resolveModelAlias()` output was only used for `getModelTargetFormat()` but the original model ID was sent to the provider
- Stream Flush Usage (#480): Usage data from the last SSE event in the buffer is now correctly extracted during stream flush (merged from @prakersh)
- #480 — Extract usage from remaining buffer in flush handler (@prakersh)
- #479 — Add missing Codex 5.3/5.4 and Anthropic model ID pricing entries (@prakersh)
Sprint: Five community PRs — streaming call log fixes, Kiro compatibility, cache token analytics, Chinese translation, and configurable tool call IDs.
- feat(logs): Call log response content now correctly accumulated from raw provider chunks (OpenAI/Claude/Gemini) before translation, fixing empty response payloads in streaming mode (#470, @zhangqiang8vip)
- feat(providers): Per-model configurable 9-char tool call ID normalization (Mistral-style) — only models with the option enabled get truncated IDs (#470)
- feat(api): Key PATCH API expanded to support `allowedConnections`, `name`, `autoResolve`, `isActive`, and `accessSchedule` fields (#470)
- feat(dashboard): Response-first layout in request log detail UI (#470)
- feat(i18n): Improved Chinese (zh-CN) translation — complete retranslation (#475, @only4copilot)
- fix(kiro): Strip injected `model` field from request body — Kiro API rejects unknown top-level fields (#478, @prakersh)
- fix(usage): Include cache read + cache creation tokens in usage history input totals for accurate analytics (#477, @prakersh)
- fix(callLogs): Support Claude format usage fields (`input_tokens`/`output_tokens`) alongside OpenAI format, include all cache token variants (#476, @prakersh)
Sprint: Bailian Coding Plan provider with editable base URLs, plus community contributions for Alibaba Cloud and Kimi Coding.
- feat(providers): Added Bailian Coding Plan (`bailian-coding-plan`) — Alibaba Model Studio with Anthropic-compatible API. Static catalog of 8 models including Qwen3.5 Plus, Qwen3 Coder, MiniMax M2.5, GLM 5, and Kimi K2.5. Includes custom auth validation (400=valid, 401/403=invalid) (#467, @Mind-Dragon)
- feat(admin): Editable default URL in Provider Admin create/edit flows — users can configure custom base URLs per connection. Persisted in `providerSpecificData.baseUrl` with Zod schema validation rejecting non-http(s) schemes (#467)
- Added 30+ unit tests and 2 e2e scenarios for Bailian Coding Plan provider covering auth validation, schema hardening, route-level behavior, and cross-layer integration
Sprint: Two new community-contributed providers (Alibaba Cloud Coding, Kimi Coding API-key) and Docker pino fix.
- feat(providers): Added Alibaba Cloud Coding Plan support with two OpenAI-compatible endpoints — `alicode` (China) and `alicode-intl` (International), each with 8 models (#465, @dtk1985)
- feat(providers): Added dedicated `kimi-coding-apikey` provider path — API-key-based Kimi Coding access is no longer forced through OAuth-only `kimi-coding` route. Includes registry, constants, models API, config, and validation test (#463, @Mind-Dragon)
- fix(docker): Added missing `split2` dependency to Docker image — `pino-abstract-transport` requires it at runtime but it was not being copied into the standalone container, causing `Cannot find module 'split2'` crashes (#459)
Sprint: Codex responses subpath passthrough natively supported, Windows MITM crash fixed, and Combos agent schemas adjusted.
- feat(codex): Native responses subpath passthrough for Codex — natively routes `POST /v1/responses/compact` to Codex upstream, maintaining Claude Code compatibility without stripping the `/compact` suffix (#457)
- fix(combos): Zod schemas (`updateComboSchema` and `createComboSchema`) now include `system_message`, `tool_filter_regex`, and `context_cache_protection`. Fixes bug where agent-specific settings created via the dashboard were silently discarded by the backend validation layer (#458)
- fix(mitm): Kiro MITM profile crash on Windows fixed — `node-machine-id` failed due to missing `REG.exe` env, and the fallback threw a fatal `crypto is not defined` error. Fallback now safely and correctly imports crypto (#456)
Sprint: Budget save bug + combo agent features UI + omniModel tag security fix.
- fix(budget): "Save Limits" no longer returns 422 — `warningThreshold` is now correctly sent as fraction (0–1) instead of percentage (0–100) (#451)
- fix(combos): `<omniModel>` internal cache tag is now stripped before forwarding requests to providers, preventing cache session breaks (#454)
- feat(combos): Agent Features section added to combo create/edit modal — expose `system_message` override, `tool_filter_regex`, and `context_cache_protection` directly from the dashboard (#454)
Sprint: Docker pino crash, Codex CLI responses worker fix, package-lock sync.
- fix(docker): `pino-abstract-transport` and `pino-pretty` now explicitly copied in Docker runner stage — Next.js standalone trace misses these peer deps, causing `Cannot find module pino-abstract-transport` crash on startup (#449)
- fix(responses): Remove `initTranslators()` from `/v1/responses` route — was crashing Next.js worker with `the worker has exited` uncaughtException on Codex CLI requests (#450)
- chore(deps): `package-lock.json` now committed on every version bump to ensure Docker `npm ci` uses exact dependency versions
Sprint: UX improvements and Windows CLI healthcheck fix.
- fix(ux): Show default password hint on login page — new users now see `"Default password: 123456"` below the password input (#437)
- fix(cli): Claude CLI and other npm-installed tools now correctly detected as runnable on Windows — spawn uses `shell:true` to resolve `.cmd` wrappers via PATHEXT (#447)
Sprint: Search Tools dashboard, i18n fixes, Copilot limits, Serper validation fix.
- feat(search): Add Search Playground (10th endpoint), Search Tools page with Compare Providers/Rerank Pipeline/Search History, local rerank routing, auth guards on search API (#443 by @Regis-RCR)
  - New route: `/dashboard/search-tools`
  - Sidebar entry under Debug section
  - `GET /api/search/providers` and `GET /api/search/stats` with auth guards
  - Local provider_nodes routing for `/v1/rerank`
  - 30+ i18n keys in search namespace
- fix(search): Fix Brave news normalizer (was returning 0 results), enforce max_results truncation post-normalization, fix Endpoints page fetch URL (#443 by @Regis-RCR)
- fix(analytics): Localize analytics day/date labels — replace hardcoded Portuguese strings with `Intl.DateTimeFormat(locale)` (#444 by @hijak)
- fix(copilot): Correct GitHub Copilot account type display, filter misleading unlimited quota rows from limits dashboard (#445 by @hijak)
- fix(providers): Stop rejecting valid Serper API keys — treat non-4xx responses as valid authentication (#446 by @hijak)
Sprint: Codex direct API quota fallback fix.
- fix(codex): Block weekly-exhausted accounts in direct API fallback (#440)
  - `resolveQuotaWindow()` prefix matching: `"weekly"` now matches `"weekly (7d)"` cache keys
  - `applyCodexWindowPolicy()` enforces `useWeekly`/`use5h` toggles correctly
  - 4 new regression tests (766 total)
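The prefix matching described for `resolveQuotaWindow()` can be illustrated with a small lookup helper. This is a sketch under assumptions: the function signature, the case-insensitive comparison, and the sample window data are hypothetical.

```typescript
// Looks up a quota window by exact key first, then by key prefix,
// so "weekly" also resolves a cache key such as "weekly (7d)".
function resolveQuotaWindowSketch<T>(
  windows: Record<string, T>,
  name: string
): T | undefined {
  if (Object.prototype.hasOwnProperty.call(windows, name)) {
    return windows[name];
  }
  const wanted = name.toLowerCase();
  const key = Object.keys(windows).find((k) =>
    k.toLowerCase().startsWith(wanted)
  );
  return key === undefined ? undefined : windows[key];
}

// Hypothetical cached quota data keyed by display label.
const quotaWindows: Record<string, { remaining: number }> = {
  "weekly (7d)": { remaining: 0 },
  "session (5h)": { remaining: 12 },
};

const weeklyWindow = resolveQuotaWindowSketch(quotaWindows, "weekly");
const missingWindow = resolveQuotaWindowSketch(quotaWindows, "monthly");
```

With the weekly window resolved, a policy check can see `remaining: 0` and skip the account instead of treating the window as unknown.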
Sprint: Light mode UI contrast fixes.
- fix(logs): Fix light mode contrast in request logs filter buttons and combo badge (#378)
  - Error/Success/Combo filter buttons now readable in light mode
  - Combo row badge uses stronger violet in light mode
Sprint: Unified web search routing (POST /v1/search) with 5 providers + Next.js 16.1.7 security fixes (6 CVEs).
- feat(search): Unified web search routing — `POST /v1/search` with 5 providers (Serper, Brave, Perplexity, Exa, Tavily)
  - Auto-failover across providers, 6,500+ free searches/month
  - In-memory cache with request coalescing (configurable TTL)
  - Dashboard: Search Analytics tab in `/dashboard/analytics` with provider breakdown, cache hit rate, cost tracking
  - New API: `GET /api/v1/search/analytics` for search request statistics
  - DB migration: `request_type` column on `call_logs` for non-chat request tracking
  - Zod validation (`v1SearchSchema`), auth-gated, cost recorded via `recordCost()`
- deps: Next.js 16.1.6 → 16.1.7 — fixes 6 CVEs:
  - Critical: CVE-2026-29057 (HTTP request smuggling via http-proxy)
  - High: CVE-2026-27977, CVE-2026-27978 (WebSocket + Server Actions)
  - Medium: CVE-2026-27979, CVE-2026-27980, CVE-2026-jcc7
| File | Purpose |
|---|---|
| `open-sse/handlers/search.ts` | Search handler with 5-provider routing |
| `open-sse/config/searchRegistry.ts` | Provider registry (auth, cost, quota, TTL) |
| `open-sse/services/searchCache.ts` | In-memory cache with request coalescing |
| `src/app/api/v1/search/route.ts` | Next.js route (POST + GET) |
| `src/app/api/v1/search/analytics/route.ts` | Search stats API |
| `src/app/(dashboard)/dashboard/analytics/SearchAnalyticsTab.tsx` | Analytics dashboard tab |
| `src/lib/db/migrations/007_search_request_type.sql` | DB migration |
| `tests/unit/search-registry.test.mjs` | 277 lines of unit tests |
Sprint: ClawRouter-inspired features — toolCalling flag, multilingual intent detection, benchmark-driven fallback, request deduplication, pluggable RouterStrategy, Grok-4 Fast + GLM-5 + MiniMax M2.5 + Kimi K2.5 pricing.
- feat(pricing): xAI Grok-4 Fast — `$0.20/$0.50 per 1M tokens`, 1143ms p50 latency, tool calling supported
- feat(pricing): xAI Grok-4 (standard) — `$0.20/$1.50 per 1M tokens`, reasoning flagship
- feat(pricing): GLM-5 via Z.AI — `$0.5/1M`, 128K output context
- feat(pricing): MiniMax M2.5 — `$0.30/1M input`, reasoning + agentic tasks
- feat(pricing): DeepSeek V3.2 — updated pricing `$0.27/$1.10 per 1M`
- feat(pricing): Kimi K2.5 via Moonshot API — direct Moonshot API access
- feat(providers): Z.AI provider added (`zai` alias) — GLM-5 family with 128K output
- feat(registry): `toolCalling` flag per model in provider registry — combos can now prefer/require tool-calling capable models
- feat(scoring): Multilingual intent detection for AutoCombo scoring — PT/ZH/ES/AR script/language patterns influence model selection per request context
- feat(fallback): Benchmark-driven fallback chains — real latency data (p50 from `comboMetrics`) used to re-order fallback priority dynamically
- feat(dedup): Request deduplication via content-hash — 5-second idempotency window prevents duplicate provider calls from retrying clients
- feat(router): Pluggable `RouterStrategy` interface in `autoCombo/routerStrategy.ts` — custom routing logic can be injected without modifying core
- feat(mcp): 2 new advanced tool schemas: `omniroute_get_provider_metrics` (p50/p95/p99 per provider) and `omniroute_explain_route` (routing decision explanation)
- feat(mcp): MCP tool auth scopes updated — `metrics:read` scope added for provider metrics tools
- feat(mcp): `omniroute_best_combo_for_task` now accepts `languageHint` parameter for multilingual routing
- feat(metrics): `comboMetrics.ts` extended with real-time latency percentile tracking per provider/account
- feat(health): Health API (`/api/monitoring/health`) now returns per-provider `p50Latency` and `errorRate` fields
- feat(usage): Usage history migration for per-model latency tracking
- feat(migrations): New column `latency_p50` in `combo_metrics` table — zero-breaking, safe for existing users
- close(#411): better-sqlite3 hashed module resolution on Windows — fixed in v2.6.10 (f02c5b5)
- close(#409): GitHub Copilot chat completions fail with Claude models when files attached — fixed in v2.6.9 (838f1d6)
- close(#405): Duplicate of #411 — resolved
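The content-hash deduplication with a 5-second idempotency window could be sketched as follows. Everything here is an assumption for illustration: the FNV-1a hash is a stand-in for whatever hash the project uses, and the class and method names are hypothetical.

```typescript
// FNV-1a 32-bit hash, used only as a compact stand-in. A 32-bit hash can
// collide, so a real implementation would likely use a cryptographic digest.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Reuses the result of an identical request body seen within the window.
class RequestDeduper<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private windowMs: number;

  constructor(windowMs: number = 5000) {
    this.windowMs = windowMs;
  }

  run(body: unknown, call: () => T, now: number = Date.now()): T {
    const key = fnv1a(JSON.stringify(body));
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > now) return hit.value; // duplicate inside window
    const value = call();
    this.entries.set(key, { value, expiresAt: now + this.windowMs });
    return value;
  }
}
```

If `T` is a promise, retrying clients would share the same in-flight provider call rather than triggering a second one.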
Windows fix: better-sqlite3 prebuilt download without node-gyp/Python/MSVC (#426).
- fix(install/#426): On Windows, `npm install -g omniroute` used to fail with `better_sqlite3.node is not a valid Win32 application` because the bundled native binary was compiled for Linux. Adds Strategy 1.5 to `scripts/postinstall.mjs`: uses `@mapbox/node-pre-gyp install --fallback-to-build=false` (bundled within `better-sqlite3`) to download the correct prebuilt binary for the current OS/arch without requiring any build tools (no node-gyp, no Python, no MSVC). Falls back to `npm rebuild` only if the download fails. Adds platform-specific error messages with clear manual fix instructions.
CI fixes (t11 any-budget), bug fix #409 (file attachments via Copilot+Claude), release workflow correction.
- fix(ci): Remove word "any" from comments in `openai-responses.ts` and `chatCore.ts` that were failing the t11 `\bany\b` budget check (false positive from regex counting comments)
- fix(chatCore): Normalize unsupported content part types before forwarding to providers (#409 — Cursor sends `{type:"file"}` when `.md` files are attached; Copilot and other OpenAI-compat providers reject with "type has to be either 'image_url' or 'text'"; fix converts `file`/`document` blocks to `text` and drops unknown types)
- chore(generate-release): Add ATOMIC COMMIT RULE — version bump (`npm version patch`) MUST happen before committing feature files to ensure tag always points to a commit containing all version changes together
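The content-part normalization from the #409 fix can be sketched like this. The `ContentPart` shape and the assumption that a file block exposes its text in a `text` field are hypothetical; the real payload from Cursor may be structured differently.

```typescript
// Loose part shape for the sketch; real parts are a discriminated union.
type ContentPart = { type: string; [key: string]: unknown };

// Keeps text and image_url parts, converts file/document parts to text,
// and drops every other type so OpenAI-compatible providers accept the body.
function normalizeContentParts(parts: ContentPart[]): ContentPart[] {
  const out: ContentPart[] = [];
  for (const part of parts) {
    if (part.type === "text" || part.type === "image_url") {
      out.push(part);
    } else if (part.type === "file" || part.type === "document") {
      // Assumption: the attachment's text is available under `text`.
      if (typeof part.text === "string") {
        out.push({ type: "text", text: part.text });
      }
    }
    // Unknown types fall through and are dropped.
  }
  return out;
}

const mixedParts: ContentPart[] = [
  { type: "text", text: "Summarize the attachment." },
  { type: "file", text: "# README" },
  { type: "audio", data: "..." },
];
const normalizedParts = normalizeContentParts(mixedParts);
```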
Sprint: Combo as Agent (system prompt + tool filter), Context Caching Protection, Auto-Update, Detailed Logs, MITM Kiro IDE.
- 005_combo_agent_fields.sql: `ALTER TABLE combos ADD COLUMN system_message TEXT DEFAULT NULL`, `tool_filter_regex TEXT DEFAULT NULL`, `context_cache_protection INTEGER DEFAULT 0`
- 006_detailed_request_logs.sql: New `request_detail_logs` table with 500-entry ring-buffer trigger, opt-in via settings toggle
- feat(combo): System Message Override per Combo (#399 — `system_message` field replaces or injects system prompt before forwarding to provider)
- feat(combo): Tool Filter Regex per Combo (#399 — `tool_filter_regex` keeps only tools matching pattern; supports OpenAI + Anthropic formats)
- feat(combo): Context Caching Protection (#401 — `context_cache_protection` tags responses with `<omniModel>provider/model</omniModel>` and pins model for session continuity)
- feat(settings): Auto-Update via Settings (#320 — `GET /api/system/version` + `POST /api/system/update` — checks npm registry and updates in background with pm2 restart)
- feat(logs): Detailed Request Logs (#378 — captures full pipeline bodies at 4 stages: client request, translated request, provider response, client response — opt-in toggle, 64KB trim, 500-entry ring-buffer)
- feat(mitm): MITM Kiro IDE profile (#336 — `src/mitm/targets/kiro.ts` targets api.anthropic.com, reuses existing MITM infrastructure)
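A `tool_filter_regex` that supports both OpenAI and Anthropic tool formats has to read the tool name from two places. The sketch below shows one way to do that; the type definitions are trimmed to the name field and the function names are hypothetical.

```typescript
// OpenAI wraps the name in a `function` object; Anthropic keeps it top-level.
interface OpenAITool {
  type: "function";
  function: { name: string };
}
interface AnthropicTool {
  name: string;
}
type Tool = OpenAITool | AnthropicTool;

// Reads the tool name regardless of which format the client sent.
function toolName(tool: Tool): string {
  return "function" in tool ? tool.function.name : tool.name;
}

// Keeps only the tools whose name matches the combo's regex.
function filterTools(tools: Tool[], pattern: string): Tool[] {
  const re = new RegExp(pattern);
  return tools.filter((t) => re.test(toolName(t)));
}

const sampleTools: Tool[] = [
  { type: "function", function: { name: "read_file" } },
  { name: "write_file" },
  { name: "web_search" },
];
const fileTools = filterTools(sampleTools, "^(read|write)_");
```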
Sprint: SSE improvements, local provider_nodes extensions, proxy registry, Claude passthrough fixes.
- feat(health): Background health check for local `provider_nodes` with exponential backoff (30s→300s) and `Promise.allSettled` to avoid blocking (#423, @Regis-RCR)
- feat(embeddings): Route `/v1/embeddings` to local `provider_nodes` — `buildDynamicEmbeddingProvider()` with hostname validation (#422, @Regis-RCR)
- feat(audio): Route TTS/STT to local `provider_nodes` — `buildDynamicAudioProvider()` with SSRF protection (#416, @Regis-RCR)
- feat(proxy): Proxy registry, management APIs, and quota-limit generalization (#429, @Regis-RCR)
- fix(sse): Strip Claude-specific fields (`metadata`, `anthropic_version`) when target is OpenAI-compat (#421, @prakersh)
- fix(sse): Extract Claude SSE usage (`input_tokens`, `output_tokens`, cache tokens) in passthrough stream mode (#420, @prakersh)
- fix(sse): Generate fallback `call_id` for tool calls with missing/empty IDs (#419, @prakersh)
- fix(sse): Claude-to-Claude passthrough — forward body completely untouched, no re-translation (#418, @prakersh)
- fix(sse): Filter orphaned `tool_result` items after Claude Code context compaction to avoid 400 errors (#417, @prakersh)
- fix(sse): Skip empty-name tool calls in Responses API translator to prevent `placeholder_tool` infinite loops (#415, @prakersh)
- fix(sse): Strip empty text content blocks before translation (#427, @prakersh)
- fix(api): Add `refreshable: true` to Claude OAuth test config (#428, @prakersh)
- Bump `vitest`, `@vitest/*` and related devDependencies (#414, @dependabot)
Hotfix: Turbopack/Docker compatibility — remove `node:` protocol from all `src/` imports.
- fix(build): Removed `node:` protocol prefix from `import` statements in 17 files under `src/`. The `node:fs`, `node:path`, `node:url`, `node:os` etc. imports caused `Ecmascript file had an error` on Turbopack builds (Next.js 15 Docker) and on upgrades from older npm global installs. Affected files: `migrationRunner.ts`, `core.ts`, `backup.ts`, `prompts.ts`, `dataPaths.ts`, and 12 others in `src/app/api/` and `src/lib/`.
- chore(workflow): Updated `generate-release.md` to make Docker Hub sync and dual-VPS deploy mandatory steps in every release.
Sprint: reasoning model param filtering, local provider 404 fix, Kilo Gateway provider, dependency bumps.
- feat(api): Added Kilo Gateway (`api.kilo.ai`) as a new API Key provider (alias `kg`): 335+ models, 6 free models, 3 auto-routing models (`kilo-auto/frontier`, `kilo-auto/balanced`, `kilo-auto/free`). Passthrough models supported via `/api/gateway/models` endpoint. (PR #408 by @Regis-RCR)
- fix(sse): Strip unsupported parameters for reasoning models (o1, o1-mini, o1-pro, o3, o3-mini). Models in the `o1`/`o3` family reject `temperature`, `top_p`, `frequency_penalty`, `presence_penalty`, `logprobs`, `top_logprobs`, and `n` with HTTP 400. Parameters are now stripped at the `chatCore` layer before forwarding. Uses a declarative `unsupportedParams` field per model and a precomputed O(1) Map for lookup. (PR #412 by @Regis-RCR)
- fix(sse): Local provider 404 now results in a model-only lockout (5 seconds) instead of a connection-level lockout (2 minutes). When a local inference backend (Ollama, LM Studio, oMLX) returns 404 for an unknown model, the connection remains active and other models continue working immediately. Also fixes a pre-existing bug where `model` was not passed to `markAccountUnavailable()`. Local providers detected via hostname (`localhost`, `127.0.0.1`, `::1`, extensible via `LOCAL_HOSTNAMES` env var). (PR #410 by @Regis-RCR)
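
The parameter stripping described above can be sketched roughly as follows. Only the `unsupportedParams` field name and the list of rejected parameters come from the notes; the `ModelSpec` shape, the `MODELS` table, and `stripUnsupportedParams` are hypothetical names used for illustration, not the project's actual code.

```typescript
// Hypothetical registry entry; only `unsupportedParams` is named in the changelog.
interface ModelSpec {
  id: string;
  unsupportedParams?: readonly string[];
}

const REASONING_UNSUPPORTED: readonly string[] = [
  "temperature", "top_p", "frequency_penalty", "presence_penalty",
  "logprobs", "top_logprobs", "n",
];

// Illustrative model table (assumed contents).
const MODELS: ModelSpec[] = [
  { id: "o1", unsupportedParams: REASONING_UNSUPPORTED },
  { id: "o3-mini", unsupportedParams: REASONING_UNSUPPORTED },
  { id: "gpt-4o" },
];

// Precompute once so each request costs a single Map lookup.
const UNSUPPORTED_BY_MODEL = new Map<string, ReadonlySet<string>>();
for (const m of MODELS) {
  if (m.unsupportedParams) {
    UNSUPPORTED_BY_MODEL.set(m.id, new Set(m.unsupportedParams));
  }
}

// Returns a copy of the request body without the keys the model rejects.
function stripUnsupportedParams(
  model: string,
  body: Record<string, unknown>,
): Record<string, unknown> {
  const blocked = UNSUPPORTED_BY_MODEL.get(model);
  if (!blocked) return body;
  const cleaned: Record<string, unknown> = {};
  for (const key of Object.keys(body)) {
    if (!blocked.has(key)) cleaned[key] = body[key];
  }
  return cleaned;
}
```

Keeping the list declarative per model means a new reasoning model only needs a registry entry, not a new branch in the request path.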
- `better-sqlite3` 12.6.2 → 12.8.0
- `undici` 7.24.2 → 7.24.4
- `https-proxy-agent` 7 → 8
- `agent-base` 7 → 8
- fix(providers): Removed non-existent model names across 5 providers:
  - gemini / gemini-cli: removed `gemini-3.1-pro/flash` and `gemini-3-*-preview` (don't exist in Google API v1beta); replaced with `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.0-flash`, `gemini-1.5-pro/flash`
  - antigravity: removed `gemini-3.1-pro-high/low` and `gemini-3-flash` (invalid internal aliases); replaced with real 2.x models
  - github (Copilot): removed `gemini-3-flash-preview` and `gemini-3-pro-preview`; replaced with `gemini-2.5-flash`
  - nvidia: corrected `nvidia/llama-3.3-70b-instruct` → `meta/llama-3.3-70b-instruct` (NVIDIA NIM uses `meta/` namespace for Meta models); added `nvidia/llama-3.1-70b-instruct` and `nvidia/llama-3.1-405b-instruct`
- fix(db/combo): Updated `free-stack` combo on remote DB: removed `qw/qwen3-coder-plus` (expired refresh token), corrected `nvidia/llama-3.3-70b-instruct` → `nvidia/meta/llama-3.3-70b-instruct`, corrected `gemini/gemini-3.1-flash` → `gemini/gemini-2.5-flash`, added `if/deepseek-v3.2`
Sprint: zod/pino hash-strip baked into build pipeline, Synthetic provider added, VPS PM2 path corrected.
- fix(build): Turbopack hash-strip now runs at compile time for ALL packages, not just `better-sqlite3`. Step 5.6 in `prepublish.mjs` walks every `.js` in `app/.next/server/` and strips the 16-char hex suffix from any hashed `require()`. Fixes `zod-dcb22c...`, `pino-...`, etc. MODULE_NOT_FOUND on global npm installs. Closes #398
- fix(deploy): PM2 on both VPS was pointing to stale git-clone directories. Reconfigured to `app/server.js` in the npm global package. Updated `/deploy-vps` workflow to use `npm pack + scp` (npm registry rejects 299MB packages).
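
As a rough illustration of the hash-strip idea, the sketch below rewrites hashed `require()` calls inside a source string. The regex and the `stripHashedRequires` name are assumptions made for this example; the real step in `prepublish.mjs` also walks the build directory, which is omitted here.

```typescript
// Matches require("<name>-<16 hex chars>") with either quote style.
// The optional group allows scoped names such as @scope/pkg.
const HASHED_REQUIRE =
  /require\((["'])((?:@[\w.-]+\/)?[\w.-]+?)-[0-9a-f]{16}\1\)/g;

// Replaces each hashed module name with its base package name.
function stripHashedRequires(source: string): string {
  return source.replace(
    HASHED_REQUIRE,
    (_match: string, quote: string, name: string) =>
      `require(${quote}${name}${quote})`,
  );
}
```

Because the suffix must be exactly 16 hex characters directly before the closing quote, an unhashed name such as `better-sqlite3` is left alone.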
- feat(provider): Synthetic (synthetic.new): privacy-focused OpenAI-compatible inference. `passthroughModels: true` for dynamic HuggingFace model catalog. Initial models: Kimi K2.5, MiniMax M2.5, GLM 4.7, DeepSeek V3.2. (PR #404 by @Regis-RCR)
- close #398: npm hash regression, fixed by compile-time hash-strip in prepublish
- triage #324: Bug screenshot without steps; requested reproduction details
Sprint: module hashing fully fixed, 2 PRs merged (Anthropic tools filter + custom endpoint paths), Alibaba Cloud DashScope provider added, 3 stale issues closed.
- fix(build): Extended webpack `externals` hash-strip to cover ALL `serverExternalPackages`, not just `better-sqlite3`. Next.js 16 Turbopack hashes `zod`, `pino`, and every other server-external package into names like `zod-dcb22c6336e0bc69` that don't exist in `node_modules` at runtime. A HASH_PATTERN regex catch-all now strips the 16-char suffix and falls back to the base package name. Also added `NEXT_PRIVATE_BUILD_WORKER=0` in `prepublish.mjs` to reinforce webpack mode, plus a post-build scan that reports any remaining hashed refs. (#396, #398, PR #403)
- fix(chat): Anthropic-format tool names (`tool.name` without `.function` wrapper) were silently dropped by the empty-name filter introduced in #346. LiteLLM proxies requests with `anthropic/` prefix in Anthropic Messages API format, causing all tools to be filtered and Anthropic to return `400: tool_choice.any may only be specified while providing tools`. Fixed by falling back to `tool.name` when `tool.function.name` is absent. Added 8 regression unit tests. (PR #397)
- feat(api): Custom endpoint paths for OpenAI-compatible provider nodes: configure `chatPath` and `modelsPath` per node (e.g. `/v4/chat/completions`) in the provider connection UI. Includes a DB migration (`003_provider_node_custom_paths.sql`) and URL path sanitization (no `..` traversal, must start with `/`). (PR #400)
- feat(provider): Alibaba Cloud DashScope added as OpenAI-compatible provider. International endpoint: `dashscope-intl.aliyuncs.com/compatible-mode/v1`. 12 models: `qwen-max`, `qwen-plus`, `qwen-turbo`, `qwen3-coder-plus/flash`, `qwq-plus`, `qwq-32b`, `qwen3-32b`, `qwen3-235b-a22b`. Auth: Bearer API key.
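
The two path rules stated for custom endpoint paths (must start with `/`, no `..` traversal) could be checked along these lines. `sanitizeEndpointPath` is a hypothetical helper for illustration; the actual validation may apply further rules, such as handling percent-encoded segments.

```typescript
// Returns the trimmed path when it passes both rules, otherwise null.
function sanitizeEndpointPath(input: string): string | null {
  const path = input.trim();
  // Rule 1: the path must be absolute relative to the provider base URL.
  if (!path.startsWith("/")) return null;
  // Rule 2: no segment may be "..", which would climb out of the base path.
  if (path.split("/").includes("..")) return null;
  return path;
}
```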
- close #323: Cline connection error `[object Object]`, fixed in v2.3.7; instructed user to upgrade from v2.2.9
- close #337: Kiro credit tracking, implemented in v2.5.5 (#381); pointed user to Dashboard → Usage
- triage #402: ARM64 macOS DMG damaged; requested macOS version, exact error, and advised `xattr -d com.apple.quarantine` workaround
Critical startup fix: v2.6.0 global npm installs crashed with a 500 error due to a Turbopack/webpack module-name hashing bug in the Next.js 16 instrumentation hook.
- fix(build): Force `better-sqlite3` to always be required by its exact package name in the webpack server bundle. Next.js 16 compiled the instrumentation hook into a separate chunk and emitted `require('better-sqlite3-<hash>')`, a hashed module name that doesn't exist in `node_modules`, even though the package was listed in `serverExternalPackages`. Added an explicit `externals` function to the server webpack config so the bundler always emits `require('better-sqlite3')`, resolving the startup `500 Internal Server Error` on clean global installs. (#394, PR #395)
- ci: Added `workflow_dispatch` to `npm-publish.yml` with version sync safeguard for manual triggers (#392)
- ci: Added `workflow_dispatch` to `docker-publish.yml`, updated GitHub Actions to latest versions (#392)
Issue resolution sprint: 4 bugs fixed, logs UX improved, Kiro credit tracking added.
- fix(media): ComfyUI and SD WebUI no longer appear in the Media page provider list when unconfigured: the page fetches `/api/providers` on mount and hides local providers with no connections (#390)
- fix(auth): Round-robin no longer re-selects rate-limited accounts immediately after cooldown: `backoffLevel` is now used as primary sort key in the LRU rotation (#340)
- fix(oauth): iFlow (and other providers that redirect to their own UI) no longer leave the OAuth modal stuck at "Waiting for Authorization": popup-closed detector auto-transitions to manual URL input mode (#344)
- fix(logs): Request log table is now readable in light mode: status badges, token counts, and combo tags use adaptive `dark:` color classes (#378)
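
To make the round-robin fix concrete, here is a minimal sketch of an LRU pick that uses `backoffLevel` as the primary sort key. The `Account` shape and the `lastUsedAt` field are assumptions for this example; only `backoffLevel` is named in the notes.

```typescript
// Hypothetical account record used only for this illustration.
interface Account {
  id: string;
  backoffLevel: number; // higher means more recent rate limiting
  lastUsedAt: number;   // epoch milliseconds of the last request
}

// Prefer the lowest backoff level; break ties by least recently used.
function pickNextAccount(accounts: readonly Account[]): Account | undefined {
  return [...accounts].sort(
    (a, b) => a.backoffLevel - b.backoffLevel || a.lastUsedAt - b.lastUsedAt,
  )[0];
}
```

With a plain LRU sort, an account that just left cooldown is usually the least recently used and gets picked again at once; sorting by backoff level first avoids that.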
- feat(kiro): Kiro credit tracking added to usage fetcher: queries `getUserCredits` from AWS CodeWhisperer endpoint (#337)
- chore(tests): Aligned `test:plan3`, `test:fixes`, `test:security` to use same `tsx/esm` loader as `npm test`, which eliminates module resolution false negatives in targeted runs (PR #386)
Codex native passthrough fix + route body validation hardening.
- fix(codex): Preserve native Responses API passthrough for Codex clients — avoids unnecessary translation mutations (PR #387)
- fix(api): Validate request bodies on pricing/sync and task-routing routes — prevents crashes from malformed inputs (PR #388)
- fix(auth): JWT secrets persist across restarts via `src/lib/db/secrets.ts`, which eliminates 401 errors after pm2 restart (PR #388)
Build fix: restore VPS connectivity broken by v2.5.7 incomplete publish.
- fix(build): `scripts/prepublish.mjs` still used deprecated `--webpack` flag, causing the Next.js standalone build to fail silently; npm publish completed without `app/server.js`, breaking VPS deployment
Media playground error handling fixes.
- fix(media): Transcription "API Key Required" false positive when audio contains no speech (music, silence) — now shows "No speech detected" instead
- fix(media): `upstreamErrorResponse` in `audioTranscription.ts` and `audioSpeech.ts` now returns proper JSON (`{error:{message}}`), enabling correct 401/403 credential error detection in the MediaPageClient
- fix(media): `parseApiError` now handles Deepgram's `err_msg` field and detects `"api key"` in error messages for accurate credential error classification
Critical security/auth fixes: Antigravity OAuth broken + JWT sessions lost after restart.
- fix(oauth) #384: Antigravity Google OAuth now correctly sends `client_secret` to the token endpoint. The fallback for `ANTIGRAVITY_OAUTH_CLIENT_SECRET` was an empty string, which is falsy, so `client_secret` was never included in the request, causing `"client_secret is missing"` errors for all users without a custom env var. Closes #383.
- fix(auth) #385: `JWT_SECRET` is now persisted to SQLite (`namespace='secrets'`) on first generation and reloaded on subsequent starts. Previously, a new random secret was generated each process startup, invalidating all existing cookies/sessions after any restart or upgrade. Affects both `JWT_SECRET` and `API_KEY_SECRET`. Closes #382.
Model list dedup fix, Electron standalone build hardening, and Kiro credit tracking.
- fix(models) #380: `GET /api/models` now includes provider aliases when building the active-provider filter. Models for `claude` (alias `cc`) and `github` (alias `gh`) were always shown regardless of whether a connection was configured, because `PROVIDER_MODELS` keys are aliases but DB connections are stored under provider IDs. Fixed by expanding each active provider ID to also include its alias via `PROVIDER_ID_TO_ALIAS`. Closes #353.
- fix(electron) #379: New `scripts/prepare-electron-standalone.mjs` stages a dedicated `/.next/electron-standalone` bundle before Electron packaging. Aborts with a clear error if `node_modules` is a symlink (electron-builder would ship a runtime dependency on the build machine). Cross-platform path sanitization via `path.basename`. By @kfiramar.
- feat(kiro) #381: Kiro credit balance tracking. The usage endpoint now returns credit data for Kiro accounts by calling `codewhisperer.us-east-1.amazonaws.com/getUserCredits` (same endpoint Kiro IDE uses internally). Returns remaining credits, total allowance, renewal date, and subscription tier. Closes #337.
Logger startup fix, login bootstrap security fix, and dev HMR reliability improvement. CI infrastructure hardened.
- fix(logger) #376: Restore pino transport logger path. `formatters.level` combined with `transport.targets` is rejected by pino. Transport-backed configs now strip the level formatter via `getTransportCompatibleConfig()`. Also corrects numeric level mapping in `/api/logs/console`: `30→info, 40→warn, 50→error` (was shifted by one).
- fix(login) #375: Login page now bootstraps from the public `/api/settings/require-login` endpoint instead of the protected `/api/settings`. In password-protected setups, the pre-auth page was receiving a 401 and falling back to safe defaults unnecessarily. The public route now returns all bootstrap metadata (`requireLogin`, `hasPassword`, `setupComplete`) with a conservative 200 fallback on error.
- fix(dev) #374: Add `localhost` and `127.0.0.1` to `allowedDevOrigins` in `next.config.mjs`. The HMR websocket was blocked when accessing the app via loopback address, producing repeated cross-origin warnings.
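
For reference, pino's default numeric levels are 10 (trace), 20 (debug), 30 (info), 40 (warn), 50 (error), and 60 (fatal). A lookup consistent with the corrected mapping might look like the sketch below; `levelName` and its `"unknown"` fallback are illustrative choices, not the route's actual code.

```typescript
// pino's default level values.
const PINO_LEVELS: Record<number, string> = {
  10: "trace",
  20: "debug",
  30: "info",
  40: "warn",
  50: "error",
  60: "fatal",
};

// Maps a numeric pino level to its label; unknown values get a placeholder.
function levelName(level: number): string {
  return PINO_LEVELS[level] ?? "unknown";
}
```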
- ESLint OOM fix: `eslint.config.mjs` now ignores `vscode-extension/**`, `electron/**`, `docs/**`, `app/.next/**`, and `clipr/**`. ESLint was crashing with a JS heap OOM by scanning VS Code binary blobs and compiled chunks.
- Unit test fix: Removed stale `ALTER TABLE provider_connections ADD COLUMN "group"` from 2 test files. The column is now part of the base schema (added in #373), which caused `SQLITE_ERROR: duplicate column name` on every CI run.
- Pre-commit hook: Added `npm run test:unit` to `.husky/pre-commit`, so unit tests now block broken commits before they reach CI.
Critical bugfixes: DB schema migration, startup env loading, provider error state clearing, and i18n tooltip fix. Code quality improvements on top of each PR.
- fix(db) #373: Add `provider_connections.group` column to base schema + backfill migration for existing databases. The column was used in all queries but missing from the schema definition
- fix(i18n) #371: Replace non-existent `t("deleteConnection")` key with existing `providers.delete` key, which fixes the `MISSING_MESSAGE: providers.deleteConnection` runtime error on the provider detail page
- fix(auth) #372: Clear stale error metadata (`errorCode`, `lastErrorType`, `lastErrorSource`) from provider accounts after genuine recovery. Previously, recovered accounts kept appearing as failed
- fix(startup) #369: Unify env loading across `npm run start`, `run-standalone.mjs`, and Electron to respect `DATA_DIR/.env → ~/.omniroute/.env → ./.env` priority, which prevents generating a new `STORAGE_ENCRYPTION_KEY` over an existing encrypted database
- Documented `result.success` vs `response?.ok` patterns in `auth.ts` (both intentional, now explained)
- Normalized `overridePath?.trim()` in `electron/main.js` to match `bootstrap-env.mjs`
- Added `preferredEnv` merge order comment in Electron startup
Codex account quota policy with auto-rotation, fast tier toggle, gpt-5.4 model, and analytics label fix.
- Codex Quota Policy (PR #366): Per-account 5h/weekly quota window toggles in Provider dashboard. Accounts are automatically skipped when enabled windows reach 90% threshold and re-admitted after `resetAt`. Includes `quotaCache.ts` with side-effect free status getter.
- Codex Fast Tier Toggle (PR #367): Dashboard → Settings → Codex Service Tier. Default-off toggle injects `service_tier: "flex"` only for Codex requests, reducing cost ~80%. Full stack: UI tab + API endpoint + executor + translator + startup restore.
- gpt-5.4 Model (PR #368): Adds `cx/gpt-5.4` and `codex/gpt-5.4` to the Codex model registry. Regression test included.
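
The skip and re-admit rule of the quota policy can be expressed as a pure predicate, roughly as sketched here. The `QuotaWindow` shape, the `usedPercent` field, and `isAccountSkipped` are assumed names; only the 90% threshold and `resetAt` come from the notes.

```typescript
// Hypothetical view of one quota window (for example 5h or weekly).
interface QuotaWindow {
  enabled: boolean;    // the per-account toggle from the dashboard
  usedPercent: number; // 0 to 100
  resetAt: number;     // epoch milliseconds when the window resets
}

const QUOTA_THRESHOLD_PERCENT = 90;

// An account is skipped while any enabled window is at or above the
// threshold and has not yet reset. The function has no side effects.
function isAccountSkipped(windows: readonly QuotaWindow[], now: number): boolean {
  return windows.some(
    (w) => w.enabled && w.usedPercent >= QUOTA_THRESHOLD_PERCENT && now < w.resetAt,
  );
}
```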
- fix #356: Analytics charts (Top Provider, By Account, Provider Breakdown) now display human-readable provider names/labels instead of raw internal IDs for OpenAI-compatible providers.
Major release: strict-random routing strategy, API key access controls, connection groups, external pricing sync, and critical bug fixes for thinking models, combo testing, and tool name validation.
- Strict-Random Routing Strategy: Fisher-Yates shuffle deck with anti-repeat guarantee and mutex serialization for concurrent requests. Independent decks per combo and per provider.
- API Key Access Controls: `allowedConnections` (restrict which connections a key can use), `is_active` (enable/disable key with 403), `accessSchedule` (time-based access control), `autoResolve` toggle, rename keys via PATCH.
- Connection Groups: Group provider connections by environment. Accordion view in Limits page with localStorage persistence and smart auto-switch.
- External Pricing Sync (LiteLLM): 3-tier pricing resolution (user overrides → synced → defaults). Opt-in via `PRICING_SYNC_ENABLED=true`. MCP tool `omniroute_sync_pricing`. 23 new tests.
- i18n: 30 languages updated with strict-random strategy, API key management strings. pt-BR fully translated.
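
A shuffle deck with an anti-repeat guarantee, as named in the strict-random strategy above, can be sketched as follows. This is a single-threaded illustration under the assumption that items are unique; the `ShuffleDeck` class is hypothetical, and the mutex serialization for concurrent requests is not shown.

```typescript
class ShuffleDeck<T> {
  private readonly items: readonly T[];
  private readonly random: () => number;
  private deck: T[] = [];
  private last: T | undefined = undefined;

  constructor(items: readonly T[], random: () => number = Math.random) {
    if (items.length === 0) throw new Error("ShuffleDeck needs at least one item");
    this.items = items;
    this.random = random;
  }

  // Draws the next item; every item appears exactly once per deck cycle.
  next(): T {
    if (this.deck.length === 0) this.reshuffle();
    const item = this.deck.pop() as T;
    this.last = item;
    return item;
  }

  private reshuffle(): void {
    const deck = [...this.items];
    // Fisher-Yates: swap each position with a random earlier (or same) one.
    for (let i = deck.length - 1; i > 0; i--) {
      const j = Math.floor(this.random() * (i + 1));
      const tmp = deck[i] as T;
      deck[i] = deck[j] as T;
      deck[j] = tmp;
    }
    // Anti-repeat: the top of the deck is drawn first, so it must differ
    // from the last item of the previous cycle.
    const top = deck.length - 1;
    if (deck.length > 1 && deck[top] === this.last) {
      const tmp = deck[0] as T;
      deck[0] = deck[top] as T;
      deck[top] = tmp;
    }
    this.deck = deck;
  }
}
```

Keeping one deck instance per combo and per provider would give the independent decks the notes mention.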
- fix #355: Stream idle timeout increased from 60s to 300s, which prevents aborting extended-thinking models (claude-opus-4-6, o3, etc.) during long reasoning phases. Configurable via `STREAM_IDLE_TIMEOUT_MS`.
- fix #350: Combo test now bypasses `REQUIRE_API_KEY=true` using internal header, and uses OpenAI-compatible format universally. Timeout extended from 15s to 20s.
- fix #346: Tools with empty `function.name` (forwarded by Claude Code) are now filtered before upstream providers receive them, preventing "Invalid input[N].name: empty string" errors.
- #341: Debug section removed; replacement is `/dashboard/logs` and `/dashboard/health`.
API Key Round-Robin support for multi-key provider setups, and confirmation of wildcard routing and quota window rolling already in place.
- API Key Round-Robin (T07): Provider connections can now hold multiple API keys (Edit Connection → Extra API Keys). Requests rotate round-robin between primary + extra keys via `providerSpecificData.extraApiKeys[]`. Keys are held in-memory indexed per connection, so no DB schema changes are required.
- Wildcard Model Routing (T13): `wildcardRouter.ts` with glob-style wildcard matching (`gpt*`, `claude-?-sonnet`, etc.) is already integrated into `model.ts` with specificity ranking.
- Quota Window Rolling (T08): `accountFallback.ts:isModelLocked()` already auto-advances the window: if `Date.now() > entry.until`, lock is deleted immediately (no stale blocking).
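
Glob-style matching with a specificity ranking could work along the lines below. The scoring rule (count of literal characters) is an assumption chosen for the example, and `globToRegExp`, `specificity`, and `resolveWildcard` are hypothetical names rather than the exports of `wildcardRouter.ts`.

```typescript
// Converts a glob with * (any run) and ? (any single char) to a RegExp.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const body = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp("^" + body + "$");
}

// Assumed ranking: more literal characters means a more specific pattern.
function specificity(pattern: string): number {
  return pattern.replace(/[*?]/g, "").length;
}

// Returns the most specific pattern that matches the model, if any.
function resolveWildcard(model: string, patterns: readonly string[]): string | undefined {
  return patterns
    .filter((p) => globToRegExp(p).test(model))
    .sort((a, b) => specificity(b) - specificity(a))[0];
}
```

For instance, `claude-4-sonnet` matches both `claude*` and `claude-?-sonnet`, and the second pattern wins under this ranking because it carries more literal characters.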
UI polish, routing strategy additions, and graceful error handling for usage limits.
- Fill-First & P2C Routing Strategies: Added `fill-first` (drain quota before moving on) and `p2c` (Power-of-Two-Choices low-latency selection) to combo strategy picker, with full guidance panels and color-coded badges.
- Free Stack Preset Models: Creating a combo with the Free Stack template now auto-fills 7 best-in-class free provider models (Gemini CLI, Kiro, iFlow×2, Qwen, NVIDIA NIM, Groq). Users just activate the providers and get a $0/month combo out-of-the-box.
- Wider Combo Modal: Create/Edit combo modal now uses `max-w-4xl` for comfortable editing of large combos.
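
Power-of-Two-Choices samples two distinct candidates at random and keeps the better one. The sketch below uses observed latency as the comparison metric, which is an assumption based on the "low-latency selection" wording; `Candidate` and `pickP2C` are illustrative names.

```typescript
// Hypothetical candidate record for this illustration.
interface Candidate {
  id: string;
  latencyMs: number;
}

// Samples two distinct candidates and returns the one with lower latency.
function pickP2C(
  candidates: readonly Candidate[],
  random: () => number = Math.random,
): Candidate {
  const n = candidates.length;
  if (n === 0) throw new Error("no candidates");
  if (n === 1) return candidates[0] as Candidate;
  const i = Math.floor(random() * n);
  // Draw the second index from the remaining n - 1 slots, skipping i.
  let j = Math.floor(random() * (n - 1));
  if (j >= i) j++;
  const a = candidates[i] as Candidate;
  const b = candidates[j] as Candidate;
  return a.latencyMs <= b.latencyMs ? a : b;
}
```

P2C does not always return the globally fastest candidate; it trades that for spreading load while still avoiding the slowest choices most of the time.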
- Limits page HTTP 500 for Codex & GitHub: `getCodexUsage()` and `getGitHubUsage()` now return a user-friendly message when the provider returns 401/403 (expired token), instead of throwing and causing a 500 error on the Limits page.
- MaintenanceBanner false-positive: Banner no longer shows "Server is unreachable" spuriously on page load. Fixed by calling `checkHealth()` immediately on mount and removing stale `show`-state closure.
- Provider icon tooltips: Edit (pencil) and delete icon buttons in the provider connection row now have native HTML tooltips, so all 6 action icons are now self-documented.
Multiple improvements from community issue analysis, new provider support, bug fixes for token tracking, model routing, and streaming reliability.
- Task-Aware Smart Routing (T05): Automatic model selection based on request content type: coding → deepseek-chat, analysis → gemini-2.5-pro, vision → gpt-4o, summarization → gemini-2.5-flash. Configurable via Settings. New `GET/PUT/POST /api/settings/task-routing` API.
- HuggingFace Provider: Added HuggingFace Router as an OpenAI-compatible provider with Llama 3.1 70B/8B, Qwen 2.5 72B, Mistral 7B, Phi-3.5 Mini.
- Vertex AI Provider: Added Vertex AI (Google Cloud) provider with Gemini 2.5 Pro/Flash, Gemma 2 27B, Claude via Vertex.
- Playground File Uploads: Audio upload for transcription, image upload for vision models (auto-detect by model name), inline image rendering for image generation results.
- Model Select Visual Feedback: Already-added models in combo picker now show ✓ green badge — prevents duplicate confusion.
- Qwen Compatibility (PR #352): Updated User-Agent and CLI fingerprint settings for Qwen provider compatibility.
- Round-Robin State Management (PR #349): Enhanced round-robin logic to handle excluded accounts and maintain rotation state correctly.
- Clipboard UX (PR #360): Hardened clipboard operations with fallback for non-secure contexts; Claude tool normalization improvements.
- Fix #302 (OpenAI SDK stream=False drops tool_calls): T01 Accept header negotiation no longer forces streaming when `body.stream` is explicitly `false`. Was causing tool_calls to be silently dropped when using the OpenAI Python SDK in non-streaming mode.
- Fix #73 (Claude Haiku routed to OpenAI without provider prefix): `claude-*` models sent without a provider prefix now correctly route to the `antigravity` (Anthropic) provider. Added `gemini-*`/`gemma-*` → `gemini` heuristic as well.
- Fix #74 (Token counts always 0 for Antigravity/Claude streaming): The `message_start` SSE event which carries `input_tokens` was not being parsed by `extractUsage()`, causing all input token counts to drop. Input/output token tracking now works correctly for streaming responses.
- Fix #180 (Model import duplicates with no feedback): `ModelSelectModal` now shows ✓ green highlight for models already in the combo, making it obvious they're already added.
- Media page generation errors: Image results now render as `<img>` tags instead of raw JSON. Transcription results shown as readable text. Credential errors show an amber banner instead of silent failure.
- Token refresh button on provider page: Manual token refresh UI added for OAuth providers.
- Provider Registry: HuggingFace and Vertex AI added to `providerRegistry.ts` and `providers.ts` (frontend).
- Read Cache: New `src/lib/db/readCache.ts` for efficient DB read caching.
- Quota Cache: Improved quota cache with TTL-based eviction.
- `dompurify` → 3.3.3 (PR #347)
- `undici` → 7.24.2 (PR #348, #361)
- `docker/setup-qemu-action` → v4 (PR #342)
- `docker/setup-buildx-action` → v4 (PR #343)
| File | Purpose |
|---|---|
| `open-sse/services/taskAwareRouter.ts` | Task-aware routing logic (7 task types) |
| `src/app/api/settings/task-routing/route.ts` | Task routing config API |
| `src/app/api/providers/[id]/refresh/route.ts` | Manual OAuth token refresh |
| `src/lib/db/readCache.ts` | Efficient DB read cache |
| `src/shared/utils/clipboard.ts` | Hardened clipboard with fallback |

- Combos modal: Free Stack visible and prominent — Free Stack template was hidden (4th in 3-column grid). Fixed: moved to position 1, switched to 2x2 grid so all 4 templates are visible, green border + FREE badge highlight.
Major release — Free Stack ecosystem, transcription playground overhaul, 44+ providers, comprehensive free tier documentation, and UI improvements across the board.
- Combos: Free Stack template — New 4th template "Free Stack ($0)" using round-robin across Kiro + iFlow + Qwen + Gemini CLI. Suggests the pre-built zero-cost combo on first use.
- Media/Transcription: Deepgram as default — Deepgram (Nova 3, $200 free) is now the default transcription provider. AssemblyAI ($50 free) and Groq Whisper (free forever) shown with free credit badges.
- README: "Start Free" section — New early-README 5-step table showing how to set up zero-cost AI in minutes.
- README: Free Transcription Combo — New section with Deepgram/AssemblyAI/Groq combo suggestion and per-provider free credit details.
- providers.ts: hasFree flag — NVIDIA NIM, Cerebras, and Groq marked with hasFree badge and freeNote for the providers UI.
- i18n: templateFreeStack keys — Free Stack combo template translated and synced to all 30 languages.
- README: 44+ Providers — Updated all 3 occurrences of "36+ providers" to "44+" reflecting the actual codebase count (44 providers in providers.ts)
- README: New Section "🆓 Free Models — What You Actually Get": Added 7-provider table with per-model rate limits for: Kiro (Claude unlimited via AWS Builder ID), iFlow (5 models unlimited), Qwen (4 models unlimited), Gemini CLI (180K/mo), NVIDIA NIM (~40 RPM dev-forever), Cerebras (1M tok/day / 60K TPM), Groq (30 RPM / 14.4K RPD). Includes the $0 Ultimate Free Stack combo recommendation.
- README: Pricing Table Updated — Added Cerebras to API KEY tier, fixed NVIDIA from "1000 credits" to "dev-forever free", updated iFlow/Qwen model counts and names
- README: iFlow 8→5 models (named: kimi-k2-thinking, qwen3-coder-plus, deepseek-r1, minimax-m2, kimi-k2)
- README: Qwen 3→4 models (named: qwen3-coder-plus, qwen3-coder-flash, qwen3-coder-next, vision-model)
- Auto-Combo Dashboard (Tier Priority): Added `🏷️ Tier` as the 7th scoring factor label in the `/dashboard/auto-combo` factor breakdown display, so all 7 Auto-Combo scoring factors are now visible.
- i18n (autoCombo section): Added 20 new translation keys for the Auto-Combo dashboard (`title`, `status`, `modePack`, `providerScores`, `factorTierPriority`, etc.) to all 30 language files.
- iFlow OAuth (#339): Restored the valid default `clientSecret`. It was previously an empty string, causing "Bad client credentials" on every connect attempt. The public credential is now the default fallback (overridable via `IFLOW_OAUTH_CLIENT_SECRET` env var).
- MITM server not found (#335): `prepublish.mjs` now compiles `src/mitm/*.ts` to JavaScript using `tsc` before copying to the npm bundle. Previously only raw `.ts` files were copied, meaning `server.js` never existed in npm/Volta global installs.
- GeminiCLI missing projectId (#338): Instead of throwing a hard 500 error when `projectId` is missing from stored credentials (e.g. after Docker restart), OmniRoute now logs a warning and attempts the request, returning a meaningful provider-side error instead of an OmniRoute crash.
- Electron version mismatch (#323): Synced `electron/package.json` version to `2.3.13` (was `2.0.13`) so the desktop binary version matches the npm package.
- Kiro: `claude-sonnet-4`, `claude-opus-4.6`, `deepseek-v3.2`, `minimax-m2.1`, `qwen3-coder-next`, `auto`
- Codex: `gpt5.4`
- Tier Scoring (API + Validation): Added `tierPriority` (weight `0.05`) to the `ScoringWeights` Zod schema and the `combos/auto` API route, so the 7th scoring factor is now fully accepted by the REST API and validated on input. `stability` weight adjusted from `0.10` to `0.05` to keep total sum = `1.0`.
- Tiered Quota Scoring (Auto-Combo): Added `tierPriority` as a 7th scoring factor. Accounts with Ultra/Pro tiers are now preferred over Free tiers when other factors are equal. New optional fields `accountTier` and `quotaResetIntervalSecs` on `ProviderCandidate`. All 4 mode packs updated (`ship-fast`, `cost-saver`, `quality-first`, `offline-friendly`).
- Intra-Family Model Fallback (T5): When a model is unavailable (404/400/403), OmniRoute now automatically falls back to sibling models from the same family before returning an error (`modelFamilyFallback.ts`).
- Configurable API Bridge Timeout: `API_BRIDGE_PROXY_TIMEOUT_MS` env var lets operators tune the proxy timeout (default 30s). Fixes 504 errors on slow upstream responses. (#332)
- Star History: Replaced star-history.com widget with starchart.cc (`?variant=adaptive`) in all 30 READMEs. It adapts to light/dark theme and updates in real time.
- Auth (First-time password): `INITIAL_PASSWORD` env var is now accepted when setting the first dashboard password. Uses `timingSafeEqual` for constant-time comparison, preventing timing attacks. (#333)
- README Truncation: Fixed a missing `</details>` closing tag in the Troubleshooting section that caused GitHub to stop rendering everything below it (Tech Stack, Docs, Roadmap, Contributors).
- pnpm install: Removed redundant `@swc/helpers` override from `package.json` that conflicted with the direct dependency, causing `EOVERRIDE` errors on pnpm. Added `pnpm.onlyBuiltDependencies` config.
- CLI Path Injection (T12): Added `isSafePath()` validator in `cliRuntime.ts` to block path traversal and shell metacharacters in `CLI_*_BIN` env vars.
- CI: Regenerated `package-lock.json` after override removal to fix `npm ci` failures on GitHub Actions.
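
Node's `crypto.timingSafeEqual` throws when its two inputs differ in length, so a common pattern is to hash both values to a fixed size first. The sketch below shows that pattern; `safeEqual` is a hypothetical helper, and whether the project hashes or pads before comparing is not stated in the notes.

```typescript
import { createHash, timingSafeEqual } from "crypto";

// Compares two secrets without leaking, via timing, where they first differ.
// Hashing gives both sides the same length, which timingSafeEqual requires.
function safeEqual(provided: string, expected: string): boolean {
  const a = createHash("sha256").update(provided).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}
```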
- Response Format (T1): `response_format` (json_schema/json_object) now injected as a system prompt for Claude, enabling structured output compatibility.
- 429 Retry (T2): Intra-URL retry for 429 responses (2× attempts with 2s delay) before falling back to next URL.
- Gemini CLI Headers (T3): Added `User-Agent` and `X-Goog-Api-Client` fingerprint headers for Gemini CLI compatibility.
- Pricing Catalog (T9): Added `deepseek-3.1`, `deepseek-3.2`, and `qwen3-coder-next` pricing entries.
| File | Purpose |
|---|---|
| `open-sse/services/modelFamilyFallback.ts` | Model family definitions and intra-family fallback logic |

- KiloCode: kilocode healthcheck timeout already fixed in v2.3.11
- OpenCode: Add opencode to cliRuntime registry with 15s healthcheck timeout
- OpenClaw / Cursor: Increase healthcheck timeout to 15s for slow-start variants
- VPS: Install droid and openclaw npm packages; activate CLI_EXTRA_PATHS for kiro-cli
- cliRuntime: Add opencode tool registration and increase timeout for continue
- KiloCode healthcheck: Increase `healthcheckTimeoutMs` from 4000ms to 15000ms. kilocode renders an ASCII logo banner on startup, causing false `healthcheck_failed` on slow/cold-start environments
- Lint: Fix `check:any-budget:t11` failure by replacing `as any` with `as Record<string, unknown>` in OAuthModal.tsx (3 occurrences)
- CLI-TOOLS.md: Complete guide for all 11 CLI tools (claude, codex, gemini, opencode, cline, kilocode, continue, kiro-cli, cursor, droid, openclaw)
- i18n: CLI-TOOLS.md synced to 30 languages with translated title + intro
- /v1/completions: New legacy OpenAI completions endpoint that accepts both `prompt` string and `messages` array, and normalizes to chat format automatically
- EndpointPage: Now shows all 3 OpenAI-compatible endpoint types: Chat Completions, Responses API, and Legacy Completions
- i18n: Added `completionsLegacy`/`completionsLegacyDesc` to 30 language files
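
One way to normalize a legacy completions body into chat format is sketched below. `toChatBody` is a hypothetical helper; the choice to prefer `messages` over `prompt` when both are present, and the `user` role for the prompt, are assumptions not confirmed by the notes.

```typescript
interface ChatMessage {
  role: string;
  content: string;
}

// Accepts a body with either `prompt` or `messages` and returns a chat body.
function toChatBody(
  body: Record<string, unknown>,
): Record<string, unknown> & { messages: ChatMessage[] } {
  const { prompt, messages, ...rest } = body;
  // Assumed precedence: an explicit messages array wins over prompt.
  if (Array.isArray(messages) && messages.length > 0) {
    return { ...rest, messages: messages as ChatMessage[] };
  }
  if (typeof prompt === "string") {
    return { ...rest, messages: [{ role: "user", content: prompt }] };
  }
  throw new Error("Request needs either a prompt string or a messages array");
}
```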
- OAuthModal: Fix `[object Object]` displayed on all OAuth connection errors by properly extracting `.message` from error response objects in all 3 `throw new Error(data.error)` calls (exchange, device-code, authorize)
- Affects Cline, Codex, GitHub, Qwen, Kiro, and all other OAuth providers
- Cline OAuth: Add `decodeURIComponent` before base64 decode so URL-encoded auth codes from the callback URL are parsed correctly, fixing "invalid or expired authorization code" errors on remote (LAN IP) setups
- Cline OAuth: `mapTokens` now populates `name = firstName + lastName || email` so Cline accounts show real user names instead of "Account #ID"
- OAuth account names: All OAuth exchange flows (exchange, poll, poll-callback) now normalize `name = email` when name is missing, so every OAuth account shows its email as the display label in the Providers dashboard
- OAuth account names: Removed sequential "Account N" fallback in `db/providers.ts`. Accounts with no email/name now use a stable ID-based label via `getAccountDisplayName()` instead of a sequential number that changes when accounts are deleted
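
The Cline auth-code fix comes down to the order of two decoding steps: percent-decoding first, then base64. A minimal sketch follows, assuming a Node runtime where `Buffer` is available; `decodeAuthCode` is a hypothetical name and the payload format is illustrative.

```typescript
// Decodes an auth code taken from a callback URL query string.
// Base64 padding (=) and + arrive percent-encoded as %3D and %2B, so the
// URL layer must be undone before the base64 layer.
function decodeAuthCode(rawCode: string): string {
  const base64 = decodeURIComponent(rawCode);
  return Buffer.from(base64, "base64").toString("utf8");
}
```

A code that was never percent-encoded passes through `decodeURIComponent` unchanged, so the extra step is safe for local setups too.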
- Provider test batch: Fixed Zod schema to accept `providerId: null` (frontend sends null for non-provider modes); was incorrectly returning "Invalid request" for all batch tests
- Provider test modal: Fixed `[object Object]` display by normalizing API error objects to strings before rendering in `setTestResults` and `ProviderTestResultsView`
- i18n: Added missing keys `cliTools.toolDescriptions.opencode`, `cliTools.toolDescriptions.kiro`, `cliTools.guides.opencode`, `cliTools.guides.kiro` to `en.json`
- i18n: Synchronized 1111 missing keys across all 29 non-English language files using English values as fallbacks
- @swc/helpers: Added permanent `postinstall` fix to copy `@swc/helpers` into the standalone app's `node_modules`, which prevents MODULE_NOT_FOUND crash on global npm installs
- Multiple provider integrations and dashboard improvements