feat(sse): add adaptive volume/complexity detector for routing strategy override#794
Conversation
…gy override Add volumeDetector module that analyzes request characteristics (batch size, token count, tool usage, browser signals, complexity keywords) and recommends routing strategy overrides. Rules: - Batch >= 50 items → round-robin with economy models - Critical complexity (many tools, browser, deploy) → priority premium-first - Browser/UI interaction → force premium priority - Short requests (<200 tokens) → flag for economy tier Closes diegosouzapw#789
|
|
||
| // Estimate tokens from serialized message size | ||
| const serialized = JSON.stringify(messages); | ||
| const estimatedTokens = Math.ceil(serialized.length / 4); // rough: 4 chars ≈ 1 token |
There was a problem hiding this comment.
WARNING: Token estimation is overly simplistic. Using serialized.length / 4 is a rough approximation that can be significantly off for:
- Code content (tokens are often larger per character)
- Non-English text
- Special characters and formatting
This could lead to incorrect routing decisions for requests that fall near complexity thresholds. Consider using a proper tokenizer library or at least documented worst-case assumptions.
| const hasBrowser = BROWSER_KEYWORDS.some((kw) => lowerSerialized.includes(kw)); | ||
|
|
||
| // Detect image content | ||
| const hasImages = |
There was a problem hiding this comment.
WARNING: Image detection via string matching can produce false positives:
- "base64" will match any message containing the word "base64" (e.g., "The system uses base64 encoding")
- "screenshot" will match text discussing screenshots without actual image data
- "image_url" could match documentation or variable names
This may cause hasImages to be true when no actual image content is present, potentially routing to premium models unnecessarily.
| * @returns Extracted signals | ||
| */ | ||
| export function detectVolumeSignals(body: Record<string, unknown>): VolumeSignals { | ||
| const messages = (body.messages || body.input || []) as unknown[]; |
There was a problem hiding this comment.
SUGGESTION: Type safety issue - body.messages is cast to unknown[] without validation. If the API passes a string (e.g., from some providers that use a string for single messages), the subsequent array operations on messages will fail. Consider adding runtime type validation.
There was a problem hiding this comment.
Code Review
This pull request introduces a volume and complexity detector for adaptive routing, which analyzes request characteristics like batch size, token estimates, and tool usage to recommend strategy overrides. The feedback suggests tightening the test assertions to ensure specific complexity levels are correctly identified and improving the consistency of the override signaling logic to ensure that flags like economy or premium preferences are always applied correctly even when the strategy name doesn't change.
| assert.ok( | ||
| signals.complexity === "critical" || signals.complexity === "high", | ||
| `expected critical or high, got ${signals.complexity}` | ||
| ); |
There was a problem hiding this comment.
The assertion here is too lenient. Given the input Refactor the authentication module for production, the HIGH_COMPLEXITY_KEYWORDS like auth (from authentication) and production will be matched. According to the logic in detectVolumeSignals, this should always result in a critical complexity.
The current check for either critical or high could mask future regressions. The test should be more specific to enforce the expected behavior.
assert.equal(signals.complexity, "critical", `expected critical, got ${signals.complexity}`);| // Rule 2: Medium batch with low complexity → cost-optimized | ||
| if (signals.batchSize >= 10 && signals.complexity === "low") { | ||
| return { | ||
| shouldOverride: currentStrategy !== "cost-optimized", |
There was a problem hiding this comment.
The logic for shouldOverride here and in Rule 4 (line 192) is inconsistent with Rule 1 and Rule 3. Rules 1 and 3 always set shouldOverride: true, creating a clear signal to the caller to adopt the new configuration. This rule, however, sets shouldOverride to false if the strategy name already matches.
This can lead to bugs. For example, if the current strategy is cost-optimized but doesn't have preferEconomy: true, this rule will return shouldOverride: false, and the caller might miss applying the preferEconomy flag.
For a cleaner and more robust API contract, I recommend setting shouldOverride: true whenever a rule match recommends a specific configuration. This simplifies the consumer's logic: if shouldOverride is true, apply the new strategy and its associated flags.
| shouldOverride: currentStrategy !== "cost-optimized", | |
| shouldOverride: true, |
| // Rule 4: Browser/UI interaction → force priority with premium | ||
| if (signals.hasBrowser) { | ||
| return { | ||
| shouldOverride: currentStrategy !== "priority", |
There was a problem hiding this comment.
Similar to the comment on Rule 2 (line 170), setting shouldOverride: true here would make the behavior consistent with Rule 3 (critical complexity), which also forces a priority override. This ensures that important flags like forcePremium: true are always applied by the caller when this rule is triggered, creating a more robust and predictable API.
| shouldOverride: currentStrategy !== "priority", | |
| shouldOverride: true, |
Code Review SummaryStatus: 3 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)WARNING
Files Reviewed (2 files)
Positive observations:
Note: The test file could benefit from edge case tests (e.g., empty arrays, null values in body) to ensure robustness, but this is not blocking. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a427d215e3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const lowerSerialized = serialized.toLowerCase(); | ||
| const hasBrowser = BROWSER_KEYWORDS.some((kw) => lowerSerialized.includes(kw)); |
There was a problem hiding this comment.
Include tools when deriving browser interaction signals
hasBrowser is computed only from messages text, so requests that express browser work via structured tools (e.g. computer_use) but neutral prompt text will be misclassified as non-browser. In that case a single-tool request lands at medium complexity and recommendStrategyOverride will not force priority/forcePremium, which is a routing regression for browser-automation workloads this detector is meant to catch.
Useful? React with 👍 / 👎.
| const hasHighKeywords = HIGH_COMPLEXITY_KEYWORDS.some((kw) => lowerSerialized.includes(kw)); | ||
| let complexity: VolumeSignals["complexity"]; | ||
|
|
||
| if (toolCount > 3 || (hasBrowser && toolCount > 1) || hasHighKeywords) { | ||
| complexity = "critical"; |
There was a problem hiding this comment.
Match complexity keywords on word boundaries
High-complexity detection uses raw substring checks, and the keyword list contains "auth"; this makes benign prompts containing words like "author" evaluate as high-risk. That path sets complexity to critical, which then forces premium-first routing in Rule 3 and can unnecessarily raise cost/latency for ordinary requests. Use token/word-boundary matching (or less ambiguous keywords) to avoid these false positives.
Useful? React with 👍 / 👎.
|
Olá, Professor Igor! Fantástica contribuição introduzindo o volume rounter! 📈⚙️ O roteamento adaptativo e ciente do conteúdo das requisições é um dos grandes diferenciais do OmniRoute. A forma como seu código analisa o "batchSize", detecção de browser flags e volume de texto para tomar decisões como "pular ferramentas não essenciais" ou "usar um serviço com melhor custo-benefício" é extremamente elegante! E parabéns pelos testes minuciosos no arquivo de teste incluso. Existem no entanto pontos obrigatórios para ajustes e integração antes de conseguirmos consolidá-la na branch O Fio Desconectado (Falta a Integração Final)Seu código construiu uma fechadura de ponta, porém ele esqueceu de ser colocado na porta! O módulo Sua missão para concluir este PR:
Estamos super no aguardo dessa correção pois isso cortará drasticamente os custos da nuvem em requisições de batch e fará o roteamento parecer magica! Muito bom trabalho e foco nessa próxima etapa! 🚀 |
Summary
volumeDetector.tsmodule toopen-sse/services/Motivation
Current combo strategies are static — a batch of 500 classification items and a single critical deployment both use the same routing strategy. This module analyzes the request and recommends strategy overrides:
Integration
Call from
combo.tsbefore selecting a model:Test plan
Relates to #789
🤖 Generated with Claude Code