Problem
Every Qwen chat/voice utterance runs apply_reasoning_mode before the LLM sees messages. On main, explicit_reasoning_mode, is_simple_request, and looks_like_deep_reasoning_request each call to_lowercase() independently — up to three full-string allocations on the same user text.
Proposed fix
Behavior-preserving optimization in crates/genie-core/src/reasoning.rs:
- Single shared to_lowercase() in the Qwen path.
- Conservative needs_simple_request_scan / needs_deep_reasoning_scan early-outs before expensive marker loops.
- Corpus regression tests + optional release bench.
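
A minimal sketch of the two ideas above, not the actual code in reasoning.rs. The marker lists, the Mode enum, and the classify and needs_scan helpers are hypothetical stand-ins, and the length check is only one possible conservative early-out (the real needs_simple_request_scan / needs_deep_reasoning_scan may use a different test). The point is that the text is lowercased once and that the pre-check may only skip a scan that could not have matched.

```rust
// Hypothetical marker lists; the real ones are not shown in this issue.
const SIMPLE_MARKERS: &[&str] = &["what time", "turn on", "turn off"];
const DEEP_MARKERS: &[&str] = &["step by step", "prove", "analyze"];

// Hypothetical stand-in for the decision the reasoning module produces.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Mode {
    Simple,
    Deep,
    Standard,
}

/// Conservative pre-check: a text shorter than the shortest marker cannot
/// contain any marker, so the scan can be skipped without changing the result.
/// It may return true when nothing matches, never false when something would.
fn needs_scan(lower: &str, markers: &[&str]) -> bool {
    let shortest = markers.iter().map(|m| m.len()).min().unwrap_or(usize::MAX);
    lower.len() >= shortest
}

/// The marker loop proper: substring search over every marker.
fn contains_any(lower: &str, markers: &[&str]) -> bool {
    markers.iter().any(|m| lower.contains(*m))
}

/// Lowercases the user text once and shares the result across all checks.
fn classify(user_text: &str) -> Mode {
    let lower = user_text.to_lowercase(); // the only allocation on this path
    if needs_scan(&lower, DEEP_MARKERS) && contains_any(&lower, DEEP_MARKERS) {
        return Mode::Deep;
    }
    if needs_scan(&lower, SIMPLE_MARKERS) && contains_any(&lower, SIMPLE_MARKERS) {
        return Mode::Simple;
    }
    Mode::Standard
}
```

Because the early-out only rejects inputs that no marker could match, removing it would leave every classification unchanged, which is the property the regression corpus is meant to confirm.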
Scope
Pre-LLM performance bucket under #402. Does not touch tools/quick.rs (avoids collision with open quick-router PRs). Complements #501 (broader pre-LLM bundle) and merged voice-intent work.
Acceptance
- ReasoningDecision + adjusted message content byte-identical to main for a regression corpus.
- cargo test -p genie-core + cargo clippy + cargo fmt clean.
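
One way the byte-identical criterion could be checked is sketched below. Everything here is an assumption for illustration: Decision stands in for ReasoningDecision plus the adjusted content, and baseline_decide / candidate_decide stand in for the behavior on main and on the optimized branch. Only the first_mismatch helper carries the idea: run both paths over the same corpus and report the first input where they differ.

```rust
// Hypothetical stand-in for ReasoningDecision plus the adjusted message content.
#[derive(Debug, PartialEq, Eq)]
struct Decision {
    deep: bool,
    content: String,
}

// Hypothetical marker list, for illustration only.
const MARKERS: &[&str] = &["step by step", "prove"];

// Stand-in for main: lowercases again for every marker check.
fn baseline_decide(text: &str) -> Decision {
    let deep = MARKERS.iter().any(|m| text.to_lowercase().contains(*m));
    Decision { deep, content: text.to_string() }
}

// Stand-in for the optimized path: one lowercase plus a length early-out.
fn candidate_decide(text: &str) -> Decision {
    let lower = text.to_lowercase();
    let shortest = MARKERS.iter().map(|m| m.len()).min().unwrap_or(usize::MAX);
    let deep = lower.len() >= shortest && MARKERS.iter().any(|m| lower.contains(*m));
    Decision { deep, content: text.to_string() }
}

/// Returns the index of the first corpus entry on which the two paths
/// disagree, or None when every entry produces an equal result.
fn first_mismatch<D: PartialEq>(
    corpus: &[&str],
    baseline: impl Fn(&str) -> D,
    candidate: impl Fn(&str) -> D,
) -> Option<usize> {
    corpus.iter().position(|&text| baseline(text) != candidate(text))
}
```

Comparing whole Decision values with == covers both the decision and the content bytes, since String equality is byte-wise.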