feat(retrieval): add LLM query expansion with multi-query fusion #159

Open
Salomondiei08 wants to merge 1 commit into EverMind-AI:main from Salomondiei08:feature/query-expansion

@Salomondiei08

Checklist

  • My code follows the project's code style guidelines
  • I have performed a self-review of my code
  • I have commented my code where necessary, particularly in complex areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings or errors
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have used Gitmoji in my commit messages
  • Any dependent changes have been merged and published

Additional Notes

  • _search_hybrid is unchanged and still used by the AGENTIC and RRF retrieval paths; only the HYBRID path is affected
  • Query expansion adds one LLM call per retrieval request (~200ms overhead). If the LLM is unavailable, retrieval falls back to plain hybrid search, so existing behavior is preserved
  • Recall improvement is measurable via Recall@K with single-query vs expanded-query baselines
  • Expansion temperature is set to 0.6 to encourage vocabulary diversity across variants
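
The Recall@K comparison mentioned above could be set up roughly as follows. This is a minimal sketch, not code from this PR: the `recall_at_k` helper, the memory ids, and the ranked lists are all hypothetical and exist only to show the shape of the measurement.

```python
from typing import Iterable, Sequence


def recall_at_k(retrieved_ids: Sequence[str], relevant_ids: Iterable[str], k: int) -> float:
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0  # nothing to find, so report zero rather than divide by zero
    top_k = set(retrieved_ids[:k])
    return len(relevant & top_k) / len(relevant)


# Hypothetical data: ground-truth relevant memories and the ranked ids
# returned for the same query with and without expansion.
relevant = {"m1", "m4"}
single_query_hits = ["m2", "m1", "m3", "m5"]
expanded_query_hits = ["m1", "m4", "m2", "m3"]

baseline = recall_at_k(single_query_hits, relevant, k=3)
expanded = recall_at_k(expanded_query_hits, relevant, k=3)
print(f"Recall@3 single-query={baseline:.2f} expanded={expanded:.2f}")
```

Averaging both numbers over a labeled query set, at the same K, gives the single-query vs expanded-query baseline comparison.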

Breaking Changes

None. _search_hybrid is preserved as-is. The new routing in retrieve_mem_hybrid is fully backward compatible and falls back to the original behavior on any failure.

Addresses the vocabulary-mismatch failure mode in hybrid retrieval, where stored
memories use different terms than the query (e.g. a memory stored as
'rescue inhaler protocol' vs a query for 'gym bag').

Changes:
- src/memory_layer/query_expansion.py: new module implementing expand_query()
  which generates 2-3 LLM paraphrase variants of a query using a structured
  prompt. Falls back silently (returns []) on any LLM failure.
  Also provides merge_hits_by_id() for union-dedup of per-query result sets.

- src/agentic_layer/memory_manager.py: retrieve_mem_hybrid() now routes
  through _search_hybrid_with_query_expansion() instead of _search_hybrid().
  This runs keyword+vector search in parallel for the original query and each
  variant, merges results by memory id (original-query scores preserved),
  then reranks the union against the original query for consistent scoring.
  Falls back to plain hybrid search when expansion produces no variants.
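
The flow described in the two changes above can be sketched as follows. This is an illustrative stand-in, not the code in this PR: the signatures of `expand_query` and `merge_hits_by_id`, the hit shape (`{"id", "score"}`), and the `llm`, `search`, and `rerank` callables are all assumptions made for the example.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Hit = Dict[str, object]  # assumed hit shape: {"id": str, "score": float}


async def expand_query(query: str, llm: Callable[[str], Awaitable[List[str]]]) -> List[str]:
    """Return up to 3 paraphrase variants, or [] on any LLM failure."""
    try:
        variants = await llm(query)
    except Exception:
        return []  # silent fallback: the caller treats [] as "no expansion"
    seen = {query.strip().lower()}
    unique: List[str] = []
    for v in variants:
        key = v.strip().lower()
        if key and key not in seen:  # drop blanks and repeats of the original
            seen.add(key)
            unique.append(v.strip())
    return unique[:3]


def merge_hits_by_id(result_sets: List[List[Hit]]) -> List[Hit]:
    """Union of hits keyed by id; the first occurrence of an id wins."""
    merged: Dict[str, Hit] = {}
    for hits in result_sets:
        for hit in hits:
            merged.setdefault(str(hit["id"]), hit)
    return list(merged.values())


async def search_with_expansion(query, llm, search, rerank) -> List[Hit]:
    variants = await expand_query(query, llm)
    if not variants:
        return await search(query)  # plain hybrid search, unchanged behavior
    # Original query goes first so its scores survive the merge.
    result_sets = await asyncio.gather(*(search(q) for q in [query, *variants]))
    union = merge_hits_by_id(list(result_sets))
    return await rerank(query, union)  # rerank against the original query only


async def _demo():
    # Toy stand-ins for the LLM, the hybrid search, and the reranker.
    index = {
        "gym bag": [{"id": "a", "score": 0.9}],
        "sports bag contents": [{"id": "a", "score": 0.4}, {"id": "b", "score": 0.7}],
    }

    async def fake_llm(q):
        return ["sports bag contents"]

    async def failing_llm(q):
        raise RuntimeError("LLM unavailable")

    async def fake_search(q):
        return index.get(q, [])

    async def fake_rerank(q, hits):
        # A real reranker would rescore; this one only sorts by stored score.
        return sorted(hits, key=lambda h: h["score"], reverse=True)

    expanded = await search_with_expansion("gym bag", fake_llm, fake_search, fake_rerank)
    fallback = await search_with_expansion("gym bag", failing_llm, fake_search, fake_rerank)
    return expanded, fallback


expanded_hits, fallback_hits = asyncio.run(_demo())
```

In this sketch, putting the original query's results first and keeping the first occurrence of each id is what preserves the original-query scores; an empty variant list is the single signal that triggers the fallback.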

Inspired by RAG-Fusion / HyDE.