@sanggggg sanggggg commented Jan 2, 2026

Plan for implementing semantic search with local embeddings:

  • LanceDB for vector storage (embedded, serverless)
  • FastEmbed-rs for local embedding generation (ONNX-based)
  • BGESmallENV15 as the recommended embedding model
  • Hybrid search combining semantic search with the existing FTS5 full-text search
  • 5-phase implementation roadmap
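
The plan does not say how the semantic and FTS5 result lists are merged. As one possible approach, the sketch below uses reciprocal rank fusion (RRF), which only needs the rank of each id in each list. The function name `rrf_fuse`, the string ids, and the constant `k` are assumptions for illustration, not part of this PR.

```rust
use std::collections::HashMap;

/// Fuse two ranked id lists with reciprocal rank fusion (an assumed strategy).
/// Each id scores 1 / (k + rank) per list it appears in, with rank starting at 1.
pub fn rrf_fuse(semantic: &[&str], keyword: &[&str], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in [semantic, keyword] {
        for (i, id) in list.iter().enumerate() {
            let rank = (i as f64) + 1.0;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

An id that appears in both lists accumulates two contributions, so it tends to outrank ids found by only one of the two searches.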

claude added 6 commits January 2, 2026 09:30
- Add timestamp range filtering (started_at, created_at)
- Add provider, project, outcome filters for session search
- Add turn_type filter for turn search
- Document LanceDB scalar indexes for filter performance
- Add SQL filter syntax examples
- Update CLI commands with filter options

- Document LanceDB as fully embedded (no external server)
- Detail ONNX Runtime linking options (static/dynamic/download)
- Document model file strategies (runtime download/local/embedded)
- Recommend runtime download for balance of size vs convenience
- Add alternative compile-time embedding for offline deployments
- Update risks table with ONNX linking mitigation

- Add to_embedding_text() implementations for TurnSummary and SessionSummary
- Include all relevant fields: user_intent, assistant_action, summary, key_topics, decisions_made, code_concepts
- SessionSummary embedding includes ChatSession context (provider, project)
- Add TurnEmbedding and SessionEmbedding data structures
- Document complete indexing flow with change detection via SHA256 hash
- Add example output for both embedding text formats
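
To illustrate the to_embedding_text() idea from the notes above, here is a minimal sketch for a turn summary. The field names come from the list above, but the field types, the line labels ("Intent:", "Topics:", and so on), and the rule of skipping empty lists are assumptions; the real struct and output format in this PR may differ.

```rust
/// Hypothetical shape of a turn summary; the real struct may use other types.
pub struct TurnSummary {
    pub user_intent: String,
    pub assistant_action: String,
    pub summary: String,
    pub key_topics: Vec<String>,
    pub decisions_made: Vec<String>,
    pub code_concepts: Vec<String>,
}

impl TurnSummary {
    /// Flatten the summary into one labeled text block for the embedding model.
    pub fn to_embedding_text(&self) -> String {
        let mut lines = vec![
            format!("Intent: {}", self.user_intent),
            format!("Action: {}", self.assistant_action),
            format!("Summary: {}", self.summary),
        ];
        let lists = [
            ("Topics", &self.key_topics),
            ("Decisions", &self.decisions_made),
            ("Concepts", &self.code_concepts),
        ];
        // Empty list fields are skipped so they add no label-only lines.
        for (label, items) in lists {
            if !items.is_empty() {
                lines.push(format!("{}: {}", label, items.join(", ")));
            }
        }
        lines.join("\n")
    }
}
```

Keeping the labels in the text gives the embedding model some context about what each fragment means, at the cost of a few extra tokens per record.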
Implement Phase 1 of semantic search for turn and session summaries:

- Add embedding/ module with EmbeddingService wrapping FastEmbed-rs
  - Support for multiple models (BGESmallENV15, AllMiniLML6V2, etc.)
  - Configurable cache directory and download progress
  - Batch embedding generation for efficiency

- Add vector_store/ module with VectorStore wrapping LanceDB
  - Arrow schemas for turn and session embeddings
  - Upsert, search, and delete operations
  - Metadata filtering via SQL WHERE clauses (date ranges, types, providers)
  - Cosine similarity search with configurable limits
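
The metadata filtering described above boils down to composing a SQL WHERE string from optional criteria. The sketch below shows one way to do that in plain Rust. `SearchFilter`, its fields, and the choice of integer epoch timestamps are hypothetical; only the column names provider, turn_type, and created_at are taken from this PR.

```rust
/// Hypothetical filter options; every field is optional.
#[derive(Default)]
pub struct SearchFilter {
    pub provider: Option<String>,
    pub turn_type: Option<String>,
    pub created_after: Option<i64>,  // assumed: epoch seconds, inclusive
    pub created_before: Option<i64>, // assumed: epoch seconds, exclusive
}

/// Quote a string literal for SQL, doubling any embedded single quotes.
fn sql_quote(value: &str) -> String {
    format!("'{}'", value.replace('\'', "''"))
}

impl SearchFilter {
    /// Build the WHERE expression, or None when no criterion is set.
    pub fn to_where_clause(&self) -> Option<String> {
        let mut parts = Vec::new();
        if let Some(p) = &self.provider {
            parts.push(format!("provider = {}", sql_quote(p)));
        }
        if let Some(t) = &self.turn_type {
            parts.push(format!("turn_type = {}", sql_quote(t)));
        }
        if let Some(ts) = self.created_after {
            parts.push(format!("created_at >= {}", ts));
        }
        if let Some(ts) = self.created_before {
            parts.push(format!("created_at < {}", ts));
        }
        if parts.is_empty() { None } else { Some(parts.join(" AND ")) }
    }
}
```

The resulting string would then be handed to the vector store as its filter expression. Returning `None` for an empty filter lets the caller skip filtering entirely instead of sending a dummy condition.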

- Add to_embedding_text() and embedding_text_hash() to TurnSummary
  - Combines intent, action, summary, type, topics, decisions, concepts
  - SHA256 hash for change detection to skip unchanged re-embedding
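
The change-detection step can be sketched as follows: fingerprint the embedding text, compare it with the fingerprint stored at the last indexing run, and re-embed only on a mismatch. Note the swap: the commit uses SHA256 (via the sha2 crate), while this stdlib-only sketch uses `DefaultHasher` as a stand-in. Its output is not guaranteed stable across Rust releases, so it would not be suitable for persisted hashes. All names here are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in fingerprint (NOT SHA256): good enough to show the control flow.
pub fn text_fingerprint(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}

/// Return the ids whose text is new or changed, updating the stored fingerprints.
pub fn select_for_reembedding(
    stored: &mut HashMap<String, u64>,
    items: &[(&str, &str)], // (id, embedding text)
) -> Vec<String> {
    let mut changed = Vec::new();
    for (id, text) in items {
        let fingerprint = text_fingerprint(text);
        // Unchanged text yields the same fingerprint and is skipped.
        if stored.get(*id) != Some(&fingerprint) {
            stored.insert((*id).to_string(), fingerprint);
            changed.push((*id).to_string());
        }
    }
    changed
}
```

Only the ids returned here would need to go through the embedding model again, which is where most of the indexing cost lies.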

- Add to_embedding_text() and embedding_text_hash() to SessionSummary
  - Combines title, summary, goal, outcome, provider, project, entities
  - Context-aware hash including provider and project

- Feature-gate all semantic search code behind "semantic-search" feature
- Add workspace dependencies: lancedb, arrow, fastembed, sha2
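
For readers unfamiliar with the cosine similarity search with configurable limits listed under vector_store/ above, the following brute-force sketch shows the scoring the vector store performs conceptually. It is not how LanceDB executes a query, and the names `cosine_similarity` and `top_k` are illustrative only.

```rust
/// Cosine similarity of two vectors; returns 0.0 if either has zero length.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Score every stored row against the query and keep the `limit` best matches.
pub fn top_k<'a>(
    query: &[f32],
    rows: &[(&'a str, Vec<f32>)],
    limit: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&'a str, f32)> = rows
        .iter()
        .map(|(id, vector)| (*id, cosine_similarity(query, vector)))
        .collect();
    // Most similar first.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    scored
}
```

A linear scan like this is fine for small collections and is useful as a reference when checking the results returned by the real index.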
