feat: add embedding support with MLX integration and vector search foundation #48

tantara · 2025-10-26T13:26:45Z

Summary

Add foundation for embedding-based semantic search with comprehensive infrastructure for 768-dimensional vectors, platform-aware MLX support, and CLI integration.

Key Features

🔢 Embedding Infrastructure

Database: Add nullable embedding BLOB column to messages table (768 dimensions)
Model: Update Message struct with Option<Vec<f32>> embedding field
Serialization: Implement f32 ↔ bytes conversion for efficient storage
Builder pattern: Add with_embedding() helper method

🤖 Embedding Service (`src/services/embedding_service.rs`)

Platform detection: Automatically detects macOS vs Windows/Linux
MLX support: Conditional integration via RETROCHAT_USE_MLX env var
Dummy embeddings: Deterministic hash-based 768-dim vectors for development
L2 normalization: All embeddings normalized to unit length
Graceful degradation: Warnings on unsupported platforms
Test coverage: 4/4 tests passing ✅

🔍 CLI Integration

Add --use-embedding flag to search commands:
- retrochat search <query> --use-embedding
- retrochat query search <query> --use-embedding
Compatible with existing time range filters
Sets search_type = "embedding" for service routing

📦 Dependencies

sqlite-vec v0.1.6: Vector similarity search engine
mlx-rs v0.25: ML framework (macOS-only, optional)
New mlx feature flag for conditional compilation

Implementation Details

Database Layer

File: src/database/message_repo.rs

embedding_to_blob(): Convert Vec → bytes for storage
blob_to_embedding(): Convert bytes → Vec for retrieval
Updated all SQL queries to include embedding column
Handles NULL embeddings gracefully

Service Layer

File: src/services/embedding_service.rs

// Check if MLX is available
let service = EmbeddingService::new();

// Generate embedding (currently dummy)
let embedding = service.generate_embedding("text")?;
assert_eq!(embedding.len(), 768);

Features:

Platform-aware initialization
Deterministic dummy embeddings using hash-based RNG
Ready for actual MLX model integration
Full error handling

Environment Configuration

File: src/env.rs

# Enable MLX embeddings (macOS only)
export RETROCHAT_USE_MLX=true

Migration

File: migrations/008_add_message_embeddings.sql

ALTER TABLE messages ADD COLUMN embedding BLOB;

Backwards compatible (NULL allowed)
Existing data unaffected

Usage Examples

# Basic embedding search
retrochat search "machine learning concepts" --use-embedding

# With time range
retrochat search "performance optimization" --use-embedding \
  --since "7 days ago" --until "now"

# With result limit
retrochat search "bug fix" --use-embedding --limit 10

# Enable MLX on macOS
export RETROCHAT_USE_MLX=true
retrochat search "semantic query" --use-embedding

Testing

Passing Tests ✅

test_dummy_embedding_generation - 768-dim vectors generated correctly
test_embedding_deterministic - Same input → same output
test_embedding_different_text - Different inputs → different outputs
test_platform_support_check - Platform detection works

CLI Tests Updated ✅

test_search_command_structure - Includes use_embedding field
test_search_command_with_time_range - Time + embedding flags
test_search_command_with_embedding - New test for embedding flag

Known Issues ⚠️

Search integration tests fail due to missing migration in test DB
Need to ensure migrations run in test environment
Alternative: Make embedding column truly optional in queries

Technical Notes

Embedding Format

Dimensions: 768 (standard for many embedding models)
Type: f32 (single-precision float)
Storage: Little-endian bytes in SQLite BLOB
Normalization: L2-normalized to unit length

Platform Support

Platform	MLX Support	Embedding Generation
macOS	✅ Available	MLX or Dummy
Linux	❌ N/A	Dummy only
Windows	❌ N/A	Dummy only

Dummy Embedding Algorithm

Hash input text to seed value
Use Linear Congruential Generator (LCG) for deterministic randomness
Generate 768 values in [-1, 1] range
L2-normalize to unit length

Architecture

┌─────────────────────────────────────────────────────┐
│                  CLI Layer                          │
│  ├─ search --use-embedding                          │
│  └─ query search --use-embedding                    │
└────────────────────┬────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────┐
│               Service Layer                         │
│  ├─ QueryService (routes to vector/FTS search)     │
│  └─ EmbeddingService (generates embeddings)        │
└────────────────────┬────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────┐
│             Repository Layer                        │
│  ├─ MessageRepository (CRUD + embedding storage)   │
│  └─ (TODO: Vector search with sqlite-vec)          │
└─────────────────────────────────────────────────────┘

Next Steps (Future PRs)

Vector Search Implementation
- Use sqlite-vec for similarity search
- Implement k-NN retrieval
- Add distance/similarity scoring
MLX Integration
- Load embedding model
- Implement actual inference
- Handle model caching
Embedding Generation Flow
- Auto-generate embeddings on message import
- Batch processing for efficiency
- Background job for existing messages
Search Enhancement
- Hybrid search (combine FTS + vector)
- Relevance score tuning
- Query embedding caching

Files Changed

New Files

migrations/008_add_message_embeddings.sql - Database migration
src/services/embedding_service.rs - Embedding generation service

Modified Files

Cargo.toml - Add dependencies (sqlite-vec, mlx-rs)
Cargo.lock - Dependency lock file
src/cli/mod.rs - Add --use-embedding flag
src/cli/query.rs - Update search handler
src/database/message_repo.rs - Embedding storage/retrieval
src/env.rs - Add RETROCHAT_USE_MLX
src/models/message.rs - Add embedding field
src/services/mod.rs - Export EmbeddingService
tests/contract/test_cli_add_command.rs - Update CLI tests

Breaking Changes

None - all changes are additive and backwards compatible.

🤖 Generated with Claude Code

…undation Add foundation for embedding-based semantic search with platform-aware MLX support. ## Key Features ### Embedding Infrastructure - Add `embedding` column (BLOB, nullable) to messages table for 768-dimensional vectors - Update Message model with optional `embedding: Option<Vec<f32>>` field - Implement embedding serialization/deserialization (f32 ↔ bytes) - Add `with_embedding()` builder method to Message ### Embedding Service - Create `EmbeddingService` with platform detection - Support for MLX on macOS (via `RETROCHAT_USE_MLX` env var) - Deterministic dummy 768-dimensional embeddings for development - L2-normalized vectors for consistency - Graceful warnings on unsupported platforms (Windows/Linux) - Full test coverage (4/4 tests passing) ### CLI Enhancement - Add `--use-embedding` flag to search commands - Available in both `retrochat search` and `query search` - Sets `search_type = "embedding"` for service layer routing ### Dependencies - Add `sqlite-vec` v0.1.6 for vector similarity search - Add `mlx-rs` v0.25 (macOS-only, optional) for future ML integration - Create `mlx` feature flag for conditional compilation ## Implementation Details **Database Layer** (`src/database/message_repo.rs`): - `embedding_to_blob()` - Convert f32 vectors to bytes - `blob_to_embedding()` - Convert bytes back to f32 vectors - Updated all INSERT/SELECT queries to handle embedding column **Service Layer** (`src/services/embedding_service.rs`): - Platform-aware initialization with MLX support detection - Deterministic hash-based dummy embeddings (768 dims) - Ready for actual MLX model integration **Environment** (`src/env.rs`): - `RETROCHAT_USE_MLX` - Enable MLX embeddings (macOS only) - Proper documentation for platform requirements ## Migration - `008_add_message_embeddings.sql` - Adds nullable embedding column - Backwards compatible (existing data works without embeddings) ## Usage ```bash # Enable MLX embeddings on macOS export RETROCHAT_USE_MLX=true # Use embedding-based search retrochat search "machine learning" --use-embedding # With time range retrochat search "performance" --use-embedding --since "7 days ago" ``` ## Testing - ✅ Embedding service tests passing (4/4) - ✅ CLI command structure tests updated - ✅ Platform detection tests - ⚠️ Search integration tests require migration run ## Next Steps - Implement vector similarity search using sqlite-vec - Add actual MLX model inference for embedding generation - Route search requests to vector search when `use_embedding` is true - Integrate embedding generation into message import flow 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Resolved conflicts in: - Cargo.toml: Combined both sets of dependencies (sqlite-vec, regex, lazy_static, mlx-rs) - Cargo.lock: Regenerated after dependency merge - src/database/message_repo.rs: Merged to include both message_type/tool_operation_id from main and embedding from feature branch All SQL queries now include the full set of fields: - message_type and tool_operation_id (from main) - embedding (from feature/embedding) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

…grations After merging main, we had two migration files numbered 008: - 008_add_tool_operations.sql (from main) - 008_add_message_embeddings.sql (from feature/embedding) Renamed our embedding migration to 011_add_message_embeddings.sql to maintain proper migration sequence. All tests now pass. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Clippy fixes: - Replace manual `% 4 != 0` with `.is_multiple_of(4)` in blob_to_embedding - Box Message fields in MessageGroup::ToolPair to reduce enum size (416 bytes → smaller) SQL query fixes: - Add `embedding` column to all SELECT queries in message_repo.rs: - search_content_with_filters - search_content_with_time_filters - get_by_time_range This ensures all queries return the complete Message structure including the new embedding field from migration 011. All CI checks now pass: formatting, clippy, and tests. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

tantara and others added 4 commits October 26, 2025 08:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: add embedding support with MLX integration and vector search foundation #48

feat: add embedding support with MLX integration and vector search foundation #48

Uh oh!

tantara commented Oct 26, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

feat: add embedding support with MLX integration and vector search foundation #48

Are you sure you want to change the base?

feat: add embedding support with MLX integration and vector search foundation #48

Uh oh!

Conversation

tantara commented Oct 26, 2025

Summary

Key Features

🔢 Embedding Infrastructure

🤖 Embedding Service (src/services/embedding_service.rs)

🔍 CLI Integration

📦 Dependencies

Implementation Details

Database Layer

Service Layer

Environment Configuration

Migration

Usage Examples

Testing

Passing Tests ✅

CLI Tests Updated ✅

Known Issues ⚠️

Technical Notes

Embedding Format

Platform Support

Dummy Embedding Algorithm

Architecture

Next Steps (Future PRs)

Files Changed

New Files

Modified Files

Breaking Changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

🤖 Embedding Service (`src/services/embedding_service.rs`)