@tantara tantara commented Oct 26, 2025

Summary

Add foundation for embedding-based semantic search with comprehensive infrastructure for 768-dimensional vectors, platform-aware MLX support, and CLI integration.

Key Features

🔢 Embedding Infrastructure

  • Database: Add nullable embedding BLOB column to messages table (768 dimensions)
  • Model: Update Message struct with Option<Vec<f32>> embedding field
  • Serialization: Implement f32 ↔ bytes conversion for efficient storage
  • Builder pattern: Add with_embedding() helper method
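A minimal sketch of how the optional field and the builder helper could fit together. The `Message` struct here is a hypothetical, trimmed-down stand-in (the real model has more fields), and the `new()` constructor is an assumption; only the `embedding: Option<Vec<f32>>` field and the `with_embedding()` name come from this PR.

```rust
/// Hypothetical, trimmed-down stand-in for the real `Message` model.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub content: String,
    /// `None` until an embedding has been generated for this message.
    pub embedding: Option<Vec<f32>>,
}

impl Message {
    /// Assumed constructor: messages start without an embedding.
    pub fn new(content: impl Into<String>) -> Self {
        Self {
            content: content.into(),
            embedding: None,
        }
    }

    /// Builder-style helper: consumes the message and returns it with the
    /// embedding attached.
    pub fn with_embedding(mut self, embedding: Vec<f32>) -> Self {
        self.embedding = Some(embedding);
        self
    }
}
```

Keeping the field an `Option` mirrors the nullable column, so rows written before the migration map to `None` rather than to an empty vector.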

🤖 Embedding Service (src/services/embedding_service.rs)

  • Platform detection: Automatically detects macOS vs Windows/Linux
  • MLX support: Conditional integration via RETROCHAT_USE_MLX env var
  • Dummy embeddings: Deterministic hash-based 768-dim vectors for development
  • L2 normalization: All embeddings normalized to unit length
  • Graceful degradation: Warnings on unsupported platforms
  • Test coverage: 4/4 tests passing ✅

🔍 CLI Integration

  • Add --use-embedding flag to search commands:
    • retrochat search <query> --use-embedding
    • retrochat query search <query> --use-embedding
  • Compatible with existing time range filters
  • Sets search_type = "embedding" for service routing
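As a rough illustration of the routing described above, the flag could be mapped to a `search_type` string and then to a search back end. The `SearchBackend` enum, both function names, and the `"fts"` default value are assumptions made for this sketch; the PR itself only names the `"embedding"` value.

```rust
/// Hypothetical search back ends; the PR only names the "embedding" search type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SearchBackend {
    FullText,
    Embedding,
}

/// Maps the CLI flag to the `search_type` string handed to the service layer.
/// "fts" is an assumed name for the default; only "embedding" appears in the PR.
pub fn search_type_for(use_embedding: bool) -> &'static str {
    if use_embedding {
        "embedding"
    } else {
        "fts"
    }
}

/// Picks a back end from `search_type`, treating anything unrecognized as
/// full-text search so existing callers keep their current behavior.
pub fn route(search_type: &str) -> SearchBackend {
    match search_type {
        "embedding" => SearchBackend::Embedding,
        _ => SearchBackend::FullText,
    }
}
```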

📦 Dependencies

  • sqlite-vec v0.1.6: Vector similarity search engine
  • mlx-rs v0.25: ML framework (macOS-only, optional)
  • New mlx feature flag for conditional compilation

Implementation Details

Database Layer

File: src/database/message_repo.rs

  • embedding_to_blob(): Convert Vec<f32> → bytes for storage
  • blob_to_embedding(): Convert bytes → Vec<f32> for retrieval
  • Updated all SQL queries to include embedding column
  • Handles NULL embeddings gracefully
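The two conversion helpers can be sketched as follows, assuming the little-endian layout described under Technical Notes. The signatures are assumptions (the real functions may return a `Result` or take different argument types), and this version uses a plain `% 4` check rather than `is_multiple_of` so it compiles on older toolchains.

```rust
/// Serialize an embedding as consecutive little-endian f32 values (4 bytes each).
pub fn embedding_to_blob(embedding: &[f32]) -> Vec<u8> {
    embedding.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Deserialize a BLOB back into f32 values.
/// Returns `None` when the byte length is not a multiple of 4, since such a
/// blob cannot be a whole number of f32 values.
pub fn blob_to_embedding(blob: &[u8]) -> Option<Vec<f32>> {
    if blob.len() % 4 != 0 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

With this layout a 768-dimensional embedding occupies 768 × 4 = 3072 bytes per row. A NULL column would simply skip the conversion and leave the model's field as `None`.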

Service Layer

File: src/services/embedding_service.rs

// Create the service (detects the platform and whether MLX can be used)
let service = EmbeddingService::new();

// Generate embedding (currently dummy)
let embedding = service.generate_embedding("text")?;
assert_eq!(embedding.len(), 768);

Features:

  • Platform-aware initialization
  • Deterministic dummy embeddings using hash-based RNG
  • Ready for actual MLX model integration
  • Full error handling

Environment Configuration

File: src/env.rs

# Enable MLX embeddings (macOS only)
export RETROCHAT_USE_MLX=true
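One way the environment flag and the platform check could combine is sketched below. All function names are hypothetical, and the accepted flag values (`true` or `1`, case-insensitive) are an assumption; the PR only shows `RETROCHAT_USE_MLX=true`.

```rust
use std::env;

/// Returns true for an affirmative flag value ("true" or "1"; parsing rules assumed).
pub fn parse_flag(value: Option<&str>) -> bool {
    matches!(
        value.map(|v| v.trim().to_ascii_lowercase()).as_deref(),
        Some("true") | Some("1")
    )
}

/// MLX targets Apple platforms, so only macOS builds report support.
pub fn platform_supports_mlx() -> bool {
    cfg!(target_os = "macos")
}

/// Decides whether to use MLX, warning (instead of failing) when the flag is
/// set on a platform that cannot honor it.
pub fn should_use_mlx(flag_requested: bool, platform_ok: bool) -> bool {
    if flag_requested && !platform_ok {
        eprintln!(
            "warning: RETROCHAT_USE_MLX is set, but MLX is only available on macOS; \
             falling back to dummy embeddings"
        );
    }
    flag_requested && platform_ok
}

/// Reads the environment and applies the decision above.
pub fn mlx_enabled_from_env() -> bool {
    let raw = env::var("RETROCHAT_USE_MLX").ok();
    should_use_mlx(parse_flag(raw.as_deref()), platform_supports_mlx())
}
```

Separating the pure decision (`should_use_mlx`) from the environment lookup keeps the fallback logic testable on any platform.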

Migration

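File: migrations/008_add_message_embeddings.sql (renamed to 011_add_message_embeddings.sql in a later commit to resolve a numbering conflict with main)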

ALTER TABLE messages ADD COLUMN embedding BLOB;
  • Backwards compatible (NULL allowed)
  • Existing data unaffected

Usage Examples

# Basic embedding search
retrochat search "machine learning concepts" --use-embedding

# With time range
retrochat search "performance optimization" --use-embedding \
  --since "7 days ago" --until "now"

# With result limit
retrochat search "bug fix" --use-embedding --limit 10

# Enable MLX on macOS
export RETROCHAT_USE_MLX=true
retrochat search "semantic query" --use-embedding

Testing

Passing Tests ✅

  • test_dummy_embedding_generation - 768-dim vectors generated correctly
  • test_embedding_deterministic - Same input → same output
  • test_embedding_different_text - Different inputs → different outputs
  • test_platform_support_check - Platform detection works

CLI Tests Updated ✅

  • test_search_command_structure - Includes use_embedding field
  • test_search_command_with_time_range - Time + embedding flags
  • test_search_command_with_embedding - New test for embedding flag

Known Issues ⚠️

  • Search integration tests fail due to missing migration in test DB
  • Need to ensure migrations run in test environment
  • Alternative: Make embedding column truly optional in queries

Technical Notes

Embedding Format

  • Dimensions: 768 (the hidden size of BERT-base-class encoders, and so a common width for embedding models)
  • Type: f32 (single-precision float)
  • Storage: Little-endian bytes in SQLite BLOB
  • Normalization: L2-normalized to unit length

Platform Support

| Platform | MLX Support | Embedding Generation |
|----------|-------------|----------------------|
| macOS | ✅ Available | MLX or Dummy |
| Linux | ❌ N/A | Dummy only |
| Windows | ❌ N/A | Dummy only |

Dummy Embedding Algorithm

  1. Hash input text to seed value
  2. Use Linear Congruential Generator (LCG) for deterministic randomness
  3. Generate 768 values in [-1, 1] range
  4. L2-normalize to unit length
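The four steps above can be sketched roughly as follows. This is not the PR's actual implementation: the choice of `DefaultHasher`, the LCG constants (Knuth's MMIX parameters), and the function name are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

pub const EMBEDDING_DIM: usize = 768;

/// Deterministic placeholder embedding: hash -> LCG -> [-1, 1) -> unit length.
pub fn dummy_embedding(text: &str) -> Vec<f32> {
    // 1. Hash the input text to a 64-bit seed. `DefaultHasher::new()` uses fixed
    //    keys, so results repeat across runs, though the algorithm is not
    //    guaranteed to stay the same across Rust releases.
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    let mut state = hasher.finish();

    // 2-3. Step a 64-bit LCG and map the top 24 bits of each state to [-1, 1).
    let mut values = Vec::with_capacity(EMBEDDING_DIM);
    for _ in 0..EMBEDDING_DIM {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let unit = (state >> 40) as f32 / (1u64 << 24) as f32; // in [0, 1)
        values.push(unit * 2.0 - 1.0);
    }

    // 4. L2-normalize so the vector has unit length.
    let norm = values.iter().map(|v| v * v).sum::<f32>().sqrt();
    if norm > 0.0 {
        for v in &mut values {
            *v /= norm;
        }
    }
    values
}
```

Because the seed depends only on the text, the same input always yields the same vector, which is what `test_embedding_deterministic` checks. The vectors carry no semantic meaning, so similarity scores between dummy embeddings are effectively random.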

Architecture

┌─────────────────────────────────────────────────────┐
│                  CLI Layer                          │
│  ├─ search --use-embedding                          │
│  └─ query search --use-embedding                    │
└────────────────────┬────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────┐
│               Service Layer                         │
│  ├─ QueryService (routes to vector/FTS search)     │
│  └─ EmbeddingService (generates embeddings)        │
└────────────────────┬────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────┐
│             Repository Layer                        │
│  ├─ MessageRepository (CRUD + embedding storage)   │
│  └─ (TODO: Vector search with sqlite-vec)          │
└─────────────────────────────────────────────────────┘

Next Steps (Future PRs)

  1. Vector Search Implementation

    • Use sqlite-vec for similarity search
    • Implement k-NN retrieval
    • Add distance/similarity scoring
  2. MLX Integration

    • Load embedding model
    • Implement actual inference
    • Handle model caching
  3. Embedding Generation Flow

    • Auto-generate embeddings on message import
    • Batch processing for efficiency
    • Background job for existing messages
  4. Search Enhancement

    • Hybrid search (combine FTS + vector)
    • Relevance score tuning
    • Query embedding caching
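Until the sqlite-vec integration lands, the k-NN step could be prototyped with a brute-force scan. The sketch below is a stand-in technique, not the sqlite-vec API: it assumes embeddings are already loaded in memory, uses a hypothetical `i64` message ID, and relies on the vectors being L2-normalized so that the dot product equals cosine similarity.

```rust
/// Dot product of two equal-length vectors; equals cosine similarity when both
/// inputs are unit length.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Brute-force k-nearest-neighbor search: score every candidate against the
/// query and keep the `k` highest similarities, best first.
/// Candidates whose dimension differs from the query are skipped.
pub fn top_k(query: &[f32], candidates: &[(i64, Vec<f32>)], k: usize) -> Vec<(i64, f32)> {
    let mut scored: Vec<(i64, f32)> = candidates
        .iter()
        .filter(|(_, emb)| emb.len() == query.len())
        .map(|(id, emb)| (*id, dot(query, emb)))
        .collect();
    // Sort by descending similarity; `total_cmp` gives a total order even for NaN.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

A linear scan costs O(n · d) per query, which is fine for prototyping but is the part a vector index would be expected to replace.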

Files Changed

New Files

  • migrations/008_add_message_embeddings.sql - Database migration
  • src/services/embedding_service.rs - Embedding generation service

Modified Files

  • Cargo.toml - Add dependencies (sqlite-vec, mlx-rs)
  • Cargo.lock - Dependency lock file
  • src/cli/mod.rs - Add --use-embedding flag
  • src/cli/query.rs - Update search handler
  • src/database/message_repo.rs - Embedding storage/retrieval
  • src/env.rs - Add RETROCHAT_USE_MLX
  • src/models/message.rs - Add embedding field
  • src/services/mod.rs - Export EmbeddingService
  • tests/contract/test_cli_add_command.rs - Update CLI tests

Breaking Changes

None - all changes are additive and backwards compatible.

🤖 Generated with Claude Code

tantara and others added 4 commits October 26, 2025 08:26
…undation

Add foundation for embedding-based semantic search with platform-aware MLX support.

## Key Features

### Embedding Infrastructure
- Add `embedding` column (BLOB, nullable) to messages table for 768-dimensional vectors
- Update Message model with optional `embedding: Option<Vec<f32>>` field
- Implement embedding serialization/deserialization (f32 ↔ bytes)
- Add `with_embedding()` builder method to Message

### Embedding Service
- Create `EmbeddingService` with platform detection
- Support for MLX on macOS (via `RETROCHAT_USE_MLX` env var)
- Deterministic dummy 768-dimensional embeddings for development
- L2-normalized vectors for consistency
- Graceful warnings on unsupported platforms (Windows/Linux)
- Full test coverage (4/4 tests passing)

### CLI Enhancement
- Add `--use-embedding` flag to search commands
- Available in both `retrochat search` and `query search`
- Sets `search_type = "embedding"` for service layer routing

### Dependencies
- Add `sqlite-vec` v0.1.6 for vector similarity search
- Add `mlx-rs` v0.25 (macOS-only, optional) for future ML integration
- Create `mlx` feature flag for conditional compilation

## Implementation Details

**Database Layer** (`src/database/message_repo.rs`):
- `embedding_to_blob()` - Convert f32 vectors to bytes
- `blob_to_embedding()` - Convert bytes back to f32 vectors
- Updated all INSERT/SELECT queries to handle embedding column

**Service Layer** (`src/services/embedding_service.rs`):
- Platform-aware initialization with MLX support detection
- Deterministic hash-based dummy embeddings (768 dims)
- Ready for actual MLX model integration

**Environment** (`src/env.rs`):
- `RETROCHAT_USE_MLX` - Enable MLX embeddings (macOS only)
- Proper documentation for platform requirements

## Migration

- `008_add_message_embeddings.sql` - Adds nullable embedding column
- Backwards compatible (existing data works without embeddings)

## Usage

```bash
# Enable MLX embeddings on macOS
export RETROCHAT_USE_MLX=true

# Use embedding-based search
retrochat search "machine learning" --use-embedding

# With time range
retrochat search "performance" --use-embedding --since "7 days ago"
```

## Testing

- ✅ Embedding service tests passing (4/4)
- ✅ CLI command structure tests updated
- ✅ Platform detection tests
- ⚠️ Search integration tests require migration run

## Next Steps

- Implement vector similarity search using sqlite-vec
- Add actual MLX model inference for embedding generation
- Route search requests to vector search when `use_embedding` is true
- Integrate embedding generation into message import flow

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Resolved conflicts in:
- Cargo.toml: Combined both sets of dependencies (sqlite-vec, regex, lazy_static, mlx-rs)
- Cargo.lock: Regenerated after dependency merge
- src/database/message_repo.rs: Merged to include both message_type/tool_operation_id from main and embedding from feature branch

All SQL queries now include the full set of fields:
- message_type and tool_operation_id (from main)
- embedding (from feature/embedding)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
…grations

After merging main, we had two migration files numbered 008:
- 008_add_tool_operations.sql (from main)
- 008_add_message_embeddings.sql (from feature/embedding)

Renamed our embedding migration to 011_add_message_embeddings.sql
to maintain proper migration sequence.

All tests now pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Clippy fixes:
- Replace the manual `% 4 != 0` length check with a negated `.is_multiple_of(4)` call in blob_to_embedding
- Box Message fields in MessageGroup::ToolPair to reduce enum size (416 bytes → smaller)
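To illustrate why boxing helps, here is a small sketch with stand-in types. `BigMessage`, the variant names, and the 200-byte payload are hypothetical and do not reflect the real `Message` or `MessageGroup` layout. An enum is at least as large as its largest variant, so a variant holding two inline messages roughly doubles the size of every value of the enum, while two `Box` pointers add only 16 bytes on a 64-bit target.

```rust
use std::mem::size_of;

/// Stand-in for a large struct such as `Message` (size chosen arbitrarily).
pub struct BigMessage {
    pub payload: [u8; 200],
}

/// Before: the pair variant stores two messages inline.
pub enum GroupUnboxed {
    Single(BigMessage),
    ToolPair(BigMessage, BigMessage),
}

/// After: the pair variant stores two heap pointers instead.
pub enum GroupBoxed {
    Single(BigMessage),
    ToolPair(Box<BigMessage>, Box<BigMessage>),
}

/// Returns (unboxed size, boxed size) in bytes for comparison.
pub fn enum_sizes() -> (usize, usize) {
    (size_of::<GroupUnboxed>(), size_of::<GroupBoxed>())
}
```

The trade-off is one heap allocation per boxed field whenever a pair is constructed.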

SQL query fixes:
- Add `embedding` column to all SELECT queries in message_repo.rs:
  - search_content_with_filters
  - search_content_with_time_filters
  - get_by_time_range

This ensures all queries return the complete Message structure including
the new embedding field from migration 011.

All CI checks now pass: formatting, clippy, and tests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>