29 changes: 29 additions & 0 deletions README.md
@@ -179,6 +179,35 @@ chat = LeannChat(INDEX_PATH, llm_config={"type": "hf", "model": "Qwen/Qwen3-0.6B
response = chat.ask("How much storage does LEANN save?", top_k=1)
```

## Performance Optimization: Task-Specific Prompt Templates

LEANN now supports prompt templates for task-specific embedding models like Google's EmbeddingGemma. This feature enables **significant performance gains** by using smaller, faster models without sacrificing search quality.

### Real-World Performance

**Benchmark (MacBook M1 Pro, LM Studio):**
- **EmbeddingGemma 300M (QAT)** with templates: **4-5x faster** than Qwen 600M
- **Search quality:** Identical ranking to larger models
- **Use case:** Ideal for real-time workflows (e.g., pre-commit hooks in Claude Code; indexing all of LEANN's code and doc files takes ~7 min on a MacBook M1 Pro)

### Quick Example

```bash
# Build index with task-specific templates
leann build my-index ./docs \
--embedding-mode ollama \
--embedding-model embeddinggemma \
--embedding-prompt-template "title: none | text: " \
--query-prompt-template "task: search result | query: "

# Search automatically applies query template
leann search my-index "How does LEANN optimize vector search?"
```

Templates are automatically persisted and applied during searches (CLI, MCP, API). No manual configuration needed after indexing.
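Conceptually, these templates are plain string prefixes prepended to the text before it is embedded. A minimal sketch of what "applied automatically" means; the function below is illustrative, not part of LEANN's API:

```python
def apply_query_template(query: str, template: str) -> str:
    # EmbeddingGemma-style templates are simple prefixes: the template
    # string is prepended to the raw text before it is embedded.
    return f"{template}{query}"

# -> "task: search result | query: How does LEANN optimize vector search?"
print(apply_query_template(
    "How does LEANN optimize vector search?",
    "task: search result | query: ",
))
```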

See [Configuration Guide](docs/configuration-guide.md#task-specific-prompt-templates) for detailed usage and model recommendations.

## RAG on Everything!

LEANN supports RAG on various data sources including documents (`.pdf`, `.txt`, `.md`), Apple Mail, Google Search History, WeChat, ChatGPT conversations, Claude conversations, iMessage conversations, and **live data from any platform through MCP (Model Context Protocol) servers** - including Slack, Twitter, and more.
17 changes: 16 additions & 1 deletion docs/configuration-guide.md
@@ -185,10 +185,25 @@ leann search my-docs \
--embedding-prompt-template "task: search result | query: "
```

A full example used to build the LEANN repo itself during development:
```bash
source "$LEANN_PATH/.venv/bin/activate" && \
leann build --docs $(git ls-files | grep -Ev '\.(png|jpg|jpeg|gif|yml|yaml|sh|pdf|JPG)$') --embedding-mode openai \
--embedding-model text-embedding-embeddinggemma-300m-qat \
--embedding-prompt-template "title: none | text: " \
--query-prompt-template "task: search result | query: " \
--embedding-api-key local-dev-key \
--embedding-api-base http://localhost:1234/v1 \
--doc-chunk-size 1024 --doc-chunk-overlap 100 \
--code-chunk-size 1024 --code-chunk-overlap 100 \
--ast-chunk-size 1024 --ast-chunk-overlap 100 \
--force --use-ast-chunking --no-compact --no-recompute
```
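After a build like the one above, you can verify what was persisted by inspecting the index's `.meta.json`. A hedged sketch; the file path and key names here are assumptions, not a documented schema:

```python
import json

# Hypothetical path and key names, for illustration only.
with open("my-index.meta.json") as f:
    meta = json.load(f)
print(meta.get("embedding_prompt_template"))
print(meta.get("query_prompt_template"))
```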

**Important Notes:**
- **Only use with compatible models**: EmbeddingGemma and similar task-specific models
- **NOT for regular models**: Adding prompts to models like `nomic-embed-text`, `text-embedding-3-small`, or `bge-base-en-v1.5` will corrupt embeddings
- **Template is saved**: Build-time templates are saved to `.meta.json` for reference; pass both `--embedding-prompt-template` and `--query-prompt-template` at build time so that MCP queries automatically pick up the query template
- **Flexible prompts**: You can use any prompt string, or leave it empty (`""`)

**Python API:**