diff --git a/README.md b/README.md
index 841485b4..66cf9e1c 100755
--- a/README.md
+++ b/README.md
@@ -179,6 +179,35 @@ chat = LeannChat(INDEX_PATH, llm_config={"type": "hf", "model": "Qwen/Qwen3-0.6B
 response = chat.ask("How much storage does LEANN save?", top_k=1)
 ```
 
+## Performance Optimization: Task-Specific Prompt Templates
+
+LEANN now supports prompt templates for task-specific embedding models such as Google's EmbeddingGemma. This feature enables **significant performance gains** by using smaller, faster models without sacrificing search quality.
+
+### Real-World Performance
+
+**Benchmark (MacBook M1 Pro, LM Studio):**
+- **EmbeddingGemma 300M (QAT)** with templates: **4-5x faster** than Qwen 600M
+- **Search quality:** Identical ranking to larger models
+- **Use case:** Ideal for real-time workflows (e.g., pre-commit hooks in Claude Code; ~7 min to index all of LEANN's code and doc files on a MacBook M1 Pro)
+
+### Quick Example
+
+```bash
+# Build index with task-specific templates
+leann build my-index ./docs \
+  --embedding-mode ollama \
+  --embedding-model embeddinggemma \
+  --embedding-prompt-template "title: none | text: " \
+  --query-prompt-template "task: search result | query: "
+
+# Search automatically applies the query template
+leann search my-index "How does LEANN optimize vector search?"
+```
+
+Templates are automatically persisted and applied during searches (CLI, MCP, API). No manual configuration is needed after indexing.
+
+See the [Configuration Guide](docs/configuration-guide.md#task-specific-prompt-templates) for detailed usage and model recommendations.
+
 ## RAG on Everything!
 
 LEANN supports RAG on various data sources including documents (`.pdf`, `.txt`, `.md`), Apple Mail, Google Search History, WeChat, ChatGPT conversations, Claude conversations, iMessage conversations, and **live data from any platform through MCP (Model Context Protocol) servers** - including Slack, Twitter, and more.
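Since the templates are persisted in the index metadata, you can verify after a build what was actually stored. Below is a minimal sketch that reads the `.meta.json` file mentioned in the configuration guide; the file path and the key names (`embedding_prompt_template`, `query_prompt_template`) are assumptions for illustration, not LEANN's documented schema:

```python
import json
from pathlib import Path

def show_saved_templates(meta_path: str) -> None:
    """Print the prompt templates recorded in an index's .meta.json.

    Key names below are assumptions for illustration; inspect the file
    to see what your LEANN version actually stores.
    """
    meta = json.loads(Path(meta_path).read_text())
    print("embedding template:", meta.get("embedding_prompt_template"))
    print("query template:", meta.get("query_prompt_template"))

show_saved_templates("my-index.meta.json")  # hypothetical path
```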
diff --git a/docs/configuration-guide.md b/docs/configuration-guide.md
index 570745f5..70d4d17f 100644
--- a/docs/configuration-guide.md
+++ b/docs/configuration-guide.md
@@ -185,10 +185,25 @@ leann search my-docs \
   --embedding-prompt-template "task: search result | query: "
 ```
 
+A full example, used to build the LEANN repo itself during development:
+```bash
+source "$LEANN_PATH/.venv/bin/activate" && \
+leann build --docs $(git ls-files | grep -Ev '\.(png|jpg|jpeg|gif|yml|yaml|sh|pdf|JPG)$') --embedding-mode openai \
+  --embedding-model text-embedding-embeddinggemma-300m-qat \
+  --embedding-prompt-template "title: none | text: " \
+  --query-prompt-template "task: search result | query: " \
+  --embedding-api-key local-dev-key \
+  --embedding-api-base http://localhost:1234/v1 \
+  --doc-chunk-size 1024 --doc-chunk-overlap 100 \
+  --code-chunk-size 1024 --code-chunk-overlap 100 \
+  --ast-chunk-size 1024 --ast-chunk-overlap 100 \
+  --force --use-ast-chunking --no-compact --no-recompute
+```
+
 **Important Notes:**
 - **Only use with compatible models**: EmbeddingGemma and similar task-specific models
 - **NOT for regular models**: Adding prompts to models like `nomic-embed-text`, `text-embedding-3-small`, or `bge-base-en-v1.5` will corrupt embeddings
-- **Template is saved**: Build-time templates are saved to `.meta.json` for reference
+- **Template is saved**: Build-time templates are saved to `.meta.json`; pass both `--embedding-prompt-template` and `--query-prompt-template` at build time, and MCP queries will automatically pick up the saved query template
 - **Flexible prompts**: You can use any prompt string, or leave it empty (`""`)
 
 **Python API:**
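To make the "corrupt embeddings" warning concrete: a prompt template is simply a literal prefix prepended to the text before it is embedded. Models like EmbeddingGemma were trained to expect such prefixes; regular models were not, so the prefix is embedded as ordinary content and shifts the vector away from the text's meaning. A minimal sketch of that behavior, with a hypothetical helper (LEANN applies the stored template internally, so you would not normally call anything like this yourself):

```python
def apply_prompt_template(template: str, text: str) -> str:
    """Prepend a task-specific template to the text sent to the embedder.

    Hypothetical helper for illustration only; not LEANN's internal API.
    """
    return f"{template}{text}"

query = "How does LEANN optimize vector search?"

# EmbeddingGemma expects this prefix, so it improves retrieval:
print(apply_prompt_template("task: search result | query: ", query))
# "task: search result | query: How does LEANN optimize vector search?"

# A regular model (e.g. bge-base-en-v1.5) would embed the prefix as
# literal content, which is why templates corrupt its embeddings.
```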