29 changes: 29 additions & 0 deletions README.md
@@ -179,6 +179,35 @@ chat = LeannChat(INDEX_PATH, llm_config={"type": "hf", "model": "Qwen/Qwen3-0.6B
response = chat.ask("How much storage does LEANN save?", top_k=1)
```

## Performance Optimization: Task-Specific Prompt Templates

LEANN now supports prompt templates for task-specific embedding models like Google's EmbeddingGemma. This feature enables **significant performance gains** by using smaller, faster models without sacrificing search quality.

### Real-World Performance

**Benchmark (MacBook M1 Pro, LM Studio):**
- **EmbeddingGemma 300M (QAT)** with templates: **4-5x faster** than Qwen 600M
- **Search quality:** Identical ranking to larger models
- **Use case:** Ideal for real-time workflows (e.g., pre-commit hooks in Claude Code; indexing all of LEANN's code and doc files takes ~7 min on a MacBook M1 Pro)

### Quick Example

```bash
# Build index with task-specific templates
leann build my-index ./docs \
--embedding-mode ollama \
--embedding-model embeddinggemma \
--embedding-prompt-template "title: none | text: " \
--query-prompt-template "task: search result | query: "

# Search automatically applies query template
leann search my-index "How does LEANN optimize vector search?"
```

Templates are automatically persisted and applied during searches (CLI, MCP, API). No manual configuration needed after indexing.
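Conceptually, these templates are plain string prefixes prepended to the text before it is embedded. A minimal sketch of what "applied automatically" means; the function below is illustrative, not part of LEANN's API:

```python
def apply_query_template(query: str, template: str) -> str:
    # EmbeddingGemma-style templates are simple prefixes: the template
    # string is prepended to the raw text before it is embedded.
    return f"{template}{query}"

# -> "task: search result | query: How does LEANN optimize vector search?"
print(apply_query_template(
    "How does LEANN optimize vector search?",
    "task: search result | query: ",
))
```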

See [Configuration Guide](docs/configuration-guide.md#task-specific-prompt-templates) for detailed usage and model recommendations.

## RAG on Everything!

LEANN supports RAG on various data sources including documents (`.pdf`, `.txt`, `.md`), Apple Mail, Google Search History, WeChat, ChatGPT conversations, Claude conversations, iMessage conversations, and **live data from any platform through MCP (Model Context Protocol) servers** - including Slack, Twitter, and more.
17 changes: 16 additions & 1 deletion docs/configuration-guide.md
@@ -185,10 +185,25 @@ leann search my-docs \
--embedding-prompt-template "task: search result | query: "
```

A full example used to build the LEANN repo itself during development:
```bash
source "$LEANN_PATH/.venv/bin/activate" && \
leann build --docs $(git ls-files | grep -Ev '\.(png|jpg|jpeg|gif|yml|yaml|sh|pdf|JPG)$') --embedding-mode openai \
--embedding-model text-embedding-embeddinggemma-300m-qat \
--embedding-prompt-template "title: none | text: " \
--query-prompt-template "task: search result | query: " \
--embedding-api-key local-dev-key \
--embedding-api-base http://localhost:1234/v1 \
--doc-chunk-size 1024 --doc-chunk-overlap 100 \
--code-chunk-size 1024 --code-chunk-overlap 100 \
--ast-chunk-size 1024 --ast-chunk-overlap 100 \
--force --use-ast-chunking --no-compact --no-recompute
```
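After a build like the one above, you can verify what was persisted by inspecting the index's `.meta.json`. A hedged sketch; the file path and key names here are assumptions, not a documented schema:

```python
import json

# Hypothetical path and key names, for illustration only.
with open("my-index.meta.json") as f:
    meta = json.load(f)
print(meta.get("embedding_prompt_template"))
print(meta.get("query_prompt_template"))
```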

**Important Notes:**
- **Only use with compatible models**: EmbeddingGemma and similar task-specific models
- **NOT for regular models**: Adding prompts to models like `nomic-embed-text`, `text-embedding-3-small`, or `bge-base-en-v1.5` will corrupt embeddings
- **Template is saved**: Build-time templates are saved to `.meta.json` for reference; pass both `--embedding-prompt-template` and `--query-prompt-template` at build time so that MCP queries automatically pick up the query template
- **Flexible prompts**: You can use any prompt string, or leave it empty (`""`)

**Python API:**