@@ -287,37 +287,41 @@ Here are some models known to work (w/ chat template override when needed):
llama-server --jinja -fa -hf bartowski/Qwen2.5-7B-Instruct-GGUF:Q4_K_M
llama-server --jinja -fa -hf bartowski/Mistral-Nemo-Instruct-2407-GGUF:Q6_K_L
- llama-server --jinja -fa -hf bartowski/functionary-small-v3.2-GGUF:Q4_K_M
llama-server --jinja -fa -hf bartowski/Llama-3.3-70B-Instruct-GGUF:Q4_K_M
- # Native support for DeepSeek R1 works best w/ our own template (official template buggy)
+ # Native support for DeepSeek R1 works best w/ our template override (the official template is buggy, although we do work around it)
llama-server --jinja -fa -hf bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF:Q6_K_L \
- --chat-template-file models/templates/llama-cpp-deepseek-r1.jinja
+ --chat-template-file models/templates/llama-cpp-deepseek-r1.jinja
llama-server --jinja -fa -hf bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF:Q4_K_M \
- --chat-template-file models/templates/llama-cpp-deepseek-r1.jinja
+ --chat-template-file models/templates/llama-cpp-deepseek-r1.jinja
# Native support requires the right template for these GGUFs:
+ llama-server --jinja -fa -hf bartowski/functionary-small-v3.2-GGUF:Q4_K_M \
+ --chat-template-file models/templates/meetkai-functionary-medium-v3.2.jinja
+
llama-server --jinja -fa -hf bartowski/Hermes-2-Pro-Llama-3-8B-GGUF:Q4_K_M \
- --chat-template-file <( python scripts/get_chat_template.py NousResearch/Hermes-2-Pro-Llama-3-8B tool_use )
+ --chat-template-file models/templates/NousResearch-Hermes-2-Pro-Llama-3-8B-tool_use.jinja
llama-server --jinja -fa -hf bartowski/Hermes-3-Llama-3.1-8B-GGUF:Q4_K_M \
- --chat-template-file <( python scripts/get_chat_template.py NousResearch/Hermes-3-Llama-3.1-8B tool_use )
+ --chat-template-file models/templates/NousResearch-Hermes-3-Llama-3.1-8B-tool_use.jinja
llama-server --jinja -fa -hf bartowski/firefunction-v2-GGUF -hff firefunction-v2-IQ1_M.gguf \
- --chat-template-file <( python scripts/get_chat_template.py fireworks-ai/llama-3-firefunction-v2 tool_use )
+ --chat-template-file models/templates/fireworks-ai-llama-3-firefunction-v2.jinja
llama-server --jinja -fa -hf bartowski/c4ai-command-r7b-12-2024-GGUF:Q6_K_L \
- --chat-template-file <( python scripts/get_chat_template.py CohereForAI/c4ai-command-r7b-12-2024 tool_use )
+ --chat-template-file models/templates/CohereForAI-c4ai-command-r7b-12-2024-tool_use.jinja
# Generic format support
llama-server --jinja -fa -hf bartowski/phi-4-GGUF:Q4_0
llama-server --jinja -fa -hf bartowski/gemma-2-2b-it-GGUF:Q8_0
llama-server --jinja -fa -hf bartowski/c4ai-command-r-v01-GGUF:Q2_K
```
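Once one of these servers is up, you can sanity-check that tool calls actually come back through llama-server's OpenAI-compatible `/v1/chat/completions` endpoint. A minimal sketch (assumes the default port 8080; the `get_weather` tool is a made-up illustration, not something from the docs above):

```shell
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "What is the weather in Paris today?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
```

A model with working tool-call support should answer with a `tool_calls` entry naming `get_weather` rather than a plain-text reply.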
+ To get the official template from the original HuggingFace repos, you can use [scripts/get_chat_template.py](../scripts/get_chat_template.py) (see example invocations in [models/templates/README.md](../models/templates/README.md))
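For example, the Hermes template override used above can be regenerated with that script; a minimal sketch (the script invocation matches the inline `<( python scripts/get_chat_template.py ... )` forms replaced above, while the output redirection and target file name just follow the `models/templates/` naming convention, so double-check against the script's actual usage):

```shell
# Fetch the tool_use variant of the Hermes 2 Pro chat template from HuggingFace
# and save it under the file name convention used in models/templates/.
python scripts/get_chat_template.py NousResearch/Hermes-2-Pro-Llama-3-8B tool_use \
    > models/templates/NousResearch-Hermes-2-Pro-Llama-3-8B-tool_use.jinja
```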
+
> [!TIP]
> If there is no official `tool_use` Jinja template, you may want to set `--chat-template chatml` to use a default that works with many models (YMMV!), or write your own (e.g. we provide a custom [llama-cpp-deepseek-r1.jinja](../models/templates/llama-cpp-deepseek-r1.jinja) for DeepSeek R1 distills)
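Concretely, applying the tip's `chatml` fallback to a local GGUF looks like this (a sketch: `my-model.gguf` is a placeholder, and whether ChatML actually suits a given model is the YMMV part):

```shell
# Override whatever template the GGUF embeds with the generic ChatML one;
# with --jinja, tool calls then go through the generic format support.
llama-server --jinja -fa -m my-model.gguf --chat-template chatml
```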