Hi,
This project is great, but I can't understand why it works with Ollama models and not with GGUF, the current widespread quantization standard. Ollama doesn't accept GGUF files directly; it requires reformatting models into its own format, so it's quite niche on its own and supports the smallest number of models (mostly the few popular ones), while virtually every other published model is quantized as GGUF. (It's also easy to produce GGUF from the Hugging Face format with llama.cpp itself, or with AUTOGGUF.)

Ollama's library also offers a very limited variety of quality levels; you won't find many models there in the best, near-lossless Q8 quality.

My model isn't in the Ollama model list, and I really don't want to reformat a 30-billion-parameter GGUF into Ollama's format: in my tests that's a huge waste of time and requires a huge amount of storage space (not counting RAM).
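For reference, a minimal sketch of that conversion, assuming a recent llama.cpp checkout (script and binary names have varied across llama.cpp versions, and the model paths here are placeholders):

```sh
# Convert a Hugging Face checkpoint to GGUF (fp16 intermediate),
# then quantize to Q8_0 with llama.cpp's quantizer.
# Paths are placeholders; adjust to your model.
python convert_hf_to_gguf.py /path/to/hf-model --outfile model-f16.gguf --outtype f16
./llama-quantize model-f16.gguf model-q8_0.gguf Q8_0
```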
Thanks.