Eval bug: Gemma3 <unused32> spam #12433
Comments
Same here: at some point in a dialogue that has an image in context, the model will only spit out <unusedNN> tokens.
Question: is this bug still there if we don't use vision (i.e. text-only)? I'm trying to reduce the scope of this bug; it can be in one of these categories:
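On the text-only question: a quick way to check without the vision path at all would be something like the sketch below (model path, context size, and GPU-layer count are placeholders, not taken from this report). If <unused32> still shows up here, the vision code path can probably be ruled out.

```
# Text-only check: no mmproj and no image, so the vision path is never involved
llama-cli -m gemma-3-4b-it-Q4_K_M.gguf -ngl 99 -c 8192 -p "Tell me a long story"
```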
To add some ideas: I didn't observe this bug in koboldcpp_cu12.exe when chatting with images, which is interesting given that it uses llama.cpp. In the ollama logs I see the following error (but it may be another bug); it just crashes after some chat with images: [GIN] 2025/03/14 - 20:24:02 | 200 | 6.0961414s | 127.0.0.1 | POST "/api/chat"
Update: I tried the recent build https://github.com/ggml-org/llama.cpp/releases/tag/b4924. Update 2: No, I still got the same bug again after some time and 6 images:
If I offload 34 of 35 layers of gemma_4b to the GPU, then after a while and 4 images I get a similar error. Notice that the tag is changed:
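For reference, that partial-offload setup would look roughly like the sketch below (file names, the image, and the prompt are placeholders, not the exact files from that run; the relevant part is -ngl 34, which leaves one layer on the CPU):

```
# 34 of 35 layers on the GPU, vision projector loaded, one image in context
llama-gemma3-cli -m gemma-3-4b-it-Q4_K_M.gguf --mmproj mmproj-model-f16.gguf \
  -ngl 34 --image photo1.jpg -p "Describe this image"
```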
Yes. In long-context situations (50~100k tokens) the <unused32> token appears more and more often, regardless of precision (it also happens at fp16, fully loaded to the GPU, with the context set to 128k tokens). vLLM does not output the <unused32> token for the same prompts using gemma3-27b-it.
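A long-context run along those lines might look like the sketch below (the f16 GGUF name, the prompt file, and the generation length are assumptions; the point is fp16 weights, full GPU offload, and a context window large enough to get past ~50k tokens):

```
# fp16 weights, fully offloaded, 128k context, long prompt file to push well past 50k tokens
llama-cli -m gemma-3-27b-it-f16.gguf -ngl 99 -c 131072 -f long_prompt.txt -n 2048
```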
Name and Version
Operating systems
Windows
GGML backends
CUDA
Hardware
AMD Ryzen 7 5800X 8-Core
NVIDIA GeForce RTX 3090 Ti
NVIDIA GeForce RTX 4060 Ti
Models
gemma-3-4b-it-GGUF/gemma-3-4b-it-Q4_K_M.gguf + gemma-3-4b-it-GGUF/mmproj-model-f16.gguf
https://huggingface.co/ggml-org/gemma-3-4b-it-GGUF
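For completeness, both files can be fetched from that repo, e.g. (assuming huggingface-cli is installed; the file names are the ones listed above):

```
huggingface-cli download ggml-org/gemma-3-4b-it-GGUF gemma-3-4b-it-Q4_K_M.gguf mmproj-model-f16.gguf
```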
Problem description & steps to reproduce
llama-gemma3-cli outputs <unused32> infinitely in certain situations.

Reproduction: prompt "Tell me a long story" and let generation run to completion: the output turns into <unused32> spam.
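A concrete invocation for this reproduction might look like the sketch below (the exact command line is an assumption; the model and projector are the files listed under Models):

```
# Load the model together with the vision projector, prompt "Tell me a long story",
# and let it generate to completion; the tail of the output is repeated <unused32> tokens
llama-gemma3-cli -m gemma-3-4b-it-Q4_K_M.gguf --mmproj mmproj-model-f16.gguf -p "Tell me a long story"
```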
First Bad Commit
No response
Relevant log output