This repository has been archived by the owner on Jun 24, 2024. It is now read-only.

EOS is not read from gguf format #446

Open
Alisa-lisa opened this issue Dec 19, 2023 · 1 comment
@Alisa-lisa

I have discovered that running the same model with the same parameters from llm (gguf branch) and llama.cpp results in different behavior. llm does not appear to read the EOS token, so the model keeps generating output until the maximum token count is reached.
Here is llama.cpp:
[screenshot: llama.cpp output, generation stops at EOS]
And the same model from llm:
[screenshot: llm output, generation runs until the token limit]

According to a Discord discussion, this might indeed be a bug.

@philpax philpax self-assigned this Dec 19, 2023
@philpax (Collaborator) commented Dec 19, 2023

Thanks for reporting this! For my own reference, the issue is that the line below doesn't get the EOS token from the tokenizer - instead, it assumes that it's the hardcoded token </s>. This made sense in the early days of LLaMA, but is no longer true:

self.tokenizer().id("</s>".as_bytes()).unwrap_or(2)
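
For context, GGUF files carry the correct id in their metadata under the tokenizer.ggml.eos_token_id key (per the GGUF spec), so a fix would consult the file's metadata before falling back to the old default. A minimal sketch of that lookup - the `Metadata` map below is a stand-in for illustration, not llm's actual metadata API:

```rust
use std::collections::HashMap;

// Stand-in for a parsed GGUF metadata table; the real `llm` types differ.
type Metadata = HashMap<String, u32>;

/// Resolve the EOS token id: prefer the `tokenizer.ggml.eos_token_id` key
/// defined by the GGUF spec, and only fall back to the legacy hardcoded
/// LLaMA id of 2 when the key is absent.
fn eos_token_id(metadata: &Metadata) -> u32 {
    metadata
        .get("tokenizer.ggml.eos_token_id")
        .copied()
        .unwrap_or(2)
}

fn main() {
    let mut md = Metadata::new();
    md.insert("tokenizer.ggml.eos_token_id".into(), 32000);
    assert_eq!(eos_token_id(&md), 32000); // id read from the file's metadata
    assert_eq!(eos_token_id(&Metadata::new()), 2); // legacy fallback
}
```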
