
DeepSeek-Coder-V2 inference warnings #242

Open
Qlalq opened this issue Aug 20, 2024 · 0 comments
Qlalq commented Aug 20, 2024

Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:10<00:00,  2.52s/it]
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:100001 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
The `seen_tokens` attribute is deprecated and will be removed in v4.41. Use the `cache_position` model input instead.

During inference the terminal prints the warnings above, and the model fails even on its own training set (it was trained with input A → output B, but feeding input A now produces output C). The training loss is shown in the figure below. Have you run into and solved a similar problem before?
[Training loss curve attached as an image in the original issue]
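
For reference, the first two warnings are emitted by `transformers` whenever `generate()` is called without an explicit `attention_mask` and `pad_token_id`; because DeepSeek-Coder-V2's pad token equals its EOS token, the mask cannot be inferred and must be passed explicitly. Below is a minimal sketch of doing that, assuming the `deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct` checkpoint and a plain Hugging Face setup; the model name, prompt, and generation settings are illustrative and not taken from the issue.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint; substitute the model actually used in the issue.
model_name = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

prompt = "write a quick sort algorithm in python"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],  # avoids the "attention mask is not set" warning
    pad_token_id=tokenizer.eos_token_id,      # avoids the "Setting pad_token_id to eos_token_id" warning
    max_new_tokens=256,
)

# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

The `seen_tokens` deprecation notice comes from the installed `transformers` version and is informational. Silencing these warnings will not by itself fix outputs that diverge from the training data; that usually points to the prompt template or fine-tuning setup rather than the generation arguments.
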
