base: main
Fix in LoRA code #2114
Conversation
This PR only changes LoRA code, and the failing test does not use LoRA. I suspect the atol and rtol for this test may be a little too tight? Or do you see a different explanation?
I would be fine with relaxing the tolerance a bit...
The relative difference is large. But when I run this on my Mac, the test passes. I have not tried on GPU. Are we sure the test passes on main?
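To illustrate the tolerance question, here is a hedged sketch of how the atol/rtol choice can flip such a closeness check (the values are made up for illustration, not taken from the failing test):

```python
import torch

# Illustrative values only -- not the actual outputs from the failing test.
expected = torch.tensor([1.0000, 2.0000])
actual = torch.tensor([1.0001, 2.0002])

# torch.allclose passes when |actual - expected| <= atol + rtol * |expected|,
# so the same numerical difference can fail a tight check and pass a looser one.
tight = torch.allclose(actual, expected, atol=1e-5, rtol=1e-5)    # False here
relaxed = torch.allclose(actual, expected, atol=1e-3, rtol=1e-3)  # True here
```

Whether relaxing is appropriate depends on whether the difference is expected numerical noise or a real divergence, which is exactly the open question above.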
I ran the test in question on a GPU instance. It passes both before and after this PR.
It seems to exclude the two GPU tests, and I have no idea why. I don't know how to diagnose this further.
OK, I commented out the [test]. This means this test should either be fixed or commented out. The latter happens on my instance, due to this [marker]. How to proceed here? @t-vi
Maybe the CI system runs the tests on CPU, and that is where it fails. But the CPU tests work for me, both on my Mac laptop and on an EC2 instance. The comments in the test don't sound reassuring. I'd recommend commenting out this test altogether until we are sure the code can be made to do what the HF side is doing as well.
cc: @t-vi ^^ 🐿️
The computation of `lora_ind` only works for models with `n_head == n_query_groups` and `n_embd == n_head * head_size`. This is not the case for the Qwen3-4B model, for example.
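For context, a minimal sketch of why those assumptions break under grouped-query attention (the helper name is hypothetical, not litgpt's actual `lora_ind` code, and the Qwen3-4B numbers are from its public config):

```python
# Hypothetical helper, not the actual lora_ind computation: the sizes of the
# q/k/v slices in a fused QKV projection with grouped-query attention.
def qkv_split_sizes(n_head: int, n_query_groups: int, head_size: int):
    return (n_head * head_size,          # q slice
            n_query_groups * head_size,  # k slice
            n_query_groups * head_size)  # v slice

# MHA-style model where both assumptions hold:
# n_head == n_query_groups and n_embd == n_head * head_size,
# so all three slices have size n_embd.
assert qkv_split_sizes(12, 12, 64) == (768, 768, 768)

# Qwen3-4B-style GQA (public config, approximately): n_head=32,
# n_query_groups=8, head_size=128, n_embd=2560. Here
# n_head * head_size == 4096 != n_embd, and the k/v slices are smaller
# than the q slice, so index math that assumes three equal
# n_embd-sized slices lands on the wrong offsets.
assert qkv_split_sizes(32, 8, 128) == (4096, 1024, 1024)
```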