Bug fix: hpu mrope #167
attafosu commented on Sep 12, 2025
- The HPU mrope implementation had a bug that was exposed by vllm#24444 ([Bugfix] Fix platform-specific routing in CustomOp implementations).
- The initial workaround was to fall back to the default implementation: #162 ([BUGFIX] qwen2.5-vl failed after PR24444, provide a temp solution).
- This PR fixes the bug in the HPU mrope implementation.
Signed-off-by: attafosu <[email protected]>
Do we see a performance improvement?
key_rot = key[..., :self.rotary_dim]
key_pass = key[..., self.rotary_dim:]
key_rot = apply_rotary_pos_emb(key_rot, cos, sin, None, 0, rope_mode)
I compared this with the existing forward_native. The major difference seems to be apply_rotary_pos_emb vs. apply_rotary_emb_torch. Could you check whether we actually get a perf gain with the OOT implementation, or whether we can just use the native one?
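For readers following the thread, the split-rotate-pass pattern in the snippet above can be sketched in plain Python. This is only an illustration, not the HPU kernel or the native helper: `rotate_half` and `apply_partial_rotary` are hypothetical names, and the neox-style pairing (element i rotates with element i + rotary_dim/2) is an assumption, since the value of `rope_mode` is not shown in the diff.

```python
import math

def rotate_half(x):
    """Map [x1, x2] to [-x2, x1], where x1 and x2 are the two halves of x."""
    half = len(x) // 2
    return [-v for v in x[half:]] + x[:half]

def apply_partial_rotary(vec, angles, rotary_dim):
    """Rotate the first rotary_dim entries of vec; pass the rest through.

    Hypothetical helper for illustration; assumes neox-style pairing.
    """
    assert rotary_dim % 2 == 0 and len(angles) == rotary_dim // 2
    rot, passthrough = vec[:rotary_dim], vec[rotary_dim:]
    # Repeat the per-pair angles so cos/sin line up with both halves.
    cos = [math.cos(a) for a in angles] * 2
    sin = [math.sin(a) for a in angles] * 2
    rot = [r * c + h * s
           for r, c, h, s in zip(rot, cos, rotate_half(rot), sin)]
    # Equivalent of concatenating key_rot and key_pass.
    return rot + passthrough

key = [1.0, 0.0, 0.0, 1.0, 5.0, 6.0]
out = apply_partial_rotary(key, angles=[math.pi / 2, 0.0], rotary_dim=4)
```

Here only the first four entries are rotated and the trailing two are returned untouched, which mirrors the roles of `key_rot` and `key_pass`. Whichever implementation is used, the rotation should preserve the norm of the rotated slice, which makes a convenient sanity check when comparing two code paths.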
Yeah, I ran some quick tests and there is some perf gain over the default:
forward_native: 11.53 tok/sec
forward_oot: 12.32 tok/sec
This is with a smaller image; I expect the gain to be more pronounced with a larger input (text or image).
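As a quick check on the size of that gain, the two throughput figures quoted above can be turned into a relative improvement. The function name `relative_gain` is just an illustrative helper, not part of the PR.

```python
def relative_gain(baseline, candidate):
    """Fractional throughput improvement of candidate over baseline."""
    return (candidate - baseline) / baseline

# Figures quoted in this thread, in tok/sec.
gain = relative_gain(11.53, 12.32)
print(f"{gain:.1%}")  # prints "6.9%"
```

So the reported difference is roughly a 7% throughput improvement on this single quick test.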
please fix pre-commit
/run-gaudi-tests
- HPU Mrope implementation had a bug which was exposed by vllm-project/vllm#24444
- Initial workaround was to use the default implementation: vllm-project#162
- This PR fixes the bug in the HPU mrope
---------
Signed-off-by: attafosu <[email protected]>
Co-authored-by: Chendi.Xue <[email protected]>