FA3 #3623

Open
wants to merge 4 commits into
base: main
zhaochaoxing
Contributor

image

@lvhan028 lvhan028 requested review from grimoire and RunningLeon June 10, 2025 03:20
@lvhan028 lvhan028 added the enhancement New feature or request label Jun 10, 2025
@grimoire
Collaborator

lmdeploy chat --backend pytorch deepseek-ai/DeepSeek-V2-Lite-Chat

failed on the second round of chat

@zhaochaoxing
Contributor Author

> lmdeploy chat --backend pytorch deepseek-ai/DeepSeek-V2-Lite-Chat
>
> failed on the second round of chat

The bug has been resolved.

Collaborator

@grimoire grimoire left a comment

LGTM

Collaborator

@RunningLeon RunningLeon left a comment

LGTM

@lvhan028 lvhan028 self-requested a review June 13, 2025 12:16
@lvhan028
Collaborator

I'd like to hold this PR for a while.
We should first run evaluation tests on some popular models in an environment that has FA3 installed.

@@ -202,7 +202,8 @@ def flatten_kv_cache(k_caches: Tensor,
                      k_scales_zeros: Tensor = None,
                      v_scales_zeros: Tensor = None,
                      quant_policy: Literal[0, 4, 8] = 0,
-                     kv_layout: str = 'bshd'):
+                     kv_layout: str = 'bshd',
+                     flatten_kv_layout: str = 'bhsd'):
Collaborator

Since the output is a 3D continuous-batching tensor, I think 'hsd' is better.
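
To illustrate the naming point, here is a minimal sketch. It assumes that under continuous batching all sequences are packed back to back along a single token axis, so the flattened output has no separate batch dimension. The helper `flattened_kv_shape` is hypothetical and not part of lmdeploy; it only computes shapes.

```python
from typing import List, Tuple


def flattened_kv_shape(seq_lens: List[int], num_heads: int, head_dim: int,
                       layout: str = 'hsd') -> Tuple[int, ...]:
    """Hypothetical helper (not an lmdeploy API): shape of a flattened KV tensor.

    Assumption: sequences are concatenated along one token axis, so the
    result is 3D and the layout string has exactly three letters.
    """
    if sorted(layout) != ['d', 'h', 's']:
        raise ValueError(f'expected a permutation of "hsd", got {layout!r}')
    total_tokens = sum(seq_lens)  # all sequences packed together
    dims = {'h': num_heads, 's': total_tokens, 'd': head_dim}
    return tuple(dims[axis] for axis in layout)


# Three sequences of lengths 5, 3 and 8 give 16 packed tokens.
shape_hsd = flattened_kv_shape([5, 3, 8], num_heads=4, head_dim=128, layout='hsd')
shape_shd = flattened_kv_shape([5, 3, 8], num_heads=4, head_dim=128, layout='shd')
print(shape_hsd)  # (4, 16, 128)
print(shape_shd)  # (16, 4, 128)
```

A four-letter name such as 'bhsd' implies a batch axis that a packed 3D tensor does not have, which is why a three-letter name describes it more accurately.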
