[CUDA] Add gather_qqmm #3757

Open
zcbenz wants to merge 2 commits into ml-explore:main from zcbenz:qmm-global-scale

Conversation

zcbenz (Collaborator) commented on Jun 24, 2026

Add global scale support to the qmm_naive kernel and use it as the fallback for qqmm in the CUDA backend. Also add a gather_qqmm op that is implemented on top of the same kernel.
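For readers unfamiliar with the terms, here is a rough pure-Python reference sketch of what a gathered, doubly quantized matmul with a global scale could compute. Everything in it is an assumption made for illustration: the function names (`dequantize`, `naive_qqmm`, `gather_qqmm_ref`), the argument layout, integer codes with one scale per group, and the reading of "global scale" as a single per-tensor multiplier applied on top of the per-group scales. It is not the MLX API and not the CUDA kernel in this PR.

```python
# Illustrative reference only. Names, shapes, and semantics are assumptions,
# not the MLX API and not the kernel added in this PR.

def dequantize(codes, scales, group_size, global_scale=1.0):
    """Dequantize one row: value = code * per-group scale * global scale."""
    return [
        code * scales[i // group_size] * global_scale
        for i, code in enumerate(codes)
    ]


def naive_qqmm(xq, x_scales, wq, w_scales, group_size,
               x_global=1.0, w_global=1.0):
    """Compute (M x K) @ (N x K)^T where both operands are quantized along K."""
    x = [dequantize(r, s, group_size, x_global) for r, s in zip(xq, x_scales)]
    w = [dequantize(r, s, group_size, w_global) for r, s in zip(wq, w_scales)]
    return [[sum(a * b for a, b in zip(xr, wr)) for wr in w] for xr in x]


def gather_qqmm_ref(xq, x_scales, wq, w_scales, lhs_indices, rhs_indices,
                    group_size, x_global=1.0, w_global=1.0):
    """For each (lhs, rhs) index pair, multiply the selected batch entries."""
    return [
        naive_qqmm(xq[l], x_scales[l], wq[r], w_scales[r],
                   group_size, x_global, w_global)
        for l, r in zip(lhs_indices, rhs_indices)
    ]


# One activation matrix (M=1, K=4) and two weight matrices (N=1, K=4).
xq = [[[1, 2, 3, 4]]]
x_scales = [[[0.5, 1.0]]]          # one scale per group of 2 along K
wq = [[[1, 1, 1, 1]], [[2, 0, 0, 1]]]
w_scales = [[[1.0, 1.0]], [[1.0, 2.0]]]

# Pair activation 0 with weight 1, then activation 0 with weight 0.
out = gather_qqmm_ref(xq, x_scales, wq, w_scales,
                      lhs_indices=[0, 0], rhs_indices=[1, 0],
                      group_size=2, x_global=2.0)
print(out)
```

In this sketch the global scale is folded into dequantization for clarity; a real kernel would more plausibly apply it once to the accumulated result, which gives the same answer when the scale is constant per tensor.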

zcbenz force-pushed the qmm-global-scale branch from 7b98046 to f8a6d7d on June 24, 2026 00:57
zcbenz force-pushed the qmm-global-scale branch from f8a6d7d to ca36a2d on June 24, 2026 01:01
