feat(csrc): Optimized inference kernel for vector-quantized gemv. #172

Merged
lcy-seso merged 19 commits into microsoft:main from the gemv branch on Mar 7, 2025

lcy-seso
Contributor

@lcy-seso lcy-seso commented Feb 6, 2025

This pull request is a work in progress aimed at creating an optimized inference kernel for the vector-quantized GEMV operation.

The kernel follows these steps:

  1. Load the codebook into shared memory.
  2. Load tiles for inputs, indices, scales, and bias from shared memory into registers.
  3. Decode the packed indices and dequantize the weights using lookup operations.
  4. Compute the GEMV by performing a reduction sum using warp-level primitives.

Because CUTLASS is now a dependency, this pull request also updates setup.py to check it out automatically as a submodule.

This pull request still needs additional unit tests to verify the correctness of the results. These tests will be added in subsequent pull requests.
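
For orientation, the arithmetic behind steps 3 and 4 can be sketched as a plain C++ CPU reference, the kind of baseline a correctness test might compare a kernel against. This is not the CUDA kernel from this pull request: the function name `vq_gemv_reference`, the packing of four 8-bit indices per 32-bit word, and the per-row scale and bias layout are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical data layout, chosen only for this sketch:
//   codebook : num_centroids x vec_len floats, row-major
//   packed   : one row of 32-bit words per output element; each word holds
//              four 8-bit centroid indices, lowest byte first
//   scale, bias : one float per output element
std::vector<float> vq_gemv_reference(const std::vector<float>& x,
                                     const std::vector<float>& codebook,
                                     const std::vector<std::uint32_t>& packed,
                                     const std::vector<float>& scale,
                                     const std::vector<float>& bias,
                                     std::size_t vec_len) {
    assert(vec_len > 0 && x.size() % vec_len == 0);
    const std::size_t out_features = scale.size();
    const std::size_t groups = x.size() / vec_len;       // indices per row
    const std::size_t words_per_row = (groups + 3) / 4;  // four indices per word
    assert(bias.size() == out_features);
    assert(packed.size() == out_features * words_per_row);

    std::vector<float> y(out_features);
    for (std::size_t row = 0; row < out_features; ++row) {
        float acc = 0.0f;
        for (std::size_t g = 0; g < groups; ++g) {
            // Decode: extract the 8-bit index for group g from its word.
            const std::uint32_t word = packed[row * words_per_row + g / 4];
            const std::size_t idx = (word >> (8 * (g % 4))) & 0xFFu;
            assert((idx + 1) * vec_len <= codebook.size());

            // Dequantize by lookup, then accumulate the partial dot product.
            const float* centroid = &codebook[idx * vec_len];
            for (std::size_t v = 0; v < vec_len; ++v) {
                acc += centroid[v] * x[g * vec_len + v];
            }
        }
        // A GPU kernel would combine per-thread partial sums with a
        // warp-level reduction here; on the CPU a scalar sum is enough.
        y[row] = acc * scale[row] + bias[row];
    }
    return y;
}
```

A small input with a three-entry codebook and `vec_len = 2` is enough to exercise the index decoding, since each output row then depends on two looked-up centroids.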

@lcy-seso lcy-seso marked this pull request as draft February 6, 2025 02:17
@lcy-seso lcy-seso force-pushed the gemv branch 2 times, most recently from 622e429 to 71f7a12 Compare February 14, 2025 11:20
@lcy-seso lcy-seso force-pushed the gemv branch 2 times, most recently from d027e3d to 84978a6 Compare February 22, 2025 11:38
@lcy-seso lcy-seso marked this pull request as ready for review March 7, 2025 09:31
@lcy-seso lcy-seso changed the title 🚧 Optimized inference kernel for vector-quantized gemv. feat(csrc): Optimized inference kernel for vector-quantized gemv. Mar 7, 2025
@lcy-seso lcy-seso requested a review from YangWang92 March 7, 2025 10:18
@lcy-seso lcy-seso merged commit 6679c42 into microsoft:main Mar 7, 2025
7 checks passed
@lcy-seso lcy-seso deleted the gemv branch March 7, 2025 13:48