News: The vLLM project has fully adopted AutoAWQ
It is no secret that maintaining a project like AutoAWQ, with 2+ million downloads, 7,000+ models on Hugging Face, and 2.1k stars, is hard for a solo developer working on it in their free time.
Important Notice:
- AutoAWQ is officially deprecated and will no longer be maintained.
- The last tested configuration used PyTorch 2.6.0 and Transformers 4.51.3 (a minimal version check follows this list).
- If future versions of Transformers break AutoAWQ compatibility, please report the issue to the Transformers project.
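If you need to keep running AutoAWQ, pinning to the last tested configuration above is the safest option. The snippet below is a minimal sanity check for those pins, assuming only that `torch` and `transformers` are importable; it is illustrative and not part of AutoAWQ itself.

```python
# Minimal sanity check against the last tested configuration noted above:
# PyTorch 2.6.0 and Transformers 4.51.3. Illustrative only.
import torch
import transformers

# torch.__version__ may carry a build suffix such as "2.6.0+cu124",
# so only the release prefix is compared.
assert torch.__version__.startswith("2.6.0"), f"untested torch {torch.__version__}"
assert transformers.__version__ == "4.51.3", (
    f"untested transformers {transformers.__version__}"
)
print("Environment matches the last tested AutoAWQ configuration.")
```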
Alternatives:
- AutoAWQ has been adopted by the vLLM Project: https://github.com/vllm-project/llm-compressor (see the inference sketch after this list)
- MLX-LM now supports AWQ for Mac devices: http://github.com/ml-explore/mlx-lm
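Existing AWQ checkpoints remain directly loadable in vLLM, so inference workflows do not depend on AutoAWQ staying maintained. Below is a minimal sketch, assuming vLLM is installed; the model name is an illustrative placeholder for any AWQ-quantized checkpoint on the Hugging Face Hub, not a recommendation.

```python
# Minimal sketch: running an existing AWQ checkpoint with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",  # placeholder AWQ checkpoint
    quantization="awq",
)
params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["What is activation-aware weight quantization?"], params)
print(outputs[0].outputs[0].text)
```

For producing new AWQ checkpoints, llm-compressor (linked above) is the workflow the vLLM Project maintains going forward.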
For further inquiries, feel free to reach out.
What's Changed
- Add Qwen2.5-VL by @seungwoos in #706
- fix setup by @jiqing-feng in #715
- Fixed issue with the slow implementation warning by @Egor-Krivov in #711
- fix(cache): remove k dim in cache by @neurowelt in #718
- fix dtype mismatch by @jiqing-feng in #740
- NEWS by @casper-hansen in #750
- Added Qwen 3 support by @zju-stu-lizheng in #751
- assertion for non-activated experts in MoE by @casper-hansen in #755
- update scripts by @casper-hansen in #756
- Add Qwen2.5-Omni support by @tiger-of-shawn in #759
- the final piece by @casper-hansen in #760
- the final piece (v2) by @casper-hansen in #761
New Contributors
- @seungwoos made their first contribution in #706
- @neurowelt made their first contribution in #718
- @zju-stu-lizheng made their first contribution in #751
- @tiger-of-shawn made their first contribution in #759
Full Changelog: v0.2.8...v0.2.9