News: The vLLM project has fully adopted AutoAWQ
It is no secret that maintaining a project like AutoAWQ, with 2+ million downloads, 7,000+ models on Hugging Face, and 2.1k stars, is hard for a solo developer working on it in their free time.
Important Notice:
- AutoAWQ is officially deprecated and will no longer be maintained.
- The last tested configuration used PyTorch 2.6.0 and Transformers 4.51.3 (a minimal version check follows this list).
- If future versions of Transformers break AutoAWQ compatibility, please report the issue to the Transformers project.
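If you need to keep running AutoAWQ, pinning to the last tested configuration above is the safest option. The snippet below is a minimal sanity check for those pins, assuming only that `torch` and `transformers` are importable; it is illustrative and not part of AutoAWQ itself.

```python
# Minimal sanity check against the last tested configuration noted above:
# PyTorch 2.6.0 and Transformers 4.51.3. Illustrative only.
import torch
import transformers

# torch.__version__ may carry a build suffix such as "2.6.0+cu124",
# so only the release prefix is compared.
assert torch.__version__.startswith("2.6.0"), f"untested torch {torch.__version__}"
assert transformers.__version__ == "4.51.3", (
    f"untested transformers {transformers.__version__}"
)
print("Environment matches the last tested AutoAWQ configuration.")
```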
Alternatives:
- AutoAWQ has been adopted by the vLLM Project: https://github.com/vllm-project/llm-compressor (see the inference sketch after this list)
- MLX-LM now supports AWQ for Mac devices: http://github.com/ml-explore/mlx-lm
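Existing AWQ checkpoints remain directly loadable in vLLM, so inference workflows do not depend on AutoAWQ staying maintained. Below is a minimal sketch, assuming vLLM is installed; the model name is an illustrative placeholder for any AWQ-quantized checkpoint on the Hugging Face Hub, not a recommendation.

```python
# Minimal sketch: running an existing AWQ checkpoint with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",  # placeholder AWQ checkpoint
    quantization="awq",
)
params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["What is activation-aware weight quantization?"], params)
print(outputs[0].outputs[0].text)
```

For producing new AWQ checkpoints, llm-compressor (linked above) is the workflow the vLLM Project maintains going forward.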
For further inquiries, feel free to reach out.
What's Changed
- Add Qwen2.5-VL by @seungwoos in #706
- fix setup by @jiqing-feng in #715
- Fixed issue with the slow implementation warning by @Egor-Krivov in #711
- fix(cache): remove k dim in cache by @neurowelt in #718
- fix dtype mismatch by @jiqing-feng in #740
- NEWS by @casper-hansen in #750
- Added Qwen 3 support by @zju-stu-lizheng in #751
- assertion for non-activated experts in MoE by @casper-hansen in #755
- update scripts by @casper-hansen in #756
- Add Qwen2.5-Omni support by @tiger-of-shawn in #759
- the final piece by @casper-hansen in #760
- the final piece (v2) by @casper-hansen in #761
New Contributors
- @seungwoos made their first contribution in #706
- @neurowelt made their first contribution in #718
- @zju-stu-lizheng made their first contribution in #751
- @tiger-of-shawn made their first contribution in #759
Full Changelog: v0.2.8...v0.2.9