README.md (+6 -5)
@@ -47,8 +47,9 @@ VPTQ can compress 70B, even the 405B model, to 1-2 bits without retraining and m
 ## News
-- **[2025-01-13]** VPTQ is formally supported by Transformers in its wheel package releases since v4.48.0.
-- **[2024-12-20]** 🚀 **VPTQ ❤️ Huggingface Transformers** VPTQ support has been merged into the Huggingface Transformers main branch! Check out the [commit](https://github.com/huggingface/transformers/commit/4e27a4009d3f9d4e44e9be742e8cd742daf074f4#diff-4a073e7151b3f6675fce936a7802eeb6da4ac45d545ad6198be92780f493112bR20) and our Colab example: <a target="_blank" href="https://colab.research.google.com/github/microsoft/VPTQ/blob/main/notebooks/vptq_hf_example.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="VPTQ in Colab"/> </a>
+- **[2025-01-18]** VPTQ v0.0.5 released, featuring CMake support and an enhanced build pipeline!
+- **[2025-01-13]** VPTQ is formally supported by Transformers in its wheel package releases since [v4.48.0](https://github.com/huggingface/transformers/releases/tag/v4.48.0).
+- **[2024-12-20]** 🚀 **VPTQ ❤️ Huggingface Transformers** VPTQ support has been merged into the Huggingface Transformers main branch! Check out the [commit](https://github.com/huggingface/transformers/commit/4e27a4009d3f9d4e44e9be742e8cd742daf074f4#diff-4a073e7151b3f6675fce936a7802eeb6da4ac45d545ad6198be92780f493112) and our Colab example: <a target="_blank" href="https://colab.research.google.com/github/microsoft/VPTQ/blob/main/notebooks/vptq_hf_example.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="VPTQ in Colab"/> </a>
 - [2024-12-15] 🌐 Open source community contributes [**Meta Llama 3.3 70B @ 1-4 bits** models](https://huggingface.co/collections/VPTQ-community/vptq-llama-33-70b-instruct-without-finetune-675ef82388de8c1c1bef75ab)
 - [2024-11-01] 📦 VPTQ is now available on [PyPI](https://pypi.org/project/vptq/)! You can install it easily using the command: `pip install vptq`.
 - [2024-10-28] ✨ VPTQ algorithm early-released at the [algorithm branch](https://github.com/microsoft/VPTQ/tree/algorithm); check out the [tutorial](https://github.com/microsoft/VPTQ/blob/algorithm/algorithm.md).
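The Transformers integration announced above can be exercised with a few lines of code. The snippet below is a minimal sketch, assuming `vptq` and `transformers>=4.48.0` are installed, a CUDA GPU is available, and that the model id shown (an illustrative VPTQ-community repository name, not taken from this diff) exists on the Hugging Face Hub; the linked Colab notebook is the authoritative example.

```python
# Minimal sketch: loading a VPTQ-quantized checkpoint through Transformers.
# Assumptions: `pip install vptq "transformers>=4.48.0"` has been run, a CUDA GPU
# is available, and the model id below is an illustrative VPTQ-community repo name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "VPTQ-community/Meta-Llama-3.1-70B-Instruct-v8-k65536-65536-woft"  # illustrative id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Explain vector post-training quantization briefly.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```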
@@ -70,7 +71,7 @@ VPTQ can compress 70B, even the 405B model, to 1-2 bits without retraining and m
 - CUDA toolkit
 - python 3.10+
-- torch >= 2.2.0
+- torch >= 2.3.0
 - transformers >= 4.44.0
 - Accelerate >= 0.33.0
 - flash_attn >= 2.5.0
@@ -107,7 +108,7 @@ If a release package is not available, you can build the package from the source
 python setup.py build bdist_wheel

 # Install the built wheel
-pip install dist/vptq-xxx.whl # Replace xxx with the version number
+pip install dist/vptq-{version}.whl # Replace {version} with the version number
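After installing the built wheel, a quick import check confirms the package is usable. This is a minimal sketch; the `__version__` attribute is an assumption, not something guaranteed by this diff, so the check falls back gracefully.

```python
# Post-install sanity check (assumes the installed `vptq` package may expose
# __version__; if it does not, the successful import alone verifies the install).
import vptq

print(getattr(vptq, "__version__", "vptq imported; version attribute not exposed"))
```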