Commit b84c5a6

Introduce FP8 row-based quantization (#194)
* Introduce FP8 row-based quantization
* Address lint errors and make tests runnable when CUDA is enabled
* Replace missing hardcoded FP8 type and ensure test is not running if Triton is not available
1 parent 773b856 commit b84c5a6

2 files changed, +790 -0 lines changed

0 commit comments
