Commit 51ea69c

update benchmark
1 parent dd97e83 commit 51ea69c

File tree

1 file changed

+23
-2
lines changed

README.md

+23-2
@@ -1,4 +1,4 @@
-# TorchSharp.BitsAndBytes
+# TorchSharp.BitsAndBytes
 The `TorchSharp.BitsAndBytes` is a C# binding library for [bitsandbytes](https://github.com/bitsandbytes-foundation/bitsandbytes) library from Huggingface. It provides 4Bit and 8Bit quantization for TorchSharp models.
 
 ## Usage
@@ -17,4 +17,25 @@ int blockSize = 64; // can be [64, 128, 256, 512, 1024]
 var dequantizedTensor = BitsAndByteUtils.Dequantize4Bit(quantiedTensor, absMax, input.dtype, quantizedDType, n, input.shape, blockSize);
 ```
 
-For more examples, please refer to the *incoming benchmark* project.
+For more examples, please refer to the [Benchmark](#Benchmark) section.
+
+## Benchmark
+```
+
+BenchmarkDotNet v0.14.0, Windows 11 (10.0.26100.3037)
+Intel Core i9-14900K, 1 CPU, 32 logical and 24 physical cores
+.NET SDK 9.0.102
+[Host] : .NET 8.0.12 (8.0.1224.60305), X64 RyuJIT AVX2
+DefaultJob : .NET 8.0.12 (8.0.1224.60305), X64 RyuJIT AVX2
+
+
+```
+| Method | Mean | Error | StdDev |
+|--------------- |------------:|----------:|----------:|
+| Quantize4Bit | 536.35 μs | 12.164 μs | 35.290 μs |
+| Dequantize4Bit | 2,257.89 μs | 44.542 μs | 51.294 μs |
+| GEMV_4Bit_FP4 | 84.16 μs | 1.673 μs | 3.223 μs |
+| GEMV_4Bit_NF4 | 82.69 μs | 4.329 μs | 12.629 μs |
+| GEMV_FP32 | 49.59 μs | 0.975 μs | 2.035 μs |
+| GEMM_INT8 | 2,994.86 μs | 12.144 μs | 11.360 μs |
+| GEMM_FP32 | 4,495.49 μs | 35.264 μs | 32.986 μs |
