update benchmark

LittleLittleCloud · LittleLittleCloud · commit 51ea69c65b8c · 2025-02-11T09:40:48.000-08:00
diff --git a/README.md b/README.md
@@ -1,4 +1,4 @@
-# TorchSharp.BitsAndBytes
+# TorchSharp.BitsAndBytes
 The `TorchSharp.BitsAndBytes` is a C# binding library for [bitsandbytes](https://github.com/bitsandbytes-foundation/bitsandbytes) library from Huggingface. It provides 4Bit and 8Bit quantization for TorchSharp models.
 
 ## Usage
@@ -17,4 +17,25 @@ int blockSize = 64; // can be [64, 128, 256, 512, 1024]
 var dequantizedTensor = BitsAndByteUtils.Dequantize4Bit(quantiedTensor, absMax, input.dtype, quantizedDType, n, input.shape, blockSize);
 ```
 
-For more examples, please refer to the *incoming benchmark* project.
+For more examples, please refer to the [Benchmark](#Benchmark) section.
+
+## Benchmark
+```
+
+BenchmarkDotNet v0.14.0, Windows 11 (10.0.26100.3037)
+Intel Core i9-14900K, 1 CPU, 32 logical and 24 physical cores
+.NET SDK 9.0.102
+  [Host]     : .NET 8.0.12 (8.0.1224.60305), X64 RyuJIT AVX2
+  DefaultJob : .NET 8.0.12 (8.0.1224.60305), X64 RyuJIT AVX2
+
+
+```
+| Method         | Mean        | Error     | StdDev    |
+|--------------- |------------:|----------:|----------:|
+| Quantize4Bit   |   536.35 μs | 12.164 μs | 35.290 μs |
+| Dequantize4Bit | 2,257.89 μs | 44.542 μs | 51.294 μs |
+| GEMV_4Bit_FP4  |    84.16 μs |  1.673 μs |  3.223 μs |
+| GEMV_4Bit_NF4  |    82.69 μs |  4.329 μs | 12.629 μs |
+| GEMV_FP32      |    49.59 μs |  0.975 μs |  2.035 μs |
+| GEMM_INT8      | 2,994.86 μs | 12.144 μs | 11.360 μs |
+| GEMM_FP32      | 4,495.49 μs | 35.264 μs | 32.986 μs |