Benchmarking different ways of performing matrix multiplication.

georgantas/benchmark-matrix-multiply

Explanation

Standard triple nested for-loop running on the CPU.
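
As a rough illustration of the triple loop described above, here is a minimal sketch. It is not taken from the repository: the `Matrix` alias, the function name `naive_multiply`, and the row-major flat-vector layout are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout: row-major square matrix in a flat vector, element (i, j) at m[i * n + j].
using Matrix = std::vector<float>;

// Computes C = A * B for n x n matrices with the classic i-j-k loop order.
// Each output element costs n multiplies and n adds, hence 2 * n^3 FLOPs overall.
Matrix naive_multiply(const Matrix& a, const Matrix& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < n; ++k) {
                // a is read with unit stride, b with stride n (poor spatial locality).
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}
```

The strided reads of `b` in the inner loop are the main reason this version is slow for large matrices.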

Performance

  • On Dell XPS 13: FLOPs: 2147483648; Execution time: 5.25 seconds; GFLOPS: 0.4090;
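
The FLOP count reported here equals 2 · 1024³ = 2,147,483,648, which is consistent with multiplying two 1024 × 1024 matrices (n multiplies and n adds per output element). The matrix size is an inference from that number, not something the README states. Under that assumption, the GFLOPS figure can be reproduced with a small sketch like the following; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// FLOPs for multiplying two n x n matrices: n multiplies plus n adds
// for each of the n * n output elements.
constexpr std::uint64_t matmul_flops(std::uint64_t n) {
    return 2 * n * n * n;
}

// Throughput in GFLOPS given an operation count and a wall-clock time in seconds.
inline double gflops(std::uint64_t flops, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(flops) / seconds / 1e9;
}
```

For example, `gflops(matmul_flops(1024), 5.25)` evaluates to roughly 0.409, in line with the figure above.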

Explanation

Uses the CPU with blocking (tiling) to improve temporal and spatial locality: the matrices are processed one small block at a time, so each block stays in the L1 cache while it is reused. More details here.
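
The blocking idea can be sketched as below. This is an illustrative version, not the repository's implementation: the function name, the flat row-major layout, and the default block size of 32 are assumptions (32 × 32 floats is 4 KiB per tile, so three tiles fit in a typical 32 KiB L1 data cache).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout: row-major square matrix in a flat vector.
using Matrix = std::vector<float>;

// Computes C = A * B for n x n matrices, one block x block tile at a time.
// `block` is a hypothetical tuning parameter, not a value from the repository.
Matrix blocked_multiply(const Matrix& a, const Matrix& b, std::size_t n,
                        std::size_t block = 32) {
    assert(block > 0 && a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0f);
    for (std::size_t ii = 0; ii < n; ii += block) {
        for (std::size_t kk = 0; kk < n; kk += block) {
            for (std::size_t jj = 0; jj < n; jj += block) {
                // Clamp tile bounds so n need not be a multiple of block.
                const std::size_t i_end = std::min(ii + block, n);
                const std::size_t k_end = std::min(kk + block, n);
                const std::size_t j_end = std::min(jj + block, n);
                // Multiply the (ii, kk) tile of A by the (kk, jj) tile of B
                // and accumulate into the (ii, jj) tile of C.
                for (std::size_t i = ii; i < i_end; ++i) {
                    for (std::size_t k = kk; k < k_end; ++k) {
                        const float a_ik = a[i * n + k];
                        for (std::size_t j = jj; j < j_end; ++j) {
                            c[i * n + j] += a_ik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
    return c;
}
```

The innermost loop walks both `b` and `c` with unit stride, and each tile is reused many times before it is evicted, which is where the locality gain comes from.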

Performance

  • On Dell XPS 13: FLOPs: 2147483648; Execution time: 3.35 seconds; GFLOPS: 0.6402;

Explanation

Uses a basic OpenCL kernel, intended for the GPU. The same kernel also runs on CPU OpenCL devices, as the Intel result below shows.

Performance

  • Platform: NVIDIA CUDA / Device: NVIDIA TITAN Xp: FLOPs: 2147483648; Execution time: 0.04 seconds; GFLOPS: 51.8588;
  • Platform: Intel(R) OpenCL / Device: Intel(R) Core(TM) i7-10710U CPU @ 1.10GHz: FLOPs: 2147483648; Execution time: 0.10 seconds; GFLOPS: 20.7495;

Explanation

Uses cuBLAS, NVIDIA's GPU-accelerated BLAS library.

Performance

  • Platform: Tesla K80: FLOPs: 2147483648; Execution time: 0.01 seconds; GFLOPS: 304.7607;
  • Platform: RTX 3090: FLOPs: 2147483648; Execution time: 0.00 seconds (rounds to zero at two decimal places); GFLOPS: 773.4415;

Explanation

Load blocks into GPU shared memory to reduce global memory accesses. Explained in detail here.
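
To make the saving concrete, here is a CPU-side emulation of the tiling scheme in C++. It is a sketch under stated assumptions, not the repository's kernel: the outer loops stand in for the grid of thread blocks, the local arrays stand in for shared memory, and `kTile`, `TiledResult`, and `tiled_multiply` are hypothetical names. It counts reads from the input matrices to show that tiling cuts global loads from 2n³ to 2n³ / kTile.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<float>;  // assumed row-major, flat storage

// Hypothetical tile width; a real GPU kernel would more typically use 16 or 32.
constexpr std::size_t kTile = 2;

struct TiledResult {
    Matrix c;
    std::size_t global_loads;  // reads from a and b, i.e. emulated global memory
};

// Emulates a tiled GPU kernel on the CPU. Assumes n is a multiple of kTile.
TiledResult tiled_multiply(const Matrix& a, const Matrix& b, std::size_t n) {
    assert(n % kTile == 0 && a.size() == n * n && b.size() == n * n);
    TiledResult r{Matrix(n * n, 0.0f), 0};
    for (std::size_t by = 0; by < n; by += kTile) {          // block row
        for (std::size_t bx = 0; bx < n; bx += kTile) {      // block column
            for (std::size_t t = 0; t < n; t += kTile) {     // tile along k
                float tile_a[kTile][kTile];  // stands in for shared memory
                float tile_b[kTile][kTile];
                // Load phase: each "thread" (ty, tx) fetches one element per tile.
                for (std::size_t ty = 0; ty < kTile; ++ty) {
                    for (std::size_t tx = 0; tx < kTile; ++tx) {
                        tile_a[ty][tx] = a[(by + ty) * n + (t + tx)];
                        tile_b[ty][tx] = b[(t + ty) * n + (bx + tx)];
                        r.global_loads += 2;
                    }
                }
                // On a GPU, a barrier would separate the load and compute phases.
                // Compute phase: partial dot products read only the staged tiles.
                for (std::size_t ty = 0; ty < kTile; ++ty) {
                    for (std::size_t tx = 0; tx < kTile; ++tx) {
                        for (std::size_t k = 0; k < kTile; ++k) {
                            r.c[(by + ty) * n + (bx + tx)] += tile_a[ty][k] * tile_b[k][tx];
                        }
                    }
                }
            }
        }
    }
    return r;
}
```

For n = 4 and a tile width of 2, this performs 64 emulated global loads, compared with 2 · 4³ = 128 for a kernel in which every thread reads its full row and column directly.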

Performance

  • Platform: RTX 3090: FLOPs: 2147483648; Execution time: 0.00 seconds (rounds to zero at two decimal places); GFLOPS: 951.0809;

TODO

TransposedBlockMatrixMultiplier

Explanation

Similar to BlockMatrixMultiplier, but stores matrix B transposed in memory, so that the columns of B become contiguous rows, and uses SIMD instructions to compute the block dot products.
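
Since this multiplier is still a TODO, the following is only a sketch of the transposition idea, with hypothetical function names. It omits the blocking for brevity and uses no explicit SIMD intrinsics; instead it leaves the inner dot product as a unit-stride loop over two contiguous rows, which is the access pattern that SIMD instructions need.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<float>;  // assumed row-major, flat storage

// Returns the transpose of a row-major n x n matrix.
Matrix transpose(const Matrix& m, std::size_t n) {
    assert(m.size() == n * n);
    Matrix t(n * n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            t[j * n + i] = m[i * n + j];
        }
    }
    return t;
}

// Computes C = A * B by first transposing B, so that each output element is
// the dot product of two contiguous rows (row i of A and row j of B^T).
Matrix transposed_multiply(const Matrix& a, const Matrix& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    const Matrix bt = transpose(b, n);
    Matrix c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        const float* row_a = a.data() + i * n;
        for (std::size_t j = 0; j < n; ++j) {
            const float* row_bt = bt.data() + j * n;
            float sum = 0.0f;
            // Unit-stride reads on both operands: a candidate for SIMD,
            // either via explicit intrinsics or compiler vectorization.
            for (std::size_t k = 0; k < n; ++k) {
                sum += row_a[k] * row_bt[k];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}
```

The one-off transpose costs n² element copies, which is small next to the 2n³ operations of the multiplication itself.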

Performance

  • TODO

NumpyMatrixMultiplier

Explanation

Multiply the matrices in Python using NumPy, for comparison.

Performance

  • TODO

GPGPUMatrixMultiplier

Explanation

Multiply using a GPU shader.

Performance

  • TODO
