This is simply documentation of a little learning journey as I explore:
- Ways to represent matrices
- Algorithms for operations on matrices
- Performance optimizations and their trade-offs
- Options for accelerating code on a GPU
- Impact of programming language on all of the above
A rough sequence of tasks:
- Instantiate and print a small matrix
- Write a function for printing matrices of arbitrary size
- Implement matmul
- Examine assembly code for the matmul implementation using godbolt or compiler output
- Assess runtime latency for matmul as a function of representation and input size
- Check the size of the L1 cache on my machine and predict what matrix size will overflow it
- Confirm empirically which matrix multiplication dimensions overflow the L1 cache
- Optimize matmul for memory generally
- Implement capped-memory matmul algorithm
- Port some of the above to C++, Zig, or Rust
- Run matmul on the GPU with as little library support as possible
- Run matmul on GPU using existing CUDA libraries
On my machine:
$ lscpu | rg L1
L1d cache: 192 KiB (6 instances)
L1i cache: 192 KiB (6 instances)

Further: each element is an int, sizeof(int) is 4 on my machine, and my initial, naive implementation allocates three matrices, so:
>>> 192 * 1024 / 4
49152.0
>>> 192 * 1024 / 4 / 3
16384.0
>>> import math
>>> math.sqrt(16384)
128.0

I expect to observe a non-linearity in runtime around 16,384 elements per matrix, or dimensions of 128x128.