(WIP) sytrd use gemv instead of symv #893

EdDAzevedo · 2025-02-18T20:15:57Z

Code modifications to sytrd() and latrd() to use gemv() (general matrix vector multiply) instead of symv() (symmetric matrix vector multiply).

For some implementations and problem sizes, gemv() can outperform symv(), even though symv() should perform only about half the work and touch about half the data. The symv() implementation might also be using atomic update operations.
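To make the trade-off concrete, here is a minimal pure-Python sketch of the two products. It is not rocBLAS or rocSOLVER code, and the helper names `symv_lower` and `gemv` are hypothetical. `symv_lower` reads only the lower triangle, while `gemv` reads every entry, so the two agree only when the full matrix is actually stored symmetric.

```python
def symv_lower(a, x):
    """y = A x for symmetric A, reading only entries a[i][j] with j <= i."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        for j in range(i + 1):
            y[i] += a[i][j] * x[j]
            if j != i:
                # The mirrored entry A[j][i] equals a[i][j] by symmetry.
                y[j] += a[i][j] * x[i]
    return y


def gemv(a, x):
    """y = A x for a general square matrix, reading all n*n entries."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


# Lower triangle holds the data; the strictly upper part holds unrelated values.
a = [[4.0, 99.0, 99.0],
     [1.0, 5.0, 99.0],
     [2.0, 3.0, 6.0]]
x = [1.0, 2.0, 3.0]

y_symv = symv_lower(a, x)   # ignores the 99.0 entries
y_gemv_raw = gemv(a, x)     # reads the 99.0 entries, so the result differs
```

The scatter update to `y[j]` in `symv_lower` is the kind of write that a parallel implementation may have to protect, which would be consistent with the remark about atomic updates, though this sketch says nothing about how any real symv() is implemented.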

The changes include:

allocating more work storage in sytrd() to hold the conceptually untouched strictly upper or strictly lower triangular part.
invoking kernels in sytrd() to save (on entry) and restore (on exit) that triangular part.
invoking kernels in latrd() (on entry) to copy the strictly lower or strictly upper triangular part of the matrix onto the opposite triangle to enforce symmetry, so that gemv() can replace the calls to symv().
increasing xxTRD_BLOCKSIZE from 32 to 64 to reduce the cost of the matrix copy that enforces symmetry in latrd().
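The save, symmetrize, multiply, restore sequence described above can be sketched in plain Python as follows. This is only an illustration under simplifying assumptions: it works on a small host-side list of lists, assumes the lower triangle holds the data, and uses hypothetical helper names that are not the kernels added in this PR.

```python
def save_strict_upper(a):
    """Copy the strictly upper triangle into separate work storage."""
    n = len(a)
    return [[a[i][j] for j in range(i + 1, n)] for i in range(n)]


def symmetrize_from_lower(a):
    """Overwrite the strictly upper triangle with the mirrored lower triangle."""
    n = len(a)
    for i in range(n):
        for j in range(i):
            a[j][i] = a[i][j]


def restore_strict_upper(a, saved):
    """Put the caller's original strictly upper triangle back."""
    for i in range(len(a)):
        a[i][i + 1:] = saved[i]


def gemv(a, x):
    """y = A x for a general square matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


# Lower triangle holds the data; the 99.0 entries must survive untouched.
a = [[4.0, 99.0, 99.0],
     [1.0, 5.0, 99.0],
     [2.0, 3.0, 6.0]]
x = [1.0, 2.0, 3.0]

saved = save_strict_upper(a)      # on entry
symmetrize_from_lower(a)          # enforce symmetry
y = gemv(a, x)                    # stands in for the former symv call
restore_strict_upper(a, saved)    # on exit
```

After the restore step, `a` again contains the caller's original upper triangle, which is why the extra work storage is needed.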

On gfx1030, using rocsolver-bench -f sytrd --precision s --iters 5

| n | using gemv (us) | using symv (us) |
|---|---|---|
| 1024 | 62,201 | 73,580 |
| 2048 | 137,507 | 190,302 |
| 4096 | 335,996 | 522,818 |
| 8192 | 2,161,237 | 1,925,926 |

On MI300 (splinter-126-wr-d3, gfx942)

| n | using gemv (us) | using symv (us) |
|---|---|---|
| 1024 | 40,310 | 53,154 |
| 2048 | 94,027 | 157,111 |
| 4096 | 237,926 | 484,525 |
| 8192 | 689,551 | 1,683,223 |
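For a quick comparison, the ratios symv time / gemv time can be computed from the two tables above with a few lines of Python. The timings are copied from the tables; the dictionary layout and variable names are just for illustration. A ratio above 1 means the gemv variant was faster. In these measurements that holds for every row except n = 8192 on gfx1030.

```python
# Timings in microseconds, copied from the two tables: n -> (gemv, symv).
timings = {
    "gfx1030": {1024: (62201, 73580), 2048: (137507, 190302),
                4096: (335996, 522818), 8192: (2161237, 1925926)},
    "gfx942": {1024: (40310, 53154), 2048: (94027, 157111),
               4096: (237926, 484525), 8192: (689551, 1683223)},
}

# Ratio > 1 means the gemv-based variant took less time than the symv one.
speedup = {
    gpu: {n: symv_us / gemv_us for n, (gemv_us, symv_us) in rows.items()}
    for gpu, rows in timings.items()
}

for gpu, rows in speedup.items():
    for n, ratio in rows.items():
        print(f"{gpu} n={n}: symv/gemv = {ratio:.2f}")
```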

EdDAzevedo added 7 commits February 17, 2025 11:56

add copy_triang

7c0ef69

working snapshot

b8e6662

minor update to symmetrize_body

e8d896a

increase xxTRD_BLOCKSIZE

48212e6

update symmetrize_body

d452a77

option for compact storage

15aa6e6

remove unused code

7f2a6d5

EdDAzevedo requested review from jzuniga-amd, tfalders, cgmb, qjojo, jmachado-amd and AGonzales-amd as code owners February 18, 2025 20:15

EdDAzevedo added 2 commits February 20, 2025 23:10

Merge branch 'rocm_develop' into latrd_use_gemv

94518fe

Merge branch 'rocm_develop' into latrd_use_gemv

9693088

qjojo approved these changes Feb 26, 2025


tfalders added the noOptimizations label ("Disable optimized kernels for small sizes for some routines") Feb 26, 2025
