Improvements to Quick Tuning #3700

causten · 2024-12-10T18:40:13Z

When benchmarking kernels during quick tuning (and exhaustive tuning as well), the algorithm takes the average of 10 runs for each kernel config tried and compares it against the other configs. The winning kernel config is the one with the best (lowest) average time.

The complete 10 runs are not recorded. The goal here is to capture all the timed runs and print the min, max, and median.
Lastly, add a capability to change the picking algorithm from average to min or median.
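
A minimal sketch of the proposal, written in Python for brevity rather than in MIGraphX's own C++. The names `summarize` and `pick_winner`, the config labels, and the timings are all hypothetical and only illustrate how the selection statistic could be made configurable once per-run times are available.

```python
import statistics

def summarize(times_ms):
    """Return mean, min, max and median for one config's per-run times (ms)."""
    return {
        "mean": statistics.mean(times_ms),
        "min": min(times_ms),
        "max": max(times_ms),
        "median": statistics.median(times_ms),
    }

def pick_winner(runs_by_config, metric="mean"):
    """Return the config whose chosen statistic is lowest."""
    if metric not in ("mean", "min", "median"):
        raise ValueError(f"unsupported metric: {metric}")
    return min(runs_by_config,
               key=lambda cfg: summarize(runs_by_config[cfg])[metric])

# Made-up timings: config_a is usually faster but has one slow outlier run.
runs = {
    "config_a": [2.0, 2.1, 2.0, 2.2, 2.1, 2.0, 2.1, 2.0, 2.1, 9.0],
    "config_b": [2.5, 2.5, 2.6, 2.5, 2.4, 2.5, 2.6, 2.5, 2.5, 2.5],
}

for metric in ("mean", "min", "median"):
    print(metric, "->", pick_winner(runs, metric))
```

With these made-up numbers the single outlier is enough to make the average prefer `config_b`, while min and median both prefer `config_a`.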

As an example, this is what we see today...

MIGRAPHX_TRACE_BENCHMARKING=2 MIGRAPHX_TRACE_MLIR=2

Problem: gfx1150 12 -t f16 -out_datatype f16 -transA false -transB true -g 1 -m 1 -n 4096 -k 4096
Benchmarking solution: v2:16,256,4,16,64,4,1,1,1 => ((16*256) / (16*64)) * 32 = 128
2.6971ms

What we would like to see...
Problem: gfx1150 12 -t f16 -out_datatype f16 -transA false -transB true -g 1 -m 1 -n 4096 -k 4096
Benchmarking solution: v2:16,256,4,16,64,4,1,1,1 => ((16*256) / (16*64)) * 32 = 128
2.6971ms, 2.0 min, 23.9 max, 2.50 med

pfultz2 · 2024-12-10T20:25:46Z

> The complete 10 runs are not recorded. The goal here is to capture all the timed runs and print the min, max, and median.
> Lastly, add a capability to change the picking algorithm from average to min or median.

This is not possible. We don't time each run on purpose: we want to minimize launch overhead while benchmarking so the measurement is closer to the actual device time.
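
A rough illustration of the trade-off, not MIGraphX's benchmarking code: the Python sketch below uses a host-side stand-in function instead of a GPU kernel, and `time_batched` and `time_each` are hypothetical names. One timer around all launches yields only an average, while one timer per launch yields a distribution but adds a measurement (and, on a real device, a synchronization) to every sample.

```python
import time

def time_batched(kernel, n=10):
    """Time n back-to-back launches with a single timer; only the average survives."""
    start = time.perf_counter()
    for _ in range(n):
        kernel()
    return (time.perf_counter() - start) / n

def time_each(kernel, n=10):
    """Time every launch separately; each sample also carries the timer overhead."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        kernel()
        samples.append(time.perf_counter() - start)
    return samples

def fake_kernel():
    # Stand-in workload, assumed purely for illustration.
    return sum(range(1000))

average = time_batched(fake_kernel)
samples = time_each(fake_kernel)
print(f"batched average: {average * 1e3:.4f} ms")
print(f"per-run min/max: {min(samples) * 1e3:.4f} / {max(samples) * 1e3:.4f} ms")
```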

causten · 2024-12-11T01:03:03Z

Time to start. We are still seeing too many inconsistencies in the driver results, with no answer as to why. This will help.

causten assigned richagadgil Dec 10, 2024
