Set warmup and tuning iterations through env variable for hipBLASLt #3792

Draft: wants to merge 2 commits into base: develop
48 changes: 29 additions & 19 deletions src/targets/gpu/hip_gemm_impl.cpp
@@ -551,7 +551,8 @@ struct hip_gemm_impl
     int tune(context& ctx, const std::vector<shape>& input_shapes) // const
     {
         // tuning meta parameters
-        const int hot_calls = 40;
+        const int hot_calls = 1000;
+        const int cold_calls = 1000;
Contributor Author
@causten In case you'd like to change the iteration counts for the models you are trying this out with:
cold_calls sets the number of warmup iterations.
hot_calls sets the number of tuning iterations.

Collaborator
Your testing will take a while if you hardcode the values, because you'll need to rebuild the code each time you change them. If you read them from env variables instead, you can try the various scenarios without recompiling.

Contributor Author
Added the env variables.
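
For readers following the thread, here is a minimal sketch of how iteration counts could be read from the environment with a compiled-in fallback. It uses only `std::getenv` from the standard library; the helper `env_iterations` and the variable names `EXAMPLE_GEMM_HOT_CALLS` and `EXAMPLE_GEMM_COLD_CALLS` are hypothetical placeholders, since the names the PR actually introduces are not shown in this excerpt.

```cpp
#include <cstddef>
#include <cstdlib>
#include <exception>
#include <string>

// Hypothetical helper (not from the PR): read a positive iteration count from
// an environment variable, falling back to a default when the variable is
// unset, empty, or not a valid positive integer.
inline int env_iterations(const char* name, int fallback)
{
    const char* raw = std::getenv(name);
    if(raw == nullptr || *raw == '\0')
        return fallback;
    try
    {
        const std::string text(raw);
        std::size_t pos = 0;
        const int value = std::stoi(text, &pos);
        // Reject trailing garbage such as "10x" and non-positive counts.
        if(pos != text.size() || value <= 0)
            return fallback;
        return value;
    }
    catch(const std::exception&)
    {
        // std::stoi throws on non-numeric or out-of-range input.
        return fallback;
    }
}

// Possible use inside a tuning routine (placeholder variable names):
//   const int hot_calls  = env_iterations("EXAMPLE_GEMM_HOT_CALLS", 40);
//   const int cold_calls = env_iterations("EXAMPLE_GEMM_COLD_CALLS", 1);
```

With something along these lines, each tuning scenario becomes a matter of setting a variable before launching the process, which is the saving the Collaborator describes.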


         std::vector<argument> input_args;
         std::transform(input_shapes.begin(),
@@ -603,32 +604,41 @@ struct hip_gemm_impl
         {
             auto algo = solution.get_result(ctx, *this, 0)[0].algo;
             solution_indices.push_back(hipblaslt_ext::getIndexFromAlgo(algo));
+            best_sol = hipblaslt_ext::getIndexFromAlgo(algo);
+            first_time = 1;
+            best_time = 1;
         }
-        for(auto sol : solution_indices)
+        else
         {
-            // Warmup: the first call to an op. may not be representative since there is
-            // more time taken initializing caches, etc. so we won't time it.
-            run(ctx, input_args, sol);
-            double host_time = time<milliseconds>([&] {
-                for([[maybe_unused]] int hc : range(hot_calls))
+            for(auto sol : solution_indices)
+            {
+                // Warmup: the first call to an op. may not be representative since there is
+                // more time taken initializing caches, etc. so we won't time it.
+                for([[maybe_unused]] int cc : range(cold_calls)) {
                     run(ctx, input_args, sol);
-                ctx.finish();
-            });
+                }

+                double host_time = time<milliseconds>([&] {
+                    for([[maybe_unused]] int hc : range(hot_calls)) {
+                        run(ctx, input_args, sol);
+                    }
+                    ctx.finish();
+                });
+
-            host_time /= hot_calls;
+                host_time /= hot_calls;

-            // dev/evaluation only: track time for first solution.
-            if(first_time < 0)
-                first_time = host_time;
+                // dev/evaluation only: track time for first solution.
+                if(first_time < 0)
+                    first_time = host_time;

-            // track current best
-            if(host_time < best_time)
-            {
-                best_sol = sol;
-                best_time = host_time;
+                // track current best
+                if(host_time < best_time)
+                {
+                    best_sol = sol;
+                    best_time = host_time;
+                }
             }
         }

         std::cout << "Winning GEMM solution: " << best_sol << " in " << best_time << " ms, beats "
                   << first_time << "ms" << std::endl;
         return best_sol;
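
The warmup-then-time pattern in this hunk can be illustrated outside MIGraphX with a small self-contained sketch. Everything here is a stand-in rather than the PR's code: `pick_fastest` and `tune_result` are hypothetical names, each entry of `candidates` plays the role of `run(ctx, input_args, sol)`, and `sync` plays the role of `ctx.finish()`. One deliberate difference, which is an assumption of this sketch and not something the diff does: it also synchronizes after the warmup loop, so that if launches are asynchronous, leftover warmup work is not counted in the timed region.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Index of the fastest candidate and its mean time per call in milliseconds.
struct tune_result
{
    std::size_t best_index = 0;
    double best_ms         = std::numeric_limits<double>::max();
};

// candidates: callables that each launch one solution once.
// sync:       blocks until all previously launched work has completed.
inline tune_result pick_fastest(const std::vector<std::function<void()>>& candidates,
                                const std::function<void()>& sync,
                                int cold_calls,
                                int hot_calls)
{
    using clock = std::chrono::steady_clock;
    tune_result result;
    if(hot_calls <= 0)
        return result; // nothing to time; avoids dividing by zero below

    for(std::size_t i = 0; i < candidates.size(); ++i)
    {
        // Warmup: untimed calls so one-off initialization costs are excluded.
        for(int c = 0; c < cold_calls; ++c)
            candidates[i]();
        // Sketch-only choice: drain warmup work before the clock starts.
        sync();

        const auto start = clock::now();
        for(int h = 0; h < hot_calls; ++h)
            candidates[i]();
        sync(); // wait for the timed launches to finish before stopping the clock
        const auto stop = clock::now();

        const double mean_ms =
            std::chrono::duration<double, std::milli>(stop - start).count() / hot_calls;

        // Track the current best.
        if(mean_ms < result.best_ms)
        {
            result.best_index = i;
            result.best_ms    = mean_ms;
        }
    }
    return result;
}
```

Each candidate is invoked `cold_calls + hot_calls` times, so raising both values to 1000 as in the first hunk lengthens tuning roughly in proportion to the number of candidate solutions.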