Add par_reduce_inner functions #1147
Approved once comment is at least discussed.
src/kokkos_abstraction.hpp
Outdated
par_reduce_inner(team_mbr_t team_member, const int kl, const int ku, const int jl,
const int ju, const int il, const int iu, const Function &function,
T reduction) {
Should these be
par_reduce_inner(InnerLoopPatternTTR, team_mbr_t team_member, const int kl, const int ku, const int jl,
const int ju, const int il, const int iu, const Function &function,
T reduction) {
I'm not sure. I didn't really think about it when dropping these in, and I don't think I fully understand the way loop pattern resolution is done anyway...
The way I understand it, since I use TVR_INNER_LOOP for ThreadVectorRange, these wouldn't resolve under that, right? So, we'd need a TVR and SIMD pattern for these too, if we wanted to make them selectable. Since these should be used pretty rarely, I don't know if the performance benefit of allowing customization is worth three implementations we're just going to stamp over with all the better general stuff you guys are writing for the next release.
I could see it both ways. With my suggestion, you would have to call parthenon::par_reduce_inner(inner_loop_pattern_ttr_tag, ...) instead of parthenon::par_reduce_inner(...), which would be more explicit that you aren't necessarily using the default inner loop pattern and would conform to how the rest of the par_*_inner functions look. But this doesn't change anything about how this behaves.
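To make the two call styles concrete, here is a minimal serial sketch of tag-based overload selection. Everything in it is a hypothetical stand-in and not Parthenon's actual code: the `tag_sketch` namespace, the empty tag structs, the simplified 1D signature (no team member, no Kokkos reducer), and the `sum_range` helper are assumptions made purely for illustration.

```cpp
namespace tag_sketch {

// Hypothetical stand-ins for the loop pattern tags; NOT Parthenon's definitions.
struct InnerLoopPatternTTR {};  // would map to a TeamThreadRange loop
struct InnerLoopPatternTVR {};  // would map to a ThreadVectorRange loop

constexpr InnerLoopPatternTTR inner_loop_pattern_ttr_tag{};

// Only a TTR overload exists in this sketch, so the tag in the call makes the
// chosen pattern visible at the call site. The serial loop below stands in
// for the real Kokkos reduction (assumption).
template <typename Function, typename T>
inline void par_reduce_inner(InnerLoopPatternTTR, const int il, const int iu,
                             const Function &function, T &result) {
  for (int i = il; i <= iu; ++i) function(i, result);
}

// Hypothetical helper: sums the integers in [il, iu] through the tagged call.
inline int sum_range(const int il, const int iu) {
  int total = 0;
  par_reduce_inner(inner_loop_pattern_ttr_tag, il, iu,
                   [](const int i, int &acc) { acc += i; }, total);
  return total;
}

}  // namespace tag_sketch
```

In this sketch, calling `par_reduce_inner` with an `InnerLoopPatternTVR` argument would fail to compile because no such overload is defined, which mirrors the point above that TVR and SIMD variants would need their own implementations.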
Ah, right, you can explicitly tag loops! It seems better to be explicit, then, let me give it a go today.
As discussed during the sync last week, we'll merge this now, even though it is also being added in #1142: the latter requires some more detailed discussion, and this functionality should make it into the immediately upcoming release.
All credit for these to @pdmullen! I will fix bugs if they come up though.
At some point @pdmullen sent me some implementations of inner loops for performing reductions, i.e. par_for_inner but for par_reduce. This is useful if reducing to a vector, replacing values with an average, etc.

I am using the 1D variant in production happily, but haven't needed the 2D and 3D versions yet -- it seems difficult to believe they're incorrect though. I'm poking them upstream as a part of reducing the changeset between KHARMA's parthenon and develop, which is now quite small!

It might be useful to have a unit test for these, which I can add soon.
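As a rough illustration of the "replace values with an average" use case, and of the kind of check a unit test might make, here is a serial sketch. It is not the code in this PR: the `avg_sketch` namespace, the simplified 1D signatures (no team member, no loop pattern tag, no Kokkos reducer), and `replace_with_average` are all hypothetical, with plain loops standing in for the Kokkos inner loops.

```cpp
#include <vector>

namespace avg_sketch {

// Serial stand-in for a 1D par_for_inner (assumption: the real one runs
// inside a Kokkos team and takes a team member).
template <typename Function>
inline void par_for_inner(const int il, const int iu, const Function &function) {
  for (int i = il; i <= iu; ++i) function(i);
}

// Serial stand-in for a 1D par_reduce_inner: same loop shape as
// par_for_inner, but the body also receives an accumulator.
template <typename Function, typename T>
inline void par_reduce_inner(const int il, const int iu, const Function &function,
                             T &result) {
  for (int i = il; i <= iu; ++i) function(i, result);
}

// Hypothetical helper: reduce to a sum, then overwrite every entry with the mean.
inline void replace_with_average(std::vector<double> &v) {
  if (v.empty()) return;
  const int iu = static_cast<int>(v.size()) - 1;
  double sum = 0.0;
  par_reduce_inner(0, iu, [&](const int i, double &acc) { acc += v[i]; }, sum);
  const double avg = sum / static_cast<double>(v.size());
  par_for_inner(0, iu, [&](const int i) { v[i] = avg; });
}

}  // namespace avg_sketch
```

A real unit test would presumably run the 1D, 2D, and 3D variants inside an outer team loop and compare against a serial reference like this one.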
PR Checklist