Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add par_reduce_inner functions #1147

Merged
merged 6 commits into from
Aug 26, 2024
Merged

Add par_reduce_inner functions #1147

merged 6 commits into from
Aug 26, 2024

Conversation

bprather
Copy link
Collaborator

All credit for these to @pdmullen! I will fix bugs if they come up though.

At some point @pdmullen sent me some implementations of inner loops for performing reductions, i.e. par_for_inner but for par_reduce. This is useful if reducing to a vector, replacing values with an average, etc.

I am using the 1D variant in production happily, but haven't needed the 2D and 3D versions yet -- it seems difficult to believe they're incorrect though. I'm poking them upstream as a part of reducing the changeset between KHARMA's parthenon and develop, which is now quite small!

It might be useful to have a unit test for these, which I can add soon.

PR Checklist

  • Code passes cpplint
  • New features are documented.
  • Adds a test for any bugs fixed. Adds tests for new features.
  • Code is formatted
  • Changes are summarized in CHANGELOG.md
  • Change is breaking (API, behavior, ...)
    • Change is additionally added to CHANGELOG.md in the breaking section
    • PR is marked as breaking
    • Short summary API changes at the top of the PR (plus optionally with an automated update/fix script)
  • CI has been triggered on Darwin for performance regression tests.
  • Docs build
  • (@lanl.gov employees) Update copyright on changed files

@lroberts36
Copy link
Collaborator

Does #1142 provide this functionality @acreyes?

@acreyes
Copy link
Contributor

acreyes commented Aug 8, 2024

Does #1142 provide this functionality @acreyes?

It doesn't at the moment, but it should be straightforward to add

Copy link
Collaborator

@lroberts36 lroberts36 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved once comment is at least discussed.

Comment on lines 982 to 984
par_reduce_inner(team_mbr_t team_member, const int kl, const int ku, const int jl,
const int ju, const int il, const int iu, const Function &function,
T reduction) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these be

par_reduce_inner(InnerLoopPatternTTR, team_mbr_t team_member, const int kl, const int ku, const int jl,
                 const int ju, const int il, const int iu, const Function &function,
                 T reduction) {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. I didn't really think about it when dropping these in, and I don't think I fully understand the way loop pattern resolution is done anyway...
The way I understand it, since I use TVR_INNER_LOOP for ThreadVectorRange, these wouldn't resolve under that, right? So, we'd need a TVR and SIMD pattern for these too, if we wanted to make them selectable. Since these should be used pretty rarely, I don't know if the performance benefit of allowing customization is worth three implementations we're just going to stamp over with all the better general stuff you guys are writing for the next release.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see it both ways. With my suggestion, you would have to call parthenon::par_reduce_inner(inner_loop_pattern_ttr_tag, ...) instead of parthenon::par_reduce_inner(...), which would be more explicit that you aren't necessarily using the default inner loop pattern and conform to how the rest of the par_*_inner functions look. But this doesn't change anything about how this behaves.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right, you can explicitly tag loops! It seems better to be explicit, then, let me give it a go today.

Copy link
Collaborator

@pgrete pgrete left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed during the sync last week, we'll merge this now (despite also being added in #1142 as the latter requires some more detailed discussion and this functionality should make it into the immediately upcoming release).

@pgrete pgrete enabled auto-merge (squash) August 26, 2024 11:31
@pgrete pgrete merged commit 83aff4c into develop Aug 26, 2024
53 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants