group: refactor MPIR_Group #7235
base: main
576d5c7 to 0794b2f
test:mpich/ch3/most
Looking good overall. The reuse of map across groups needs some attention.
src/mpi/group/grouputil.c (outdated)
    newgrp->rank = rank;
    MPIR_Group_set_session_ptr(newgrp, session_ptr);
    newgrp->pmap.use_map = true;
    newgrp->pmap.u.map = map;
Is this shallow copy of map safe? Can MPICH code free a map before all groups using it are freed? We should probably refcount the map to safely reuse it across multiple MPIR_Group objects.
The map is owned by the group. The map is never shared. On the other hand, the MPIR_Group can be shared and it is reference counted.
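The ownership rule in the reply above can be illustrated with a minimal sketch. All names here (`sketch_group_t`, `sketch_group_create_map`, `sketch_group_release`) are hypothetical stand-ins, not the MPICH definitions; the sketch only assumes what the reply states: the group takes ownership of the map, the map is never shared, and the group itself is reference counted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the MPICH types. */
typedef int64_t sketch_lpid_t;

typedef struct sketch_group {
    int ref_count;          /* the group is shared, so it carries the refcount */
    int size;
    sketch_lpid_t *map;     /* owned exclusively by this group, never shared */
} sketch_group_t;

/* Takes ownership of `map`: the caller must not free or reuse it afterwards. */
static sketch_group_t *sketch_group_create_map(int size, sketch_lpid_t *map)
{
    sketch_group_t *grp = malloc(sizeof(*grp));
    if (grp == NULL)
        return NULL;
    grp->ref_count = 1;
    grp->size = size;
    grp->map = map;
    return grp;
}

static void sketch_group_add_ref(sketch_group_t *grp)
{
    grp->ref_count++;
}

/* Returns true when the last reference was dropped, in which case the
 * group and the map it owns have both been freed. */
static bool sketch_group_release(sketch_group_t *grp)
{
    assert(grp->ref_count > 0);
    if (--grp->ref_count > 0)
        return false;
    free(grp->map);
    free(grp);
    return true;
}
```

Under this assumption the shallow pointer copy is safe because the map's lifetime is tied to the single group that owns it; sharing happens one level up, on the group handle.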
1535d80 to 351cd7f
test:mpich/ch3/most
The last commit is likely missing some dependencies from the next PRs. Please review without the last commit. I'll drop the commit if I cannot resolve the issue in time.
LGTM up until last commit
This test requires access to MPICH internals and thus cannot be used with the current design.
We no longer use this file.
Hide the internal fields of MPIR_Group from unnecessary access. Outside group_util.c and group_impl.c, code only needs to assume the MPIR_Lpid integer type, the creation routines based on an lpid map or an lpid stride description, and the access routine that looks up the lpid for a group rank.
For most external usages, we only need MPIR_Group_rank_to_lpid.
Avoid accessing group internal fields.
Group similar functions together to facilitate refactoring. There are no changes in this commit other than moving functions around. The 4 incl/excl functions are very similar. The 3 difference/intersection/union functions are very similar.
Use MPIR_Group_{rank_to_lpid,lpid_to_rank} to avoid directly accessing MPIR_Group internal fields. For most group creation routines, just populate an lpid lookup map and call MPIR_Group_create_map to create the group.
* add option to use stride to describe group composition
* remove the linked list design
The unsigned type uint64_t is dangerous when we perform pmap stride math. A signed integer type should always work, and we can use assertions to check its range.
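A small sketch of the hazard described in this commit message. The function names and the `sketch_lpid_t` type are hypothetical; the sketch only assumes that a stride is computed as the difference between two consecutive lpids and that a descending pattern is possible.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical signed lpid type for this sketch. */
typedef int64_t sketch_lpid_t;

/* With an unsigned type, a descending pattern (e.g. 5, 3, 1) yields a
 * stride that wraps around to a huge positive value. */
static uint64_t stride_unsigned(uint64_t first, uint64_t second)
{
    return second - first;
}

/* With a signed type, the same difference is simply negative. */
static sketch_lpid_t stride_signed(sketch_lpid_t first, sketch_lpid_t second)
{
    return second - first;
}

/* Strided lookup with a range assertion, as the commit message suggests. */
static sketch_lpid_t strided_lpid(sketch_lpid_t offset, sketch_lpid_t stride, int rank)
{
    sketch_lpid_t lpid = offset + stride * rank;
    assert(lpid >= 0);
    return lpid;
}
```

The wrapped unsigned stride makes sign tests and range checks such as `stride > 0` misleading, whereas the signed version keeps the arithmetic readable and lets an assertion catch out-of-range results.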
* Add check_map_is_strided to detect strided pattern and convert a map into a strided pmap.
* Move internal static routines to the bottom of grouputil.c.
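One way such a detection could look, as a sketch only. The name `sketch_map_is_strided` and its signature are hypothetical and this is not the code of check_map_is_strided in the PR; it assumes the simplest case of a single constant stride between consecutive lpids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical signed lpid type for this sketch. */
typedef int64_t sketch_lpid_t;

/* Returns true if map[i] == offset + i * stride for every rank i.
 * On success the detected offset and stride are written to the output
 * parameters; on failure the outputs are left untouched. */
static bool sketch_map_is_strided(int size, const sketch_lpid_t *map,
                                  sketch_lpid_t *offset, sketch_lpid_t *stride)
{
    if (size < 1)
        return false;

    /* A single-element map is trivially strided; use stride 1 by convention. */
    sketch_lpid_t s = (size > 1) ? map[1] - map[0] : 1;
    for (int i = 2; i < size; i++) {
        if (map[i] - map[i - 1] != s)
            return false;
    }

    *offset = map[0];
    *stride = s;
    return true;
}
```

When the check succeeds, the caller can drop the explicit map and keep only the two integers, which is where the memory saving of a strided description comes from.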
test:mpich/ch3/most
Fixed the last commit. The culprit was |
Pull Request Description
Hide the internal fields of MPIR_Group from unnecessary access.
Outside group_util.c and group_impl.c, code only needs to assume the
MPIR_Lpid integer type, the creation routines based on an lpid map or an
lpid stride description, and the access routine that looks up the lpid
for a group rank.
Add feature to use stride to describe group composition
Remove the linked list design in MPIR_Group_pmap_t
[skip warnings]
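To make the map-versus-stride description concrete, here is a minimal sketch of a rank-to-lpid lookup. The `pmap.use_map` and `pmap.u.map` fields mirror the names visible in the reviewed diff; everything else (the `sketch_` types, the `stride` member and its `offset`/`stride` fields) is an assumption made for illustration and may differ from the actual MPIR_Group_pmap_t layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signed lpid type for this sketch. */
typedef int64_t sketch_lpid_t;

typedef struct sketch_pmap {
    bool use_map;                   /* true: explicit table, false: strided */
    union {
        sketch_lpid_t *map;         /* explicit rank -> lpid table */
        struct {                    /* assumed strided description */
            sketch_lpid_t offset;
            sketch_lpid_t stride;
        } stride;
    } u;
} sketch_pmap_t;

typedef struct sketch_group {
    int size;
    sketch_pmap_t pmap;
} sketch_group_t;

/* The only accessor most external code would need under this design. */
static sketch_lpid_t sketch_group_rank_to_lpid(const sketch_group_t *grp, int rank)
{
    assert(rank >= 0 && rank < grp->size);
    if (grp->pmap.use_map)
        return grp->pmap.u.map[rank];
    return grp->pmap.u.stride.offset + grp->pmap.u.stride.stride * rank;
}
```

Callers that go through an accessor like this do not need to know which representation a group uses, which is the point of hiding the internal fields.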
Plan
* [...] MPIR_Group so it can be memory-efficient (strided rank map) -- this PR
* [...] MPIR_Group rather than the other way around
* [...] lpid to be device independent, and device layer perform address exchange upon communicator creation.
* [...] MPI_COMM_WORLD.
* [...] MPIR_Group to represent MPIR_Pset
Author Checklist
Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
Commits are self-contained and do not do two things at once.
Commit message is of the form:
module: short description
Commit message explains what's in the commit.
Whitespace checker. Warnings test. Additional tests via comments.
For non-Argonne authors, check contribution agreement.
If necessary, request an explicit comment from your company's PR approval manager.