Do All-to-All collectives *must* synchronize? #971
Comments
The formulation "must synchronize" is probably too strong, but it will not be weaker than "Based on their semantics ... will synchronize (i.e., S1/S2 instead of W1/W2) provided that all counts and the size of all datatypes are larger than zero." Under the given condition (no empty messages!), all participating processes need to contribute something to the resulting buffer content of each process. This means that all processes need to start the collective operation before any process can complete its operation. So, from the user's perspective, the function behaves as a synchronizing call.
False, assuming no zero counts.
Because zero counts are allowed, absolutely no MPI collectives are guaranteed to synchronize except Barrier.
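To make the zero-count argument concrete, here is a minimal sketch in Python. It is a toy model of data dependencies, not MPI code: the `recvcounts` matrix and the helper names `must_wait_for` and `is_synchronizing` are hypothetical and only loosely mirror the receive counts of an alltoallv-style call. The assumption is that a rank needs a peer to have entered the call only if it receives a non-empty message from that peer.

```python
def must_wait_for(recvcounts):
    """Toy model: recvcounts[i][j] is the number of elements rank i
    receives from rank j. Rank i needs rank j to have entered the
    collective only if that count is larger than zero."""
    n = len(recvcounts)
    return {
        i: {j for j in range(n) if j != i and recvcounts[i][j] > 0}
        for i in range(n)
    }


def is_synchronizing(recvcounts):
    """True if every rank depends on data from every other rank, so that
    no rank can return before all ranks have started the call."""
    n = len(recvcounts)
    deps = must_wait_for(recvcounts)
    return all(deps[i] == set(range(n)) - {i} for i in range(n))


# All counts positive: every rank needs input from every other rank.
all_positive = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

# Rank 1 receives nothing from rank 2: rank 1 could return before rank 2 starts.
one_zero = [[1, 1, 1], [1, 1, 0], [1, 1, 1]]

# All counts zero: no rank needs anything from anyone.
all_zero = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

In this model only `all_positive` is synchronizing. An implementation may still choose to synchronize in the other two cases; the model only shows that the data flow does not force it to.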
It states
The question came up - and I doubt that's possible - whether there are implementations of non-synchronizing "complete" all-to-all collectives.
Alltoall, allgather, and allreduce can return (i.e., complete) in a process only if all processes of the communicator have provided their input data, i.e., have started the routine (provided that count > 0, which is clearly stated in Remark 18). That a collective routine can return in a process only when all processes of the communicator have started the corresponding call is exactly the definition of a barrier synchronization.
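The argument above can be illustrated with a small toy model, sketched here under loud assumptions: Python threads stand in for MPI processes, a shared dictionary stands in for the network, and `toy_allgather` is a hypothetical name, not an MPI routine. Each "rank" contributes one non-empty value and can only build its result once every contribution is present, so it cannot return before every rank has entered the call, even when one rank arrives late.

```python
import threading
import time


def toy_allgather(n_ranks, late_rank, delay=0.05):
    """Toy allgather over threads (not MPI). Returns, for each rank, the
    gathered result and how many ranks had entered the call at the moment
    that rank was able to return."""
    contributions = {}            # rank -> input data, filled on entry
    cond = threading.Condition()
    results = {}
    entered_at_return = {}

    def rank_main(rank):
        if rank == late_rank:
            time.sleep(delay)     # this rank starts the collective late
        with cond:
            # Entering the call is what makes the input data available.
            contributions[rank] = rank * 10
            cond.notify_all()
            # The result needs data from every rank, so wait for all of it.
            cond.wait_for(lambda: len(contributions) == n_ranks)
            results[rank] = [contributions[r] for r in range(n_ranks)]
            entered_at_return[rank] = len(contributions)

    threads = [threading.Thread(target=rank_main, args=(r,))
               for r in range(n_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, entered_at_return


results, entered = toy_allgather(n_ranks=4, late_rank=2)
```

In this sketch no rank returns until all four have entered, which is the barrier-like behavior described above. It says nothing about what happens when a contribution is empty, which is the case the zero-count objection is about.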
Problem
Remark 18 in Appendix A.2 (Summary of the Semantics of all Operation-Related MPI Procedures) states
In the March Forum meeting, the question came up whether the hard guarantee (must synchronize) is actually true, or whether there are cases that can be implemented (in some smart way) without requiring synchronization.
Proposal
Check whether that statement is actually true, and replace it if not.
Changes to the Text
Remove the hard requirement if it is not necessary.
Impact on Implementations
All-to-all collectives could be implemented in a non-synchronizing way
Impact on Users
Users can't rely on synchronization properties of collective routines
References and Pull Requests