Add support for extended data types in DART #462
…tination data types
Works for me. No performance changes for my examples. So everything fine. Nice work btw.
Why did you change the testing parameters?
Because according to CI the longest running test is HaloMatrixWrapperCyclic3D (>90s IIRC, now down to <30s). Is that a problem? It looked like your tests are designed to be flexible ;)
It's flexible :) With 150 elems per dimension it's possible to run tests with more ranks. But 100 is fine too.
test parameter changes - solved
That's actually a good point. Maybe we should set it to
I had this idea before, but you will get a problem with the reference matrix, which is built sequentially on rank 0. The bigger the NArray, the more memory is used, and at some point there is not enough memory left on one node.
OK I see, but to make it resilient we should add a cut-off to avoid unpleasant crashes in the tests when scaling the number of nodes (I will curse any author of tests that require debugging at scale... ;))
It's on my todo list. I will make it resilient, so you don't need to start playing with your voodoo dolls :)
This PR adds support for strided and indexed data types / data access patterns in DART. Extended data types are mapped to MPI types and handled by MPI.
New Functions
The DART interface is extended with the following functions:
Creates a strided data type. To keep the type flexible, the underlying MPI type is created on-the-fly based on the number of elements the caller wants to copy. The type itself is described by a stride and a single block length; it does not fix a total number of elements.
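To make the strided access pattern concrete, here is a minimal plain-C sketch that does not use DART or MPI. The helper `strided_gather` and its parameters are hypothetical, and the rule that the element count must be a multiple of the block length is an assumption about what a non-truncating transfer looks like.

```c
#include <stddef.h>

/* Hypothetical helper, not part of DART: gathers nelem ints from a strided
 * source into a contiguous destination. A block of `blocklen` elements
 * starts every `stride` elements in the source.
 * Returns 0 on success, -1 on invalid arguments or if nelem is not a
 * multiple of blocklen (assumed here to count as truncation). */
static int strided_gather(const int *src, int *dst, size_t nelem,
                          size_t stride, size_t blocklen)
{
  if (blocklen == 0 || stride < blocklen || nelem % blocklen != 0) {
    return -1;
  }
  /* The number of blocks is derived per call from the element count,
   * mirroring the idea of building the concrete type on-the-fly. */
  size_t nblocks = nelem / blocklen;
  for (size_t b = 0; b < nblocks; ++b) {
    for (size_t i = 0; i < blocklen; ++i) {
      dst[b * blocklen + i] = src[b * stride + i];
    }
  }
  return 0;
}
```

With a source holding 0..11, a stride of 4 and a block length of 2, gathering 6 elements picks 0, 1, 4, 5, 8, 9.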
Creates an indexed data type containing `count` blocks defined by individual offsets and block lengths. The `basetype` has to be a basic type, hence no nesting of extended types is allowed at the moment. This type is mapped directly to `MPI_Type_indexed`. The number of elements in this type is determined by the sum of all block lengths.
Destroys a type. This may be done before the completion of pending operations.
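For the indexed type, the following plain-C sketch (again without DART or MPI) shows how the element count follows from the block lengths and how such a pattern selects elements. The helpers `indexed_num_elem` and `indexed_gather` are hypothetical names; offsets are counted in elements of the base type, as displacements are in `MPI_Type_indexed`.

```c
#include <stddef.h>

/* Hypothetical helper: the number of elements described by an indexed
 * pattern is the sum of all block lengths. */
static size_t indexed_num_elem(size_t count, const size_t blocklen[])
{
  size_t n = 0;
  for (size_t b = 0; b < count; ++b) {
    n += blocklen[b];
  }
  return n;
}

/* Hypothetical helper: gathers `count` blocks from src into a contiguous
 * dst. Block b covers blocklen[b] elements starting at element offset[b].
 * Returns the number of elements written. */
static size_t indexed_gather(const int *src, int *dst, size_t count,
                             const size_t blocklen[], const size_t offset[])
{
  size_t out = 0;
  for (size_t b = 0; b < count; ++b) {
    for (size_t i = 0; i < blocklen[b]; ++i) {
      dst[out++] = src[offset[b] + i];
    }
  }
  return out;
}
```

For three blocks with lengths 1, 2, 3 at offsets 0, 3, 8 in a source holding 0..11, the pattern contains 6 elements: 0, 3, 4, 8, 9, 10.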
Extended Function Interfaces
All variants of `dart_put` and `dart_get` have been extended by a second type parameter. The types now describe the data type from which to copy and to which to copy (similar to MPI but with slightly different naming and parameter ordering). DART does not perform conversion between base types and hence errors out if the base types do not match. In contrast to MPI, we keep a single parameter for the number of elements. It is left to the caller to ensure that no truncation happens with both source and destination type. The user may mix both strided and indexed data types as long as no truncation occurs.
Example
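As a hedged illustration of these rules, a caller-side check could look like the plain-C sketch below. It is not DART code: `type_desc_t`, `fits` and `check_transfer` are hypothetical, and the interpretation of "no truncation" (the element count is a multiple of a strided block length, and equals the total size of an indexed type) is an assumption.

```c
#include <stddef.h>

/* Hypothetical descriptors, for illustration only. */
typedef enum { KIND_BASIC, KIND_STRIDED, KIND_INDEXED } type_kind_t;
typedef enum { BASE_INT, BASE_DOUBLE } base_type_t;

typedef struct {
  type_kind_t kind;
  base_type_t base;
  size_t blocklen;  /* strided only: elements per block */
  size_t num_elem;  /* indexed only: sum of all block lengths */
} type_desc_t;

/* Assumed truncation rule: returns 1 if nelem fits the type, 0 otherwise. */
static int fits(const type_desc_t *t, size_t nelem)
{
  switch (t->kind) {
    case KIND_STRIDED: return t->blocklen > 0 && nelem % t->blocklen == 0;
    case KIND_INDEXED: return nelem == t->num_elem;
    default:           return 1;  /* basic types accept any count */
  }
}

/* Returns 0 if a transfer of nelem elements from src type to dst type is
 * consistent, -1 on a base type mismatch, -2 on truncation. */
static int check_transfer(const type_desc_t *src, const type_desc_t *dst,
                          size_t nelem)
{
  if (src->base != dst->base) return -1;  /* no base type conversion */
  if (!fits(src, nelem) || !fits(dst, nelem)) return -2;
  return 0;
}
```

Under these assumptions, copying 6 elements from a strided `int` type with block length 2 into an indexed `int` type of 6 elements passes the check, while 4 elements would truncate the indexed type and a `double` destination would be rejected for its base type.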
The following would result in an error:
Misc
In the process, I tried to clean up the DART code surrounding the put/get functions to improve readability and reduce code duplication.
The Halo code has been adapted to the new interface. The strided and indexed get functions introduced with #452 have been removed.
Strided `dash::copy` has not been implemented as the copy code is under development in #410.
Closes #436.
Note: CI fails due to a bug in NastyMPI (dash-project/nasty-MPI#3)