datatype: map builtin MPI datatypes to internal types #7264

hzhou · 2025-01-15T05:49:58Z

Pull Request Description

NOTE: DO NOT MERGE this PR. Split into smaller PRs.

datatype: refactor to use internal types #7303

Many of the MPI Datatypes are redundant, for example, MPI_INT, MPI_INT32_t, MPI_INTEGER. The new ABI proposal requires setting some of these types at runtime, for example, MPI_Abi_set_fortran_info. In this PR, we create a set of fixed-size internal datatypes and then map all external builtin types to internal ones. This allows runtime setting and resetting any builtin types. And internally, we only need to support a fixed set.

MPI_Datatypes are defined by MPI Forum. It mixes alias types, different languages, and unnamed types.
C types are defined by C language. It contains aliases, mixed availability, and compiler variations.
Internal types are defined by MPICH. It is stable and deterministic. Can be used in a switch cases.
The handle bits works with current bit logic for builtin and type size, and also contains index to user-input type.

Discussion

For communication routines, such as MPI_Send, MPI_Bcast, we can replace builtin datatypes with MPIR_FIXED#, since all we care is the data size.
For reduction routines, such as MPI_Reduce, MPI_Accumulate, we can replace builtin datatypes with internal types (but not MPIR_FIXED#)
For datatype creation routines, I think we can replace all builtin "oldtype" with MPIR_FIXED#. We don't need worry about reduction op because we'll always rely on user op for them.
NOTE: once we convert to internal types, we'll not be able to perform strict type matching validation, such as matching MPI_INT to MPI_FLOAT. But we don't perform such validation today anyway. It is an extra overhead that we can't afford. In principle, we could perform strict type matching under e.g. --enable-error-checking=2

[skip warnings]

Author Checklist

Provide Description
Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
Commits Follow Good Practice
Commits are self-contained and do not do two things at once.
Commit message is of the form: module: short description
Commit message explains what's in the commit.
Passes All Tests
Whitespace checker. Warnings test. Additional tests via comments.
Contribution Agreement
For non-Argonne authors, check contribution agreement.
If necessary, request an explicit comment from your companies PR approval manager.

configure.ac

src/include/mpir_datatype.h

src/mpi/datatype/typeutil.c

configure.ac

src/include/mpi.h.in

dalcinl · 2025-01-22T04:49:19Z

@hzhou Is Intel 32bits (i386) something you do not care at all by now? If not, maybe adding a MPIR_FLOAT12 to support long double may be not that hard.

hzhou · 2025-01-22T15:20:49Z

@hzhou Is Intel 32bits (i386) something you do not care at all by now? If not, maybe adding a MPIR_FLOAT12 to support long double may be not that hard.

We still want to support i386. I am not sure about MPIR_FLOAT12. "long double" is 16-byte, 80-bit, 4-byte alignment on i386. I think we need to make it into its own category, e.g. just call it MPIR_LONG_DOUBLE.

dalcinl · 2025-01-22T17:39:39Z

"long double" is 16-byte

IIRC, "long double" is 12 bytes on on i386 (at least on Linux, but not Windows).
That's why I proposed MPIR_FLOAT12. This type would be NULL on x86_64.
However, MPIR_LONG_DOUBLE would work totally fine, maybe even better, and it meaning would just be "whatever the platform's C long double is", so there is no ambiguity.

hzhou · 2025-01-22T18:08:29Z

Here is my current scheme:

MPIR_FLOAT[2,4,8,16]  /* for IEEE 754 floating points */
MPIR_FLOAT_ALT[2,12,16] /* for alternative format, e.g. BFLOAT16, long double on i386 and x86_64 */

Hopefully we only ever support 1 alternative format. But if more is needed, we'll need new category names.

As you pointed out that long double may have different sizes depend on platforms, but my design goals is to have fixed handle bits and semantics for internal types.

dalcinl · 2025-01-23T06:19:43Z

MPIR_FLOAT[2,4,8,16] /* for IEEE 754 floating points /
MPIR_FLOAT_ALT[2,12,16] / for alternative format, e.g. BFLOAT16, long double on i386 and x86_64 */

This is great for the time being!

Refactor MPIR_Type_match_size_impl and support MPIX_TYPECLASS_LOGICAL. NOTE: now the fixed-width types are always available, we only need match one of fixed-width types. Check whether reduction op is available in case for example we don't have a matching C native type.

* External32 format converts types the original types rather than the internal types. * The error reporting need report to user the original datatypes rather than the internal ones. * Reduce_local need call user op function with the original datatypes.

Update for the new internal pairtypes.

The MPI standard didn't list MPI_BYTE as a valid type for MPI_MAX. However, I bet any coder would think it is a sensible to compare max/min of two byte values. Thus, we (mpich implementation) will allow it. Modify the test to check the error case of MPI_FLOAT and MPI_LAND instead.

Add a missing newline in error messages in reduce_local.c. Limit the number of error messages in atomic_rmw_cas.c.

This has been passed and merged by MPI Forum. Assuming the next MPI standard will be ratified before next major mpich release, we are directly using the MPI_ prefix rather than MPIX_ prefix. Also add MPI_TYPECLASS_LOGICAL for MPI_Type_match_size API. reference: mpi-forum/mpi-issues#699 mpi-forum/mpi-standard#963

This serves as an example how we add a new builtin mpi datatype. 1. define the constant in mpi.h.in 2. (optional) define the internal datatype in mpir_datatype.h if there isn't one already 2a. add alignment in MPIR_Datatype_builtintype_alignment 2b. add mapping in MPII_Typerep_get_yaksa_type 3. define the mapping in configure.ac 4. (optional) define case for the supported reduction op

Provide half-precision float sum operation by casting to C float.

Create MPIR_op_dt_check for reduction op-type validation in the binding-layer, where datatype is user-input mpi datatypes, and use MPIR_Internal_op_dt_check for op-type check where datatype is an internal datatype. The binding-layer validation follows the text literally from the MPI standard, while the internal checks are more flexible. For example, internally byte/char is an integer and are allowed for all integer operations. But externally, users only can use byte for bit-logic operations and char is invalid for any reduction op.

With internal types, the matched_datatype if matched will always be good for op. Thus the check_dtype is no longer necessary.

We centralized the the op/type check with MPIR_op_dt_check and MPIR_Internal_op_dt_check.

Now each op functions are quite simple and share a lot of commonalities, it is easier to maintain moving them into a single source file.

MPI standard does not list MPI_CHAR as a valid type for reduction using builtin ops. Also, multi language types such as MPI_AINT are not allowed in logical ops.

Allowing MPI_CHAR with op or MPI_AINT etc. with logical op are non-standard. However, MPICH used to allow it. Thus, this commit preserves the old behavior.

hzhou · 2025-02-04T20:26:19Z

test:mpich/ch3/most
test:mpich/ch4/most

dalcinl · 2025-02-10T07:03:19Z

confdb/aclocal_datatype.m4

 get_c_float_type() {
    len=$[]1
    pac_retval=
-    for c_type in float double __float128 "long_double" ; do
+    for c_type in float double _Float16 __float128 ; do


There is a problem here. On i386/x86_64, the C "long double" type does not correspond to __float128, but to GCC's type __float80. However, note that sizeof(__float80) == 16 as well as sizeof(__float128) == 16. Therefore a selection logic based on sizeof may not actually work.

To clarify: May concern is about MPI_LONG_DOUBLE. On i386/x86_64, it should map to the GCC type __float80, or what is called "original long double" in section 1.2 here: https://gcc.gnu.org/wiki/Ieee128PowerPC.

I hard-coded long double here - https://github.com/pmodels/mpich/pull/7264/files?show-deleted-files=true&show-viewed-files=true&file-filters%5B%5D=#r1949556370

Your link is not working on my end. However, if you are aware of the issue and MPICH will not eventually confuse "long double" with "__float128", then all good with me.

#7264 (comment)

The link is to the comment below.

hzhou · 2025-02-10T17:13:04Z

configure.ac

+len=`expr $ac_cv_sizeof_double \* 8`
+AC_DEFINE_UNQUOTED([MPIR_DOUBLE_INTERNAL],        [MPIR_FLOAT$len], [Internal type for MPI_DOUBLE])
+len=`expr $ac_cv_sizeof_long_double \* 8`
+AC_DEFINE_UNQUOTED([MPIR_LONG_DOUBLE_INTERNAL],   [MPIR_ALT_FLOAT$len], [Internal type for MPI_LONG_DOUBLE])


Hard code long double to MPIR_ALT_FLOAT{12,16} here, assuming long double is not __float128.

FIXME: the assumption may not be true for some compilers. We need add configure test for the actual floating format. I am kicking the can down the road until some one report the issue.

Indeed, the GCC manual page is laughing in our faces 🤣 🤣 🤣 with option like the following:

-m128bit-long-double -m96bit-long-double -mlong-double-64 -mlong-double-80 -mlong-double-128

We also hard-define MPIR_ALT_FLOAT{12,16} to long double (see comment below), so we are good either way. We can keep adding internal _ALT_ types to deal with any odd types. This won't be good for maintainability, but hopefully, there won't be many. :)

hzhou · 2025-02-10T20:23:46Z

configure.ac

+if test "$ac_cv_sizeof_long_double" -gt 0 ; then
+    len=`expr $ac_cv_sizeof_long_double \* 8`
+    AC_DEFINE_UNQUOTED([MPIR_ALT_FLOAT${len}_CTYPE],[long double], [The C type for MPIR_ALT_FLOAT${len}])
+fi


NOTE: MPIR_ALT_FLOAT{12,16}_CTYPE is hard-coded to long double.

mjwilkins18

Great work @hzhou! Just a few small comments to think about. I do not see any blockers

mjwilkins18 · 2025-02-11T20:02:10Z

src/include/mpir_datatype.h

+#define MPIR_TYPE_ALT_COMPLEX     0x060000      /* e.g. 2-byte Bfloat, 16-byte C long double */
+#define MPIR_TYPE_FORTRAN_LOGICAL 0x070000      /* Fortran logical may need conversions at op */
+
+#define MPIR_FIXED0             ((MPI_Datatype)0x4c800000)      /* 0-size, internally equivalent to MPI_DATATYPE_NULL */


[datatype: add internal datatypes] Should MPIR_FIXED0 move down to be with the other 0-length types and defined as DATATYPE_NULL instead of the value directly?

MPIR_FIXED0 are internal types for 0-byte legitimate types such as MPI_LB/UB. MPIR_INT0 etc. are convenience macros for non-supported MPI types. Thus, they are not the same.

mjwilkins18 · 2025-02-12T20:48:12Z

src/include/mpir_datatype.h

+
+/* builtin pair types - MPI_2INT, MPI_2INTEGER, etc. They have the MPIR_TYPE_PAIR_MASK on.
+ * NOTE: value index types such as MPI_FLOAT_INT internally are not builtin */
+#define MPIR_2INT8              ((MPI_Datatype)0x4cc10200)


[datatype: add internal datatypes] Should we define these relative to MPIR_TYPE_PAIR_MASK and their single types? For example, MPIR_2INT8 = MPIR_TYPE_PAIR_MASK | MPIR_INT8

MPIR_2INT8 != MPIR_TYPE_PAIR_MASK | MPIR_INT8. The size bits are different.

mjwilkins18 · 2025-02-13T15:28:58Z

src/mpi/datatype/pairtypes.c

@@ -6,126 +6,133 @@
 #include <mpiimpl.h>
 #include <stdlib.h>
 #include <limits.h>
+#include "mpir_op_util.h"
+
+#define MPIR_MAX_PAIRTYPES 20   /* some number we know it is sufficient */


[datatype: revamp pairtype creations] I worry this number would be missed if someone added more datatypes later on. Maybe we define NUM_DATATYPES with the MPI datatypes in the previous commits and then set this equal to that?

There is an assertion in create_pairtype so we wouldn't miss it if we ever hit this limit. This is a local macro in pairtypes.c so it should be an easy task to figure out what to do when the assertion triggers. On the other hand, adding long-distance logic, e.g. define the macro in mpir_datatype.h, can be harder to maintain.

mjwilkins18 · 2025-02-13T16:33:19Z

test/mpi/errors/coll/rerr.c

@@ -14,7 +14,8 @@ int verbose = 0;

 int main(int argc, char *argv[])


[test: do not test error case of MPI_BYTE and MPI_MAX] I remember we had a discussion around this issue with MPI_CHAR. Did we already previously allow MPI_BYTE+MPI_MAX? If not, I would not allow new, non-standard functionality in this PR.

Agree. This is an example of the kind of behavior altering changes that it would be nice to isolate in their own PRs so they can be evaluated separately from the revamp/refactoring.

hzhou · 2025-02-13T22:37:50Z

@raffenet @mjwilkins18 I am changing this PR into draft. The first split is here: #7303

hzhou mentioned this pull request Jan 15, 2025

mpi: Add MPI_LOGICAL{1,2,4,8,16} and MPI_TYPECLASS_LOGICAL #6949

Closed

4 tasks