RFC: Provide equivalence of MPICH_ASYNC_PROGRESS #13088
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -544,9 +544,21 @@ fi | |
AC_DEFINE_UNQUOTED([OPAL_ENABLE_GETPWUID], [$opal_want_getpwuid], | ||
[Disable getpwuid support (default: enabled)]) | ||
|
||
dnl We no longer support the old OPAL_ENABLE_PROGRESS_THREADS. At | ||
dnl some point, this should die. | ||
AC_DEFINE([OPAL_ENABLE_PROGRESS_THREADS], | ||
[0], | ||
[Whether we want BTL progress threads enabled]) | ||
# | ||
# Disable progress threads | ||
# | ||
AC_MSG_CHECKING([if want asynchronous progress threads]) | ||
AC_ARG_ENABLE([progress_threads], | ||
[AS_HELP_STRING([--disable-progress-threads], | ||
[Disable asynchronous progress threads. Note that when enabled, for performance-related reasons, the progress thread is still not spawned by default. Users must enable the MCA variable 'opal_async_progress' or 'mpi_async_progress' to have the progress thread spawned at runtime. (default: enabled)])]) | ||
if test "$enable_progress_threads" = "no"; then | ||
AC_MSG_RESULT([no]) | ||
opal_want_progress_threads=0 | ||
else | ||
AC_MSG_RESULT([yes]) | ||
opal_want_progress_threads=1 | ||
fi | ||
AC_DEFINE_UNQUOTED([OPAL_ENABLE_PROGRESS_THREADS], [$opal_want_progress_threads], | ||
I saw the comment about maintaining the If there is no bug in the old support for progress thread maintain the code as much as possible.
If you are referring to @devreal's comment https://github.com/open-mpi/ompi/pull/13088/files/41a1b44d2d371250adfdd8311cdf566b1162e42c#r2054858614 , I believe his idea was not to remove completely |
||
[Disable BTL asynchronous progress threads (default: enabled)]) | ||
|
||
])dnl |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -39,6 +39,7 @@ same command). | |
prerequisites | ||
pmix-and-prrte | ||
scheduling | ||
progress_thread | ||
|
||
localhost | ||
ssh | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
.. _async-progress-thread-label: | ||
|
||
Asynchronous progress thread | ||
============================ | ||
|
||
Open MPI provides experimental support for a software-based asynchronous | ||
progress thread. This thread runs the internal progress engine in the | ||
background to advance non-blocking communication that overlaps with | ||
computation. | ||
|
||
Enabling progress thread at configuration time | ||
---------------------------------------------- | ||
|
||
The feature can be enabled or disabled at configuration time by passing | ||
``--enable-progress-threads`` or ``--disable-progress-threads`` to | ||
``configure``. The default state is enabled. | ||
|
||
Enabling progress thread at runtime | ||
----------------------------------- | ||
|
||
Even when Open MPI is configured and built with ``--enable-progress-threads``, | ||
the progress thread is still deactivated at runtime by default. | ||
|
||
The progress thread can be activated by setting one of the following | ||
MCA boolean variables on the ``mpirun`` command line or in the environment: | ||
|
||
.. code-block:: sh | ||
|
||
shell$ mpirun --mca opal_async_progress 1 ... | ||
shell$ mpirun --mca mpi_async_progress 1 ... | ||
shell$ OMPI_MCA_opal_async_progress=1 mpirun ... | ||
shell$ OMPI_MCA_mpi_async_progress=1 mpirun ... | ||
|
||
Note that ``mpi_async_progress`` is a synonym for ``opal_async_progress``. | ||
|
||
.. warning:: Progress threads are a somewhat complicated issue. Activating them | ||
at run time may improve the overlap of communication and computation in | ||
your application (particularly one that uses non-blocking communication), | ||
which can improve overall performance. However, there may be unintended | ||
consequences that degrade overall application performance. | ||
Users are advised to experiment and see what works best for their | ||
applications. | ||
|
||
Rationale | ||
--------- | ||
|
||
One potentially beneficial use case for the software progress thread is | ||
*intra-node shared-memory non-blocking* communication on high core-count CPUs | ||
where the application does not use all available cores, or where some cores | ||
are reserved for communication tasks. In such configurations, the | ||
latency of some non-blocking collective operations (e.g. ``MPI_Ireduce()``) | ||
can be improved because the arithmetic operations are performed in the | ||
background by the progress thread, instead of being deferred until the main | ||
thread calls ``MPI_Wait()``. | ||
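The effect described above can be illustrated without MPI at all. The following is only a minimal sketch using POSIX threads: every ``sketch_*`` name is hypothetical and none of it is Open MPI code. It shows the reduction arithmetic running on a background thread, so that the "wait" step is reduced to joining that thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a pending non-blocking reduction request. */
typedef struct {
    const double *input;   /* values to reduce */
    size_t        count;   /* number of values */
    double        result;  /* written by the worker, read after the join */
    pthread_t     worker;  /* background thread doing the arithmetic */
} sketch_ireduce_t;

/* Runs on the background thread: the summation happens here,
 * not on the thread that started the request. */
static void *sketch_reduce_worker(void *arg)
{
    sketch_ireduce_t *req = arg;
    double sum = 0.0;
    for (size_t i = 0; i < req->count; ++i) {
        sum += req->input[i];
    }
    req->result = sum;
    return NULL;
}

/* Start the reduction and return immediately (loosely analogous to
 * MPI_Ireduce). Returns 0 on success, like pthread_create. */
static int sketch_ireduce_start(sketch_ireduce_t *req,
                                const double *input, size_t count)
{
    req->input  = input;
    req->count  = count;
    req->result = 0.0;
    return pthread_create(&req->worker, NULL, sketch_reduce_worker, req);
}

/* Complete the request (loosely analogous to MPI_Wait). The arithmetic
 * was done in the background, so only a join remains here. */
static double sketch_ireduce_wait(sketch_ireduce_t *req)
{
    int rc = pthread_join(req->worker, NULL);
    assert(rc == 0);
    (void) rc;  /* keep builds with NDEBUG free of unused warnings */
    return req->result;
}
```

A caller would invoke ``sketch_ireduce_start()``, carry on with unrelated computation, and call ``sketch_ireduce_wait()`` only when the result is needed. Without a background thread, the summation would instead run inside the wait call on the main thread.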
|
||
Conversely, on systems where *inter-node communications* are already | ||
offloaded to dedicated hardware, enabling the software-based progress threads | ||
could degrade performance, since the additional thread will force progress up | ||
through the CPU and potentially away from more optimized hardware functionality. | ||
|
||
For these performance reasons, the progress thread is not activated (spawned) | ||
by default at runtime. It is up to developers to decide whether to switch on | ||
the progress thread, depending on their application and system setup. | ||
|
||
Limitations | ||
----------- | ||
|
||
#. The current implementation does not yet support binding the progress | ||
thread to a specific core (or set of cores). | ||
|
||
#. There are still some hard-coded constant parameters in the code that | ||
would require further tuning. | ||
|
||
#. Some multi-threading overhead has been observed to degrade performance | ||
for small buffers. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -193,9 +193,10 @@ int ompi_mpi_finalize(void) | |
opal_atomic_swap_32(&ompi_mpi_state, | ||
OMPI_MPI_STATE_FINALIZE_PAST_COMM_SELF_DESTRUCT); | ||
|
||
#if OPAL_ENABLE_PROGRESS_THREADS == 0 | ||
opal_progress_set_event_flag(OPAL_EVLOOP_ONCE | OPAL_EVLOOP_NONBLOCK); | ||
#endif | ||
/* shutdown async progress thread before tearing down further services */ | ||
if (opal_async_progress_thread_spawned) { | ||
This is okay for now but does leave a hole to plug for the sessions model. Since this is an opt-in feature for the application user, it should be okay for now.
Do you mind elaborating on this, @hppritcha? I fail to see the issue with the sessions model. |
||
opal_progress_shutdown_async_progress_thread(); | ||
} | ||
|
||
/* NOTE: MPI-2.1 requires that MPI_FINALIZE is "collective" across | ||
*all* connected processes. This only means that all processes | ||
|
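The guarded shutdown in the hunk above follows a common opt-in pattern: spawn the thread only when requested, record that fact in a flag, and join the thread at finalize time only if the flag is set. Below is a self-contained sketch of that pattern using POSIX threads and C11 atomics. All ``sketch_*`` names are hypothetical; the flag merely plays the role of ``opal_async_progress_thread_spawned``, and nothing here reproduces the PR's actual implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

static pthread_t   sketch_thread;
static atomic_bool sketch_stop;            /* set to ask the loop to exit */
static atomic_long sketch_progress_calls;  /* counts progress passes */
static bool        sketch_thread_spawned = false;

/* Background loop: each iteration stands in for one progress pass. */
static void *sketch_progress_loop(void *arg)
{
    (void) arg;
    do {
        atomic_fetch_add(&sketch_progress_calls, 1);
        sched_yield();  /* be polite to other threads in this toy loop */
    } while (!atomic_load(&sketch_stop));
    return NULL;
}

/* Spawn the thread only when the user opted in; otherwise do nothing. */
static int sketch_progress_init(bool user_opted_in)
{
    if (!user_opted_in) {
        return 0;
    }
    atomic_store(&sketch_stop, false);
    int rc = pthread_create(&sketch_thread, NULL, sketch_progress_loop, NULL);
    if (rc == 0) {
        sketch_thread_spawned = true;
    }
    return rc;
}

/* Ask the loop to stop and wait for the thread to exit. */
static void sketch_progress_shutdown(void)
{
    atomic_store(&sketch_stop, true);
    pthread_join(sketch_thread, NULL);
    sketch_thread_spawned = false;
}

/* Finalize path: shut the thread down first, and only if it exists. */
static void sketch_finalize(void)
{
    if (sketch_thread_spawned) {
        sketch_progress_shutdown();
    }
    /* ... tearing down further services would follow here ... */
}
```

Joining before any further teardown means the background loop can never touch a service that has already been released, which is the ordering the comment in the hunk asks for.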