Commit 917f6e8

author
Minh Quan Ho
committed
docs: add documentation on async progress thread
Signed-off-by: Minh Quan Ho <[email protected]>
1 parent 31e8553 commit 917f6e8

3 files changed

+89
-0
lines changed

docs/installing-open-mpi/configure-cli-options/misc.rst

Lines changed: 13 additions & 0 deletions
Original file line number / Diff line number / Diff line change
@@ -34,6 +34,19 @@ above categories that can be used with ``configure``:
3434
.. danger:: The heterogeneous functionality is currently broken |mdash|
3535
do not use.
3636

37+
* ``--enable-progress-threads``
38+
* ``--disable-progress-threads``:
39+
Enable or disable (default = ``enabled``) support for a software-based progress
40+
thread in each MPI process that executes the internal communication progression
41+
engine. Note that even when the support is built, the progress thread is not
42+
spawned by default at runtime. This behavior is controlled by the associated
43+
runtime MCA variable ``opal_async_progress`` or ``mpi_async_progress``
44+
(default = false).
45+
46+
.. warning:: Activating the progress thread can degrade performance. Please read
47+
:ref:`this section <async-progress-thread-label>` for
48+
more documentation.
49+
3750
.. _install-wrapper-flags-label:
3851

3952
* ``--with-wrapper-cflags=CFLAGS``

docs/launching-apps/index.rst

Lines changed: 1 addition & 0 deletions
@@ -39,6 +39,7 @@ same command).
3939
prerequisites
4040
pmix-and-prrte
4141
scheduling
42+
progress_thread
4243

4344
localhost
4445
ssh
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
1+
.. _async-progress-thread-label:
2+
3+
Asynchronous progress thread
4+
============================
5+
6+
Open MPI provides experimental support for a software-based asynchronous
7+
progress thread. This thread runs Open MPI's internal progression engine in
8+
the background so that non-blocking communication can advance while the
9+
application computes.
10+
11+
Enabling progress thread at configuration time
12+
----------------------------------------------
13+
14+
The feature can be enabled or disabled at configuration time by passing
15+
``--enable-progress-threads`` or ``--disable-progress-threads`` to
16+
``configure``. The default state is enabled.
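
As an illustration of the option described above, a build that requests
progress thread support explicitly might look like the following sketch.
The ``--prefix`` path is a hypothetical placeholder, and any other
``configure`` options your site needs are elided:

.. code-block:: sh

   # /opt/openmpi is an assumed install location; substitute your own
   shell$ ./configure --prefix=/opt/openmpi --enable-progress-threads ...
   shell$ make all install

Because the default state is already enabled, passing
``--enable-progress-threads`` mainly documents the intent in build scripts.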
17+
18+
Enabling progress thread at runtime
19+
-----------------------------------
20+
21+
Even when Open MPI is configured and built with ``--enable-progress-threads``, the
22+
progress thread remains deactivated at runtime by default.
23+
24+
The progress thread can be activated by setting either of the following
25+
boolean MCA variables on the launch command line or in the environment:
26+
27+
.. code-block:: sh
28+
29+
shell$ mpirun --mca opal_async_progress 1 ...
30+
shell$ mpirun --mca mpi_async_progress 1 ...
31+
shell$ OMPI_MCA_opal_async_progress=1 mpirun ...
32+
shell$ OMPI_MCA_mpi_async_progress=1 mpirun ...
33+
34+
Note that ``mpi_async_progress`` is a synonym for ``opal_async_progress``.
35+
36+
.. warning:: Progress threads are a somewhat complicated issue. Activating them
37+
at run time may improve overlap of communication and computation in
38+
your application (particularly those with non-blocking communication)
39+
which can improve overall performance. However, there may be unintended
40+
consequences which may degrade overall application performance.
41+
Users are advised to experiment and see what works best for their
42+
applications.
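
One simple way to follow the advice above is to run the same job twice,
once with the progress thread off and once with it on, and compare the
timings. This is only a sketch: ``./my_app`` and the process count of 4
are hypothetical placeholders, and wall-clock time from ``time`` is a
coarse measure that should be repeated several times before drawing
conclusions:

.. code-block:: sh

   # Baseline: progress thread deactivated (the runtime default)
   shell$ time mpirun --mca opal_async_progress 0 -n 4 ./my_app

   # Same job with the progress thread activated
   shell$ time mpirun --mca opal_async_progress 1 -n 4 ./my_app

Both runs should use the same nodes and process placement so that the
progress thread setting is the only difference between them.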
43+
44+
Rationale
45+
---------
46+
47+
One potentially beneficial use case for the software progress thread is *intra-node
48+
shared-memory non-blocking* communication on high core-count CPUs where the
49+
application may not use all the available cores, or where the CPU has some
50+
reserved cores dedicated to communication tasks. In such configurations, the
51+
latency of some non-blocking collective operations (e.g. ``MPI_Ireduce()``)
52+
can be improved because the arithmetic operations are performed in the
53+
background by the progress thread, instead of being deferred to the main
54+
thread during ``MPI_Wait()``.
55+
56+
Conversely, on systems where *inter-node communications* are already
57+
offloaded to dedicated hardware, enabling the software-based progress threads
58+
could degrade performance, since the additional thread will force progress up
59+
through the CPU and potentially away from more optimized hardware functionality.
60+
61+
For these performance reasons, the progress thread is not activated (spawned)
62+
by default at runtime. It is up to developers to decide whether to switch on the
63+
progress thread, depending on their application and system setup.
64+
65+
Limitations
66+
-----------
67+
68+
#. The current implementation does not yet support binding the progress
69+
thread to a specific core (or set of cores).
70+
71+
#. There are still some hard-coded constant parameters in the code that
72+
would require further tuning.
73+
74+
#. Multi-threading overhead has been observed to reduce performance
75+
on small buffers.
