Collective hanging ibm/allgather on main branch #10318

@wckzhang

Open MPI main branch (918fe01)

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Installed from git clone

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

e4c20e2 3rd-party/openpmix (v1.1.3-3506-ge4c20e22)
9ae73d4d97f843fac994103f2232f6570baaba26 3rd-party/prrte (psrvr-v2.0.0rc1-4350-g9ae73d4d97)

Please describe the system on which you are running

Amazon Linux 2


Details of the problem

Open MPI main branch hangs when running the following ompi-tests test:

mpirun -np 2 -N 1 --mca btl tcp --mca coll_base_verbose 100 --hostfile ~/hostfile -x FI_LOG_LEVEL=warn ~/ompi-tests/ibm/collective/intercomm/allgather_inter

I'm pretty sure the run is also using the basic collective component, which doesn't seem to be intended. Maybe we have a bug in component selection.
