Thank you for taking the time to submit an issue!
Background information
The UCX OSC component includes an optimization for MPI_Fetch_and_op(). Unfortunately, this optimization leads to incorrect results when mixing MPI_Fetch_and_op() with MPI_Accumulate().
What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)
master, v3.1.0
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Built from git checkout
Please describe the system on which you are running
- Operating system/version: Linux nid00020 4.4.49-92.11.1_3.0-cray_ari_c #1 SMP Mon Dec 11 23:32:19 UTC 2017 (3.0.99) x86_64 x86_64 x86_64 GNU/Linux
- Computer hardware: Cray XC-40
- Network type: Aries
Details of the problem
See the following program. This program will be placed into MTT today:
https://gist.github.com/hjelmn/c8e54a8a6526b939703a6b894f186bab
The program is simple. Each rank performs an MPI_Accumulate() of 1024 int32_t's on its left neighbor and an MPI_Fetch_and_op() on its right neighbor. This is a valid MPI program, and it fails with osc/ucx. It passes with osc/rdma.
If this isn't fixed by v3.1.0, I recommend we software-disable the osc/ucx component until it is fixed, since this is a correctness issue.