Description
Describe the bug
The offload section for -fsycl-targets=nvptx64-nvidia-cuda-sycldevice in .o files used to be called sycl-nvptx64-nvidia-cuda-sycldevice, but now it is called sycl-nvptx64-nvidia-cuda-sycldevice-sm_50. There was no corresponding change to the clang-offload-bundler invocations that extract the offload bundles; they still use the old name. As a result, unbundling does not extract any offload bundles, the device code compilation at link time doesn't compile any user code, and the program fails at runtime because it doesn't have the needed device kernels.
This started with the July 9 pulldown: 033ff5e
I don't see why the name was changed. If there is a good reason for it, then it needs to be done consistently at least, because this is a showstopper.
To Reproduce
A minimal example doesn't even need to execute. Create these files:
$ cat test.cpp
void dummy(){}
$ cat main.cpp
void dummy();
int main() { dummy(); }
Then compile them separately and look at the offload bundles' IDs:
$ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice -c main.cpp test.cpp
clang-13: warning: Unknown CUDA version. cuda.h: CUDA_VERSION=11030. Assuming the latest supported version 10.1 [-Wunknown-cuda-version]
$ clang-offload-bundler -type=o -inputs=main.o -list
sycl-nvptx64-nvidia-cuda-sycldevice-sm_50
host-x86_64-unknown-linux-gnu
$ clang-offload-bundler -type=o -inputs=test.o -list
sycl-nvptx64-nvidia-cuda-sycldevice-sm_50
host-x86_64-unknown-linux-gnu
Now examine the linker flow:
$ clang++ -### -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice main.o test.o >& toolchain.log
$ grep unbundle toolchain.log
"/home/larry/sycl-with-cuda/llvm-build/install/bin/clang-offload-bundler" "-type=o" "-targets=host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda-sycldevice" "-inputs=main.o" "-outputs=/tmp/main-0b169c.o,/tmp/main-94af99.o" "-unbundle" "-allow-missing-bundles"
...
You can see that the unbundle invocation is using the old sycl-nvptx64-nvidia-cuda-sycldevice name and therefore no offload bundles are extracted.
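The mismatch can also be checked mechanically. The following is only a rough sketch: `missing_targets` is a hypothetical helper name, not part of the toolchain, and it does plain string comparison on the `-targets=` value and the IDs printed by `-list`. It prints every requested target that has no exactly matching bundle ID.

```shell
# missing_targets TARGETS BUNDLE_IDS   (hypothetical helper, not a toolchain command)
#   TARGETS:    comma-separated list, as passed to -targets= in the unbundle step
#   BUNDLE_IDS: space- or newline-separated IDs, as printed by -list
# Prints each requested target that has no exactly matching bundle ID.
missing_targets() {
    # Split the comma-separated target list into one target per line.
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r target; do
        # grep -x: whole-line match, -F: fixed string, -q: no output.
        if ! printf '%s\n' "$2" | tr ' ' '\n' | grep -qxF -- "$target"; then
            printf '%s\n' "$target"
        fi
    done
}

# Values copied from the logs in this report.
requested="host-x86_64-unknown-linux-gnu,sycl-nvptx64-nvidia-cuda-sycldevice"
present="sycl-nvptx64-nvidia-cuda-sycldevice-sm_50 host-x86_64-unknown-linux-gnu"
missing_targets "$requested" "$present"
```

With the compiler installed, the second argument could instead come from `"$(clang-offload-bundler -type=o -inputs=main.o -list)"`. Because `-allow-missing-bundles` is passed, the unbundle step appears to tolerate the missing target silently, which is why a check like this is needed to spot the problem.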
I used the --cuda flag to buildbot when building the compiler. If you want a convenient executable you can use BabelStream for SYCL.
Environment (please complete the following information):
- OS: Linux
- Target device and vendor: NVIDIA GPU
- DPC++ version: 033ff5e
- Dependencies version: None