Closed
Labels: bug (Something isn't working), compiler (Compiler related issue), cuda (CUDA back-end)
Description
When compiling with -fsycl-targets=nvptx64-nvidia-opencl-sycldevice,spir64-unknown-opencl-sycldevice
device binaries are created for both nvptx64 and spir64.
The problem is that the entries table of the nvptx64 binary is empty (see the "Entries" field in the dump below), so program_manager cannot associate the kernel with that image and selects the spir64 binary even when running on the CUDA back-end.
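To illustrate why an empty entries table matters, here is a minimal sketch of a kernel-name lookup. It is a simplified model and not the actual program_manager code: `DeviceImage`, `buildKernelIndex` and `candidateTargets` are hypothetical names, and the short kernel name used in the usage note is a placeholder for the mangled name in the log. The assumption is that candidate images are found through an index built from each image's entries.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a device image descriptor.
struct DeviceImage {
  std::string Target;               // e.g. "spir64" or "nvptx64"
  std::vector<std::string> Entries; // kernel names offered by this image
};

// Build a kernel-name -> images index from each image's entries table.
// An image with no entries contributes nothing to the index.
std::map<std::string, std::vector<const DeviceImage *>>
buildKernelIndex(const std::vector<DeviceImage> &Images) {
  std::map<std::string, std::vector<const DeviceImage *>> Index;
  for (const DeviceImage &Img : Images)
    for (const std::string &Name : Img.Entries)
      Index[Name].push_back(&Img);
  return Index;
}

// Return the targets of all images that are candidates for the given kernel.
std::vector<std::string>
candidateTargets(const std::vector<DeviceImage> &Images,
                 const std::string &Kernel) {
  const auto Index = buildKernelIndex(Images);
  std::vector<std::string> Targets;
  const auto It = Index.find(Kernel);
  if (It != Index.end())
    for (const DeviceImage *Img : It->second)
      Targets.push_back(Img->Target);
  return Targets;
}
```

In this model, calling `candidateTargets` with a spir64 image that lists `"FillBuffer"` and an nvptx64 image with no entries yields only `spir64`. The nvptx64 image becomes a candidate only once its entries table lists the kernel as well.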
How to reproduce
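1. Enable program manager debug output by setting the following at sycl/source/detail/program_manager.cpp:40, then rebuild the runtime:
static constexpr int DbgProgMgr = 10;
2. Compile for both targets:
dpc++ -Wno-unknown-cuda-version -fsycl -fsycl-targets=nvptx64-nvidia-opencl-sycldevice,spir64-unknown-opencl-sycldevice -o sycl sycl.cpp
3. Run on the CUDA back-end:
SYCL_BE=PI_CUDA ./sycl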
Result
Running on
nvidia
GeForce RTX 2070 (NVIDIA Corporation/PI 0.0): 10
SYCL host device (/1.2): -1
>>> ProgramManager::getOrCreateKernel(-1, 0x23c51d0, _ZTSZZ4mainENK3$_0clERN2cl4sycl7handlerEE10FillBuffer)
>>> ProgramManager::getDeviceImage(-1, "2", 0x23c51d0)
available device images:
++++++ Kernel set: 2
--- Image 0x40ff70
Version : 2
Kind : 4
Format : 0
Target : spir64
Bin size : 15588
Compile options :
Link options :
Entries : _ZTSZZ4mainENK3$_0clERN2cl4sycl7handlerEE10FillBuffer
Properties [0x40ff30-0x40ff48]:
Category SYCL/specialization constants [0-0]:
OSModuleHandle=-1
++++++ Kernel set: 1
--- Image 0x40fe40
Version : 2
Kind : 4
Format : 0
Target : nvptx64
Bin size : 3024
Compile options :
Link options :
Entries :
Properties [0-0]:
OSModuleHandle=-1
selected device image: 0x40ff70
--- Image 0x40ff70
Version : 2
Kind : 4
Format : 0
Target : spir64
Bin size : 15588
Compile options :
Link options :
Entries : _ZTSZZ4mainENK3$_0clERN2cl4sycl7handlerEE10FillBuffer
Properties [0x40ff30-0x40ff48]:
Category SYCL/specialization constants [0-0]:
OSModuleHandle=-1
>>> ProgramManager::createPIProgram(0x2378130)