Skip to content

[SYCL][Bindless] Update and add support for SPV_INTEL_bindless_image extension new revision #13753

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
3 changes: 2 additions & 1 deletion clang/lib/Driver/ToolChains/Clang.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -10394,7 +10394,8 @@ static void getOtherSPIRVTransOpts(Compilation &C,
",+SPV_INTEL_fpga_argument_interfaces"
",+SPV_INTEL_fpga_invocation_pipelining_attributes"
",+SPV_INTEL_fpga_latency_control"
",+SPV_INTEL_task_sequence";
",+SPV_INTEL_task_sequence"
",+SPV_INTEL_bindless_images";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also update clang/test/Driver/sycl-spirv-ext.c to cover your new entry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I missed that. Will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

ExtArg = ExtArg + DefaultExtArg + INTELExtArg;
if (C.getDriver().IsFPGAHWMode())
// Enable several extensions on FPGA H/W exclusively
Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
:capability_token: 6528
:handle_to_image_token: 6529
:handle_to_sampler_token: 6530
:handle_to_sampled_image_token: 6531

SPV_INTEL_bindless_images
=========================
Expand Down Expand Up @@ -37,8 +38,8 @@ In Development

[width="40%",cols="25,25"]
|========================================
| Last Modified Date | 2024-03-25
| Revision | 6
| Last Modified Date | 2024-05-01
| Revision | 7
|========================================

== Dependencies
Expand All @@ -52,7 +53,7 @@ This extension requires SPIR-V 1.0.

This extension adds support for bindless images.
This is done by adding support for SPIR-V to convert unsigned integer handles to
images/samplers.
images, samplers and sampled images.

Bindless images are a feature that provides flexibility on how images are
accessed and used, such as removing limitations on how many images can be
Expand Down Expand Up @@ -84,6 +85,7 @@ Instructions added under *BindlessImagesINTEL* capability.
----
OpConvertHandleToImageINTEL
OpConvertHandleToSamplerINTEL
OpConvertHandleToSampledImageINTEL
----

== Token Number Assignments
Expand All @@ -93,9 +95,10 @@ OpConvertHandleToSamplerINTEL
[cols="70%,30%"]
[grid="rows"]
|====
|BindlessImagesINTEL |{capability_token}
|OpConvertHandleToImageINTEL |{handle_to_image_token}
|OpConvertHandleToSamplerINTEL |{handle_to_sampler_token}
|BindlessImagesINTEL |{capability_token}
|OpConvertHandleToImageINTEL |{handle_to_image_token}
|OpConvertHandleToSamplerINTEL |{handle_to_sampler_token}
|OpConvertHandleToSampledImageINTEL |{handle_to_sampled_image_token}
|====
--

Expand Down Expand Up @@ -134,6 +137,21 @@ _Result type_ must be an `OpTypeSampler`.
'<id> Operand'
|======

[cols="2*1,3*2"]
|======
5+|[[OpConvertHandleToSampledImageINTEL]]*OpConvertHandleToSampledImageINTEL* +
+
Converts an unsigned integer pointed by _Operand_ to sampled image type.

Unsigned integer is either a 32 or 64 bit unsigned integer.
Depending on if the addressing model is set to *Physical32* or *Physical64*.

_Result type_ must be an `OpTypeSampledImage`.

| 4 | {handle_to_sampled_image_token} | '<id> Result Type' | 'Result <id>' |
'<id> Operand'
|======

Modify Section 3.31, Capability, adding row to the capability table:

[width="40%"]
Expand Down Expand Up @@ -164,6 +182,7 @@ None Yet.
instruction and clarify return types
|6|2024-03-25|Duncan Brawley| Wording/formatting improvements, clarify sections
edited, make capability addition explicit and
substitute instruction numbers
substitute instruction numbers
|7|2024-05-01|Duncan Brawley| Add OpConvertHandleToSampledImageINTEL instruction
|========================================

12 changes: 12 additions & 0 deletions sycl/include/CL/__spirv/spirv_ops.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -230,6 +230,18 @@ template <typename SampledType, typename TempRetT, typename TempArgT>
extern __DPCPP_SYCL_EXTERNAL TempRetT __spirv_ImageSampleCubemap(SampledType,
TempArgT);

template <typename RetT>
extern __DPCPP_SYCL_EXTERNAL RetT
__spirv_ConvertHandleToImageINTEL(unsigned long);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsigned long is 32bit on windows?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use following code?

template <class RetT, class HandleT>
extern __DPCPP_SYCL_EXTERNAL RetT __spirv_ConvertHandleToImageINTEL(HandleT);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Added.


template <typename RetT>
extern __DPCPP_SYCL_EXTERNAL RetT
__spirv_ConvertHandleToSamplerINTEL(unsigned long);

template <typename RetT>
extern __DPCPP_SYCL_EXTERNAL RetT
__spirv_ConvertHandleToSampledImageINTEL(unsigned long);

#define __SYCL_OpGroupAsyncCopyGlobalToLocal __spirv_GroupAsyncCopy
#define __SYCL_OpGroupAsyncCopyLocalToGlobal __spirv_GroupAsyncCopy

Expand Down
154 changes: 123 additions & 31 deletions sycl/include/sycl/ext/oneapi/bindless_images.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -50,12 +50,24 @@ struct sampled_image_handle {

sampled_image_handle() : raw_handle(~0) {}

sampled_image_handle(raw_image_handle_type raw_image_handle)
: raw_handle(raw_image_handle) {}
sampled_image_handle(raw_image_handle_type handle) : raw_handle(handle) {}

raw_image_handle_type raw_handle;
};

// Image types used for generating SPIR-V
#ifdef __SYCL_DEVICE_ONLY__
template <int NDims>
using OCLImageTyRead =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this part of the specification or should it be moved into the detail namespace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah it should be in the detail namespace. Moved it.

typename sycl::detail::opencl_image_type<NDims, sycl::access::mode::read,
sycl::access::target::image>::type;

template <int NDims>
using OCLImageTyWrite =
typename sycl::detail::opencl_image_type<NDims, sycl::access::mode::write,
sycl::access::target::image>::type;
#endif

/**
* @brief Allocate image memory based on image_descriptor
*
Expand Down Expand Up @@ -792,6 +804,43 @@ template <typename DataT> constexpr bool is_recognized_standard_type() {
std::is_floating_point_v<DataT> || std::is_same_v<DataT, sycl::half>);
}

#ifdef __SYCL_DEVICE_ONLY__

// Macros are required because it is not legal for a function to return
// a variable of type 'opencl_image_type'.
#if defined(__NVPTX__)
#define CONVERT_HANDLE_TO_IMAGE(raw_handle, ImageType) raw_handle
#elif defined(__SPIR__)
#define CONVERT_HANDLE_TO_IMAGE(raw_handle, ImageType) \
__spirv_ConvertHandleToImageINTEL<ImageType>(raw_handle)
#else
#define CONVERT_HANDLE_TO_IMAGE(raw_handle, ImageType) raw_handle
#endif

#if defined(__NVPTX__)
#define CONVERT_HANDLE_TO_SAMPLED_IMAGE(raw_handle, ImageType) raw_handle
#elif defined(__SPIR__)
#define CONVERT_HANDLE_TO_SAMPLED_IMAGE(raw_handle, ImageType) \
__spirv_ConvertHandleToSampledImageINTEL< \
typename sycl::detail::sampled_opencl_image_type<ImageType>::type>( \
raw_handle)
#else
#define CONVERT_HANDLE_TO_SAMPLED_IMAGE(raw_handle, ImageType) raw_handle
#endif

#if defined(__NVPTX__)
#define FETCH_UNSAMPLED_IMAGE(DataT, raw_handle, coords) \
__invoke__ImageFetch<DataT>(raw_handle, coords)
#elif defined(__SPIR__)
#define FETCH_UNSAMPLED_IMAGE(DataT, raw_handle, coords) \
__invoke__ImageRead<DataT>(raw_handle, coords)
#else
#define FETCH_UNSAMPLED_IMAGE(DataT, raw_handle, coords) \
__invoke__ImageFetch<DataT>(raw_handle, coords)
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these all the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite. In the CUDA backend, __invoke__ImageFetch is used for a basic unsampled image read. When this is encountered by llvm-spirv, it crashes as it tries to emit, OpImageFetch, which is not implemented in llvm-spirv. But for a basic unsampled image read, we want to emit OpImageRead instead. So __invoke__ImageRead is used when compiling to SPIR-V.

Interestingly, and slightly annoyingly, OpImageFetch must always return a vector of four components. But that is prob not a big deal and is something to think about much later when adding support for fetching data from sampled images.

Also, turns out, llvm-spirv does not have an assert to check if an instruction is fully implemented before trying to call Inst->init(); at line 185 of SPIRVInstruction.h and jumping to random memory. That was a bit annoying.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed one of the redundant branches.


#endif

} // namespace detail

/**
Expand Down Expand Up @@ -826,15 +875,23 @@ DataT fetch_image(const unsampled_image_handle &imageHandle [[maybe_unused]],

#ifdef __SYCL_DEVICE_ONLY__
if constexpr (detail::is_recognized_standard_type<DataT>()) {
return __invoke__ImageFetch<DataT>(imageHandle.raw_handle, coords);
return FETCH_UNSAMPLED_IMAGE(
DataT,
CONVERT_HANDLE_TO_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
coords);

} else {
static_assert(sizeof(HintT) == sizeof(DataT),
"When trying to read a user-defined type, HintT must be of "
"the same size as the user-defined DataT.");
static_assert(detail::is_recognized_standard_type<HintT>(),
"HintT must always be a recognized standard type");
return sycl::bit_cast<DataT>(
__invoke__ImageFetch<HintT>(imageHandle.raw_handle, coords));
return sycl::bit_cast<DataT>(FETCH_UNSAMPLED_IMAGE(
HintT,
CONVERT_HANDLE_TO_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
coords));
}
#else
assert(false); // Bindless images not yet implemented on host
Expand Down Expand Up @@ -907,10 +964,15 @@ DataT fetch_image(const sampled_image_handle &imageHandle [[maybe_unused]],

#ifdef __SYCL_DEVICE_ONLY__
if constexpr (detail::is_recognized_standard_type<DataT>()) {
return __invoke__SampledImageFetch<DataT>(imageHandle.raw_handle, coords);
return __invoke__SampledImageFetch<DataT>(
CONVERT_HANDLE_TO_SAMPLED_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
coords);
} else {
return sycl::bit_cast<DataT>(
__invoke__SampledImageFetch<HintT>(imageHandle.raw_handle, coords));
return sycl::bit_cast<DataT>(__invoke__SampledImageFetch<HintT>(
CONVERT_HANDLE_TO_SAMPLED_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
coords));
}
#else
assert(false); // Bindless images not yet implemented on host.
Expand Down Expand Up @@ -954,10 +1016,15 @@ DataT sample_image(const sampled_image_handle &imageHandle [[maybe_unused]],

#ifdef __SYCL_DEVICE_ONLY__
if constexpr (detail::is_recognized_standard_type<DataT>()) {
return __invoke__ImageRead<DataT>(imageHandle.raw_handle, coords);
return __invoke__ImageRead<DataT>(
CONVERT_HANDLE_TO_SAMPLED_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
coords);
} else {
return sycl::bit_cast<DataT>(
__invoke__ImageRead<HintT>(imageHandle.raw_handle, coords));
return sycl::bit_cast<DataT>(__invoke__ImageRead<HintT>(
CONVERT_HANDLE_TO_SAMPLED_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
coords));
}
#else
assert(false); // Bindless images not yet implemented on host.
Expand Down Expand Up @@ -1026,15 +1093,20 @@ DataT sample_mipmap(const sampled_image_handle &imageHandle [[maybe_unused]],

#ifdef __SYCL_DEVICE_ONLY__
if constexpr (detail::is_recognized_standard_type<DataT>()) {
return __invoke__ImageReadLod<DataT>(imageHandle.raw_handle, coords, level);
return __invoke__ImageReadLod<DataT>(
CONVERT_HANDLE_TO_SAMPLED_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't you put OCLImageTyRead inside CONVERT_HANDLE_TO_SAMPLED_IMAGE? Or better yet inside __invoke__ImageReadLod...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do agree that putting OCLImageTyRead inside CONVERT_HANDLE_TO_SAMPLED_IMAGE is better. But I am not sure about putting it inside __invoke__ImageReadLod. That would require also putting sampled_opencl_image_type inside __invoke__ImageReadLod

Because these functions are used to also create the PTX instructions for the CUDA backend I feel putting it inside __invoke__ImageReadLod would complicate things. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved OCLImageTyRead into CONVERT_HANDLE_TO_SAMPLED_IMAGE for now unless something different is decided.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, this file is outside SYCL RT codeownership, so the decision is ultimately yours :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will leave it as is for now. Potentially revisit this later.

coords, level);
} else {
static_assert(sizeof(HintT) == sizeof(DataT),
"When trying to read a user-defined type, HintT must be of "
"the same size as the user-defined DataT.");
static_assert(detail::is_recognized_standard_type<HintT>(),
"HintT must always be a recognized standard type");
return sycl::bit_cast<DataT>(
__invoke__ImageReadLod<HintT>(imageHandle.raw_handle, coords, level));
return sycl::bit_cast<DataT>(__invoke__ImageReadLod<HintT>(
CONVERT_HANDLE_TO_SAMPLED_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
coords, level));
}
#else
assert(false); // Bindless images not yet implemented on host
Expand Down Expand Up @@ -1070,16 +1142,20 @@ DataT sample_mipmap(const sampled_image_handle &imageHandle [[maybe_unused]],

#ifdef __SYCL_DEVICE_ONLY__
if constexpr (detail::is_recognized_standard_type<DataT>()) {
return __invoke__ImageReadGrad<DataT>(imageHandle.raw_handle, coords, dX,
dY);
return __invoke__ImageReadGrad<DataT>(
CONVERT_HANDLE_TO_SAMPLED_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
coords, dX, dY);
} else {
static_assert(sizeof(HintT) == sizeof(DataT),
"When trying to read a user-defined type, HintT must be of "
"the same size as the user-defined DataT.");
static_assert(detail::is_recognized_standard_type<HintT>(),
"HintT must always be a recognized standard type");
return sycl::bit_cast<DataT>(
__invoke__ImageReadGrad<HintT>(imageHandle.raw_handle, coords, dX, dY));
return sycl::bit_cast<DataT>(__invoke__ImageReadGrad<HintT>(
CONVERT_HANDLE_TO_SAMPLED_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
coords, dX, dY));
}
#else
assert(false); // Bindless images not yet implemented on host
Expand Down Expand Up @@ -1224,16 +1300,20 @@ DataT fetch_image_array(const unsampled_image_handle &imageHandle

#ifdef __SYCL_DEVICE_ONLY__
if constexpr (detail::is_recognized_standard_type<DataT>()) {
return __invoke__ImageArrayFetch<DataT>(imageHandle.raw_handle, coords,
arrayLayer);
return __invoke__ImageArrayFetch<DataT>(
CONVERT_HANDLE_TO_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
coords, arrayLayer);
} else {
static_assert(sizeof(HintT) == sizeof(DataT),
"When trying to fetch a user-defined type, HintT must be of "
"the same size as the user-defined DataT.");
static_assert(detail::is_recognized_standard_type<HintT>(),
"HintT must always be a recognized standard type");
return sycl::bit_cast<DataT>(__invoke__ImageArrayFetch<HintT>(
imageHandle.raw_handle, coords, arrayLayer));
CONVERT_HANDLE_TO_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
coords, arrayLayer));
}
#else
assert(false); // Bindless images not yet implemented on host.
Expand Down Expand Up @@ -1277,19 +1357,24 @@ DataT fetch_cubemap(const unsampled_image_handle &imageHandle,
template <typename DataT, typename HintT = DataT>
DataT sample_cubemap(const sampled_image_handle &imageHandle [[maybe_unused]],
const sycl::float3 &dirVec [[maybe_unused]]) {
[[maybe_unused]] constexpr size_t NDims = 2;

#ifdef __SYCL_DEVICE_ONLY__
if constexpr (detail::is_recognized_standard_type<DataT>()) {
return __invoke__ImageReadCubemap<DataT, uint64_t>(imageHandle.raw_handle,
dirVec);
return __invoke__ImageReadCubemap<DataT, uint64_t>(
CONVERT_HANDLE_TO_SAMPLED_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<NDims>),
dirVec);
} else {
static_assert(sizeof(HintT) == sizeof(DataT),
"When trying to read a user-defined type, HintT must be of "
"the same size as the user-defined DataT.");
static_assert(detail::is_recognized_standard_type<HintT>(),
"HintT must always be a recognized standard type");
return sycl::bit_cast<DataT>(__invoke__ImageReadCubemap<HintT, uint64_t>(
imageHandle.raw_handle, dirVec));
CONVERT_HANDLE_TO_SAMPLED_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<NDims>),
dirVec));
}
#else
assert(false); // Bindless images not yet implemented on host
Expand Down Expand Up @@ -1318,12 +1403,15 @@ void write_image(unsampled_image_handle imageHandle [[maybe_unused]],

#ifdef __SYCL_DEVICE_ONLY__
if constexpr (detail::is_recognized_standard_type<DataT>()) {
__invoke__ImageWrite((uint64_t)imageHandle.raw_handle, coords, color);
__invoke__ImageWrite(CONVERT_HANDLE_TO_IMAGE(imageHandle.raw_handle,
OCLImageTyWrite<coordSize>),
coords, color);
} else {
// Convert DataT to a supported backend write type when user-defined type is
// passed
__invoke__ImageWrite((uint64_t)imageHandle.raw_handle, coords,
detail::convert_color(color));
__invoke__ImageWrite(CONVERT_HANDLE_TO_IMAGE(imageHandle.raw_handle,
OCLImageTyWrite<coordSize>),
coords, detail::convert_color(color));
}
#else
assert(false); // Bindless images not yet implemented on host
Expand Down Expand Up @@ -1354,13 +1442,17 @@ void write_image_array(unsampled_image_handle imageHandle [[maybe_unused]],

#ifdef __SYCL_DEVICE_ONLY__
if constexpr (detail::is_recognized_standard_type<DataT>()) {
__invoke__ImageArrayWrite(static_cast<uint64_t>(imageHandle.raw_handle),
coords, arrayLayer, color);
__invoke__ImageArrayWrite(
CONVERT_HANDLE_TO_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
coords, arrayLayer, color);
} else {
// Convert DataT to a supported backend write type when user-defined type is
// passed
__invoke__ImageArrayWrite(static_cast<uint64_t>(imageHandle.raw_handle),
coords, arrayLayer, detail::convert_color(color));
__invoke__ImageArrayWrite(
CONVERT_HANDLE_TO_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
coords, arrayLayer, detail::convert_color(color));
}
#else
assert(false); // Bindless images not yet implemented on host.
Expand Down
Loading