-
Notifications
You must be signed in to change notification settings - Fork 795
[SYCL] Add support for SYCL 2020 in class group #5447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
c384248
to
f4921ad
Compare
This change is good, but I think if we're going to add some SYCL 2020 functionality to the The SYCL 2020 group class is documented here: https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#group-class, and I don't think many extra changes are required:
|
@Pennycook added functions you asked for in your comment. Also extended tests for group and added test for local_id on device (intel/llvm-test-suite#821). |
sycl/include/CL/sycl/group.hpp
Outdated
template <int dims = Dimensions> | ||
typename detail::enable_if_t<(dims == 1), size_t> | ||
get_local_linear_id() const { | ||
id<Dimensions> localId = get_local_id(); | ||
return localId[0]; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like @gmlueck to double-check this. Greg, can this be declared in our headers as a template function if it isn't a template function in the specification? My concern here is that this might be user-observable: a user could call get_local_linear_id<Dimensions + 1>
, for example, or might have to insert extra template
keywords in some situations.
If I'm right to be concerned, I think we can work around this by moving the specialization to a helper function, something like the following:
namespace sycl {
namespace detail {
size_t get_local_linear_id(sycl::group<0>& g) const {
return g.get_local_id();
}
// + Specializations for group<1> and group<2>
}
}
size_t get_local_linear_id() const {
return sycl::detail::get_local_linear_id(*this);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it does seem weird that we would implement this function as a template when it is not that way in the spec. It seems like it would be easy to fix by calling some separate templated function as John proposes. I assume the separate function could also be a private member function of group
?
class group {
public:
size_t get_local_linear_id() const {
return get_local_linear_id_impl<Dimensions>();
}
private:
template <int dims>
typename detail::enable_if_t<(dims == 1), size_t>
get_local_linear_id_impl() const {
id<Dimensions> localId = get_local_id();
return localId[0];
}
}
Or, you could use constexpr-if
, but that would require the compiler to be C++17.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
sycl/include/CL/sycl/group.hpp
Outdated
template <int dims = Dimensions> | ||
typename detail::enable_if_t<(dims == 1), range<Dimensions>> | ||
get_max_local_range() const { | ||
return range<Dimensions>{ | ||
static_cast<size_t>(info::device::max_work_group_size)}; | ||
} | ||
|
||
template <int dims = Dimensions> | ||
typename detail::enable_if_t<(dims == 2), range<Dimensions>> | ||
get_max_local_range() const { | ||
return range<Dimensions>{ | ||
static_cast<size_t>(info::device::max_work_group_size), | ||
static_cast<size_t>(info::device::max_work_group_size)}; | ||
} | ||
|
||
template <int dims = Dimensions> | ||
typename detail::enable_if_t<(dims == 3), range<Dimensions>> | ||
get_max_local_range() const { | ||
return range<Dimensions>{ | ||
static_cast<size_t>(info::device::max_work_group_size), | ||
static_cast<size_t>(info::device::max_work_group_size), | ||
static_cast<size_t>(info::device::max_work_group_size)}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate the specification could be clearer here, but the intent of this function is to return the maximum number of work-items in any work-group for the kernel that is currently executing, rather than the maximum number of work-items supported by the device.
Because we don't support non-uniform work-group sizes, I think you can replace all of this with:
size_t get_max_local_range() const {
return get_local_range();
}
The reason both functions exist is that they return different values for the sub_group
class. Supporting both in group
allows developers to write generic code accepting either a sub_group
or group
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
sycl/include/CL/sycl/group.hpp
Outdated
return (index[0] * groupRange[1] * groupRange[2]) + | ||
(index[1] * groupRange[2]) + index[2]; | ||
} | ||
|
||
bool leader() const { return false; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The specification for this function says:
Return true for exactly one work-item in the work-group, if the calling work-item is the leader of the work-group, and false for all other work-items in the work-group.
The leader of the work-group is determined during construction of the work-group, and is invariant for the lifetime of the work-group. The leader of the work-group is guaranteed to be the work-item with a local id of 0.
Because of the last sentence, I think the correct implementation is:
bool leader() const {
return (get_local_linear_id() == 0);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
f72fbd2
to
1fc564a
Compare
1fc564a
to
2258997
Compare
@alexbatashev, take a look at this failure - https://github.com/intel/llvm/runs/5155471215?check_suite_focus=true. I think we should improve the CI scripts to avoid this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for not being clear in my initial review -- my comment about get_local_linear_id()
applies to all of the new template functions you've introduced here. I think there are three such functions left, and I've called out the first occurrence of each.
sycl/include/CL/sycl/group.hpp
Outdated
range<Dimensions> get_local_range() const { return localRange; } | ||
|
||
size_t get_local_range(int dimension) const { return localRange[dimension]; } | ||
|
||
template <int dims = Dimensions> | ||
typename detail::enable_if_t<(dims == 1), size_t> | ||
get_local_linear_range() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should call a get_local_linear_range_impl()
.
sycl/include/CL/sycl/group.hpp
Outdated
range<Dimensions> get_group_range() const { return groupRange; } | ||
|
||
size_t get_group_range(int dimension) const { | ||
return get_group_range()[dimension]; | ||
} | ||
|
||
template <int dims = Dimensions> | ||
typename detail::enable_if_t<(dims == 1), size_t> | ||
get_group_linear_range() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should call a get_group_linear_range_impl()
.
sycl/include/CL/sycl/group.hpp
Outdated
|
||
template <int dims = Dimensions> | ||
typename detail::enable_if_t<(dims == 1), size_t> | ||
get_group_linear_id() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should call a get_group_linear_id_impl()
.
7fbb9a7
to
3cba04f
Compare
Done for rest of the functions |
@aobolensk Could you please check if this failure is related to your PR: Failed Tests (1): |
Never mind, it is unrelated. Found a PR isolated to L0 plugin which has the same failure: #5541 |
…/llvm into refactor_existing_workflows * 'refactor_existing_workflows' of github.com:alexbatashev/llvm: (2051 commits) [SYCL][L0] Honor property::queue::enable_profiling (intel#5543) [SYCL][CI] Enable sccache on Windows (intel#5589) [SYCL][Doc] Move internal design docs (intel#5556) [sycl-post-link] Initialize the integer Value variable (intel#5585) [CI] Fix nightly builds (intel#5584) [SYCL][L0] Fix use of copy-engines in L0 interop queue (intel#5579) Update OpenCL headers tag to dcd5bed (intel#5575) [SYCL] Fix warning for InOrderQueueSyncCheck unit test build (intel#5577) [SYCL][HIP] Remove arch requirement for running lit tests (intel#5253) [SYCL][L0] Fix timestamp calculation (in ns) (intel#5555) [SYCL] Fix sync of host task vs kernel for in-order queue (intel#5551) [sycl-post-link] Add a check for device globals with device_image_scope (intel#5517) [SYCL] Fix SYCL Kernel Body Check (intel#5546) [SYCL] Add support for SYCL 2020 in class group (intel#5447) Fix tests after 1c729d7 Use align attribute for kernel pointer arg alignment Fix tests after 18834dc Mark pointer-typed kernel arguments as ABI aligned [CI] Add experimental Windows build to GitHub Actions nightly (intel#5560) [sycl-post-link][NFC] Address clang-tidy concerns in the sycl-post-link (intel#5552) Fix lit test after changes in Clang Improve backward compatibility for DISubRange ...
Implement missing methods for
class group
according to SYCL 2020 4.9.1.7.