Skip to content

[SYCL] Fix is_device_copyable with range rounding #4478

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Sep 10, 2021

Conversation

DenisBakhvalov
Copy link
Contributor

Suppose we have the following code:

template <> struct is_device_copyable<DeviceCopyable>
                   : std::true_type {};
...
  DeviceCopyable DevCop(0);
  Q.submit([=](sycl::handler& cgh){
    const sycl::range<2> range(1026, 1026);
    cgh.parallel_for(range,[=](sycl::item<2> item) {
      (void)DevCop;
    });
  });

This code doesn't compile because range rounding optimization
wraps kernel lambda function, so we have something like this:

|- WrapperLambda
 |-KernelLambda
 ||-DevCop
 |-Other Wrapper captures

According to the implementation of is_device_copyable,
we check whether all the fields of WrapperLambda are device
copyable. KernelLambda is not device copyable since there is
no corresponding template specialization of is_device_copyable.

It's not possible to provide a template specialization for
is_device_copyable with kernel lambda since lambda types don't
have the name. To fix this issue, we create a functor class
RoundedRangeKernel and provide an is_device_copyable trait for this
class, which simply forwards check to nested kernel lambda.

This patch precommits the test to expose the issue.

Suppose we have the following code:
```
template <> struct is_device_copyable<DeviceCopyable>
                   : std::true_type {};
...
  DeviceCopyable DC(0);
  Q.submit([=](sycl::handler& cgh){
    const sycl::range<2> range(1026, 1026);
    cgh.parallel_for(range,[=](sycl::item<2> item) {
      (void)DC;
    });
  });
```
This code doesn't compile because range rounding optimization
wraps kernel lambda function, so we have something like this:
```
|- WrapperLambda
 |-KernelLambda
 ||-DC
 |-Other Wrapper captures
```
According to the implementation of is_device_copyable,
we check whether all the fields of WrapperLambda are device
copyable. KernelLambda is not device copyable since there is
no corresponding template specialization of is_device_copyable.
It's not possible to provide a template specialization for
is_device_copyable with kernel lambda since lambda types don't
have the name. To fix this issue, we create a functor class
RoundedRangeKernel and provide a is_device_copyable trait for this
class, which simply forwards check to nested kernel lambda.
@DenisBakhvalov
Copy link
Contributor Author

Test that exposes the problem in original code is in the first commit in this PR.

Copy link
Contributor

@v-klochkov v-klochkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall. I have 2 questions.
@erichkeane - would you also take a quick look at this fix?

@v-klochkov
Copy link
Contributor

@turinevgeny - please review. This PR requires your approval.

@bader bader merged commit 2d28cd4 into intel:sycl Sep 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants