[SYCL] Fix big and non-uniform work-groups handling in reduction kernels #2859
Merged: bader merged 5 commits into intel:sycl from v-klochkov:public_vklochkov_reduction_nd_range_fix on Dec 5, 2020
alexbatashev
previously approved these changes
Dec 4, 2020
LGTM
Pennycook
previously approved these changes
Dec 4, 2020
bader
requested changes
Dec 4, 2020
Please, fix CI failures.
1e0e0a3
bader
approved these changes
Dec 5, 2020
alexbatashev
approved these changes
Dec 5, 2020
This patch also makes a minor optimization in the main kernels created
for reductions. The previous code tried to handle non-uniform work-group sizes
and did so incorrectly. That code was removed because it is the user's
responsibility to provide an nd_range that the device handles well, at least
for the main kernels.
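
One way a caller might meet that responsibility is to pad the global size up to a multiple of the work-group size before building the nd_range. The following is a minimal sketch in plain C++; `isUniform` and `roundUpToMultiple` are hypothetical helper names used for illustration and are not part of this patch or of the SYCL API.

```cpp
#include <cstddef>

// Hypothetical helper (an assumption, not from the patch): true when the
// global size splits into equally sized, full work-groups.
constexpr bool isUniform(std::size_t globalSize, std::size_t wgSize) {
  return wgSize != 0 && globalSize % wgSize == 0;
}

// Hypothetical helper (an assumption, not from the patch): rounds the global
// size up to the nearest multiple of the work-group size.
// Precondition: wgSize > 0.
constexpr std::size_t roundUpToMultiple(std::size_t globalSize,
                                        std::size_t wgSize) {
  return ((globalSize + wgSize - 1) / wgSize) * wgSize;
}
```

If the range is padded this way, the extra work-items must not read past the end of the input; a kernel would typically guard them with an index check or have them contribute the identity value of the reduction.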
The patch conservatively limits the maximum work-group size handled by
the reduction implementation to avoid various runtime errors caused
by selecting an overly optimistic work-group size for reductions. This
solution is temporary, until the reduction kernel precompilation/query
approach is implemented.
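
The general idea of a conservative cap can be sketched as follows. This is only an illustration under stated assumptions: the function name `conservativeWGSize`, its parameters, and the choice of limits (device maximum and local-memory capacity) are hypothetical, and the actual limit applied by the patch is not described here.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical sketch (not the patch's implementation): clamps a requested
// work-group size so it exceeds neither the device maximum nor the number of
// per-work-item slots that fit in local memory.
std::size_t conservativeWGSize(std::size_t requested,
                               std::size_t deviceMaxWGSize,
                               std::size_t localMemBytes,
                               std::size_t bytesPerWorkItem) {
  std::size_t limit = std::min(requested, deviceMaxWGSize);
  if (bytesPerWorkItem != 0) {
    // Assumption: each work-item needs one slot of local memory.
    limit = std::min(limit, localMemBytes / bytesPerWorkItem);
  }
  // Never return zero; a work-group needs at least one work-item.
  return std::max<std::size_t>(limit, 1);
}
```

A precompilation/query approach could replace such a static cap with the limit reported for the compiled reduction kernel itself.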
Signed-off-by: Vyacheslav N Klochkov [email protected]