Skip to content

[SYCL] Add support for SYCL 2020 in class group #5447

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 14, 2022

Conversation

aobolensk
Copy link
Contributor

@aobolensk aobolensk commented Feb 1, 2022

Implement missing methods for class group according to SYCL 2020 4.9.1.7.

@aobolensk aobolensk requested a review from a team as a code owner February 1, 2022 16:17
@aobolensk aobolensk requested a review from againull February 1, 2022 16:17
@Pennycook
Copy link
Contributor

This change is good, but I think if we're going to add some SYCL 2020 functionality to the group class we should go all the way. If we only partially implement the SYCL 2020 interface, developers might be confused.

The SYCL 2020 group class is documented here: https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#group-class, and I don't think many extra changes are required:

  • Deprecate get_id()
  • Add get_local_id()
  • Add get_max_local_range()
  • Add get_group_linear_id()
  • Add get_local_linear_id()
  • Add get_group_linear_range()
  • Add get_local_linear_range()
  • Add leader()

@aobolensk aobolensk changed the title Add missing get_group_id function in sycl::group Add support for SYCL 2020 in class group Feb 10, 2022
@aobolensk
Copy link
Contributor Author

@Pennycook added functions you asked for in your comment. Also extended tests for group and added test for local_id on device (intel/llvm-test-suite#821).
Limitation: get_local_id() host implementation requires ABI breaking change, so it was stubbed with an exception. On device it works well.

Comment on lines 137 to 142
template <int dims = Dimensions>
typename detail::enable_if_t<(dims == 1), size_t>
get_local_linear_id() const {
id<Dimensions> localId = get_local_id();
return localId[0];
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like @gmlueck to double-check this. Greg, can this be declared in our headers as a template function if it isn't a template function in the specification? My concern here is that this might be user-observable: a user could call get_local_linear_id<Dimensions + 1>, for example, or might have to insert extra template keywords in some situations.

If I'm right to be concerned, I think we can work around this by moving the specialization to a helper function, something like the following:

namespace sycl {
namespace detail {
  size_t get_local_linear_id(sycl::group<0>& g) const {
    return g.get_local_id();
  }

  // + Specializations for group<1> and group<2>

}
}

size_t get_local_linear_id() const {
  return sycl::detail::get_local_linear_id(*this);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it does seem weird that we would implement this function as a template when it is not that way in the spec. It seems like it would be easy to fix by calling some separate templated function as John proposes. I assume the separate function could also be a private member function of group?

class group {
 public:
  size_t get_local_linear_id() const {
    return get_local_linear_id_impl<Dimensions>();
  }

private:
  template <int dims>
  typename detail::enable_if_t<(dims == 1), size_t>
  get_local_linear_id_impl() const {
    id<Dimensions> localId = get_local_id();
    return localId[0];
  }
}

Or, you could use constexpr-if, but that would require the compiler to be C++17.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 211 to 233
template <int dims = Dimensions>
typename detail::enable_if_t<(dims == 1), range<Dimensions>>
get_max_local_range() const {
return range<Dimensions>{
static_cast<size_t>(info::device::max_work_group_size)};
}

template <int dims = Dimensions>
typename detail::enable_if_t<(dims == 2), range<Dimensions>>
get_max_local_range() const {
return range<Dimensions>{
static_cast<size_t>(info::device::max_work_group_size),
static_cast<size_t>(info::device::max_work_group_size)};
}

template <int dims = Dimensions>
typename detail::enable_if_t<(dims == 3), range<Dimensions>>
get_max_local_range() const {
return range<Dimensions>{
static_cast<size_t>(info::device::max_work_group_size),
static_cast<size_t>(info::device::max_work_group_size),
static_cast<size_t>(info::device::max_work_group_size)};
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the specification could be clearer here, but the intent of this function is to return the maximum number of work-items in any work-group for the kernel that is currently executing, rather than the maximum number of work-items supported by the device.

Because we don't support non-uniform work-group sizes, I think you can replace all of this with:

size_t get_max_local_range() const {
  return get_local_range();
}

The reason both functions exist is that they return different values for the sub_group class. Supporting both in group allows developers to write generic code accepting either a sub_group or group.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return (index[0] * groupRange[1] * groupRange[2]) +
(index[1] * groupRange[2]) + index[2];
}

bool leader() const { return false; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The specification for this function says:

Return true for exactly one work-item in the work-group, if the calling work-item is the leader of the work-group, and false for all other work-items in the work-group.

The leader of the work-group is determined during construction of the work-group, and is invariant for the lifetime of the work-group. The leader of the work-group is guaranteed to be the work-item with a local id of 0.

Because of the last sentence, I think the correct implementation is:

bool leader() const {
  return (get_local_linear_id() == 0);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

vladimirlaz
vladimirlaz previously approved these changes Feb 11, 2022
@bader bader changed the title Add support for SYCL 2020 in class group [SYCL] Add support for SYCL 2020 in class group Feb 11, 2022
vladimirlaz
vladimirlaz previously approved these changes Feb 11, 2022
@bader
Copy link
Contributor

bader commented Feb 11, 2022

@alexbatashev, take a look at this failure - https://github.com/intel/llvm/runs/5155471215?check_suite_focus=true. I think we should improve the CI scripts to avoid this.

Copy link
Contributor

@Pennycook Pennycook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not being clear in my initial review -- my comment about get_local_linear_id() applies to all of the new template functions you've introduced here. I think there are three such functions left, and I've called out the first occurrence of each.

range<Dimensions> get_local_range() const { return localRange; }

size_t get_local_range(int dimension) const { return localRange[dimension]; }

template <int dims = Dimensions>
typename detail::enable_if_t<(dims == 1), size_t>
get_local_linear_range() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should call a get_local_linear_range_impl().

range<Dimensions> get_group_range() const { return groupRange; }

size_t get_group_range(int dimension) const {
return get_group_range()[dimension];
}

template <int dims = Dimensions>
typename detail::enable_if_t<(dims == 1), size_t>
get_group_linear_range() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should call a get_group_linear_range_impl().


template <int dims = Dimensions>
typename detail::enable_if_t<(dims == 1), size_t>
get_group_linear_id() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should call a get_group_linear_id_impl().

@aobolensk
Copy link
Contributor Author

Sorry for not being clear in my initial review -- my comment about get_local_linear_id() applies to all of the new template functions you've introduced here. I think there are three such functions left, and I've called out the first occurrence of each.

Done for rest of the functions

@aobolensk aobolensk requested a review from Pennycook February 11, 2022 15:53
@againull
Copy link
Contributor

@aobolensk Could you please check if this failure is related to your PR:


Failed Tests (1):
SYCL :: Basic/handler/handler_mem_op.cpp

@againull
Copy link
Contributor

@aobolensk Could you please check if this failure is related to your PR:

Failed Tests (1): SYCL :: Basic/handler/handler_mem_op.cpp

Never mind, it is unrelated. Found a PR isolated to L0 plugin which has the same failure: #5541

@bader bader merged commit 73d59ce into intel:sycl Feb 14, 2022
alexbatashev added a commit to alexbatashev/llvm that referenced this pull request Feb 23, 2022
…/llvm into refactor_existing_workflows

* 'refactor_existing_workflows' of github.com:alexbatashev/llvm: (2051 commits)
  [SYCL][L0] Honor property::queue::enable_profiling (intel#5543)
  [SYCL][CI] Enable sccache on Windows (intel#5589)
  [SYCL][Doc] Move internal design docs (intel#5556)
  [sycl-post-link] Initialize the integer Value variable (intel#5585)
  [CI] Fix nightly builds (intel#5584)
  [SYCL][L0] Fix use of copy-engines in L0 interop queue (intel#5579)
  Update OpenCL headers tag to dcd5bed (intel#5575)
  [SYCL] Fix warning for InOrderQueueSyncCheck unit test build (intel#5577)
  [SYCL][HIP] Remove arch requirement for running lit tests (intel#5253)
  [SYCL][L0] Fix timestamp calculation (in ns) (intel#5555)
  [SYCL] Fix sync of host task vs kernel for in-order queue (intel#5551)
  [sycl-post-link] Add a check for device globals with device_image_scope (intel#5517)
  [SYCL] Fix SYCL Kernel Body Check (intel#5546)
  [SYCL] Add support for SYCL 2020 in class group (intel#5447)
  Fix tests after 1c729d7 Use align attribute for kernel pointer arg alignment
  Fix tests after 18834dc Mark pointer-typed kernel arguments as ABI aligned
  [CI] Add experimental Windows build to GitHub Actions nightly (intel#5560)
  [sycl-post-link][NFC] Address clang-tidy concerns in the sycl-post-link (intel#5552)
  Fix lit test after changes in Clang
  Improve backward compatibility for DISubRange
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants