Skip to content

[SYCL] Implement user-defined reduction extension #7436

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

dm-vodopyanov
Copy link
Contributor

@dm-vodopyanov dm-vodopyanov commented Nov 17, 2022

@dm-vodopyanov dm-vodopyanov requested a review from a team as a code owner November 17, 2022 17:11
@dm-vodopyanov dm-vodopyanov requested review from romanovvlad and a team November 17, 2022 17:11
dm-vodopyanov added a commit to dm-vodopyanov/llvm-test-suite that referenced this pull request Nov 17, 2022
dm-vodopyanov added a commit to dm-vodopyanov/llvm-test-suite that referenced this pull request Nov 17, 2022
@dm-vodopyanov
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1395


// ---- four reduce_over_group overloads with native binary_op
template <typename GroupHelper, typename T, class BinaryOperation>
sycl::detail::enable_if_t<
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if constexpr-based implementation but I'm curious what others think.


// ---- reduce_over_group
template <typename GroupHelper, typename T, typename BinaryOperation>
sycl::detail::enable_if_t<(is_group_helper_v<std::decay_t<GroupHelper>>), T>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make std::decay_t part of is_group_helper?

Comment on lines 132 to 136
sycl::detail::for_each(
g.get_group(), first, last,
[&](const typename sycl::detail::remove_pointer<Ptr>::type &x) {
partial = binary_op(partial, x);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be linear and not logarithmic in complexity. I'd expect something like https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/reduction.hpp#L1810-L1818 might be applicable here, although I'm not sure what gives better performance for different work sizes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to put a comment here for now. We don't yet have any data from actual uses of this new function about which use cases (data type, wg sizes) are the most popular ones.

DominikAdamski and others added 21 commits November 18, 2022 06:08
Update simd construct definition and add mlir tests to prove
that order(concurrent) clause is correctly modeled for simd
construct.
…CTOR_ELT.

1- Enable custom lowering INSERT_VECTOR_ELT to generate code compatible
   to streaming mode.
2- Add testing file:
   insert-vector-elt.ll

Differential Revision: https://reviews.llvm.org/D138222
Bot failure: https://lab.llvm.org/buildbot/#/builders/16/builds/38156

e3cd498 (D133318) updates CloneBlock to
preserve debug use-before-defs. Update local-as-metadata-undominated-use.ll to
reflect this.
…mbine in assignment tracking builds

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

This reduces peak memory overhead by 15% when building CTMark's tramp3d-v4 with
-O2 -g with assignment tracking enabled.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133321
And use it findHeaders. findHeaders now finds all header candidates
given a symbol location (these headers will be attached with proper
signals, in a followup patch).

Differential Revision: https://reviews.llvm.org/D137698
This fixes #36373, #36905 and partial of #58685.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D137711
This is useful where tests previously encoded information in the name
names of ranges and points. Currently, this is pretty limited because
names consist of only alphanumeric characters and '_'.

With this patch, we can keep the names simple and attach optional
payloads to ranges and points instead.

The new syntax should be fully backwards compatible (if I haven't missed
anything). I tested this against clangd unit tests and everything still passes.

Differential Revision: https://reviews.llvm.org/D137909
This patch adds bazel tests for llvm-libc.

Some math tests rely on the `mpfr` library. This is controlled via the `--@llvm-project//libc:libc_math_mpfr` flag. It can take three values:
 - `external` (default) will build `mpfr` and `gmp` from source.
 - `system` will use the system installed `mpfr` library.
 - `disable` will skip tests relying on `mpfr`.

Reviewed By: sivachandra, GMNGeoffrey

Differential Revision: https://reviews.llvm.org/D119547
Instead of importing constant expressions recursively, the revision
walks all dependencies of an LLVM constant iteratively. The actual
conversion then iterates over a list of constants and all intermediate
constant values are added to the value mapping. As a result, an LLVM IR
constant maps to exactly one MLIR operation per function. The revision
adapts the existing tests since the constant ordering changed for
aggregate types. Additionally, it adds extra tests that mix aggregate
constants and constant expressions.

Depends on D137416

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D137559
…truction classes

The zip/uzp (2-vector) instruction classes have the incorrect
register constraints and mark the destination as also being an
input. However, the instructions are fully destructive so I've
restructured the classes.

Differential Revision: https://reviews.llvm.org/D138288
…ating a thin archive

As seen in llvm/llvm-project#55023 when a thin
archive is updated when not in the CWD, replacement does not work as
expected. This change fixes the relative file path comparison so the
correct files are updated.

Differential Revision: https://reviews.llvm.org/D138218
…roadcast.

Added dimensions can be both static and dinamic. Mapped dimension should be the same in the input and the init.

Differential Revision: https://reviews.llvm.org/D138291
The backward register scavenger has correct register
liveness information. PEI should leverage the backward register scavenger.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D137574
We don't modify them. Also use auto here since we already wrote the full
type in the cast.
steffenlarsen and others added 17 commits November 28, 2022 14:55
…st-link (intel#7471)

When generating program metadata for `reqd_work_group_size` information
sycl-post-link currently expects the existence of all three dimensions
in the metadata. As a result the compile-time kernel property
`work_group_size` needs to pad itself to adhere to this requirement.
This commit relaxes the requirement by instead allowing fewer than three
metadata operands in both `reqd_work_group_size` and
`work_group_size_hint` when generating the program metadata.
Additionally, padding will no longer be added when converting
"sycl-work-group-size" and "sycl-work-group-size-hint" into the
corresponding metadata.

Signed-off-by: Larsen, Steffen <[email protected]>
Docs CI is currently failing due to an interaction between Python 3.10
and Sphinx 4.1.2. This commit bumps the Sphinx version used by docs CI
to 4.2.0.

Signed-off-by: Larsen, Steffen <[email protected]>
This change makes bfloat16 a supported feature.

Signed-off-by: Rajiv Deodhar <[email protected]>
Signed-off-by: JackAKirk <[email protected]>
ea9ac35 re-adds the llvm::decodeBase64 function, but on our branch
it seems to have been lost in the merge 8b549f6. This change adds
it back to fix build issues introduced by upstream changes that rely on
this function.

This should fix intel#7515.
…7512)

The patch removes standalone splitter we had for `large-grf` and moves
`large-grf` handling into per-aspect splitter.

The change is intended to be non-functional: at most it may affect the
order and names of modules produced by `sycl-post-link`, but not their
content.
…ntel#7505)

The headers use builtins for casting from generic pointers to other
address spaces. However, of these definitions it is missing variants for
non-const volatile. This commit adds these missing definitions.

Signed-off-by: Larsen, Steffen <[email protected]>
bfloat16 was recently moved out of the experimental namespace but
windows_build_test.cpp did not reflect this. This commit removes the use
of the experimental namespace.

Signed-off-by: Larsen, Steffen <[email protected]>
…ty (intel#7523)

Adds a new queue property (hint) for setting its priority in backends
that support it.
Currently supported in Level Zero backend only and ignored for others.
Test: intel/llvm-test-suite#1414

Signed-off-by: Sergey V Maslov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.