[SYCL] Support for arrays as kernel parameters #1387

rdeodhar · 2020-03-24T21:37:10Z

Signed-off-by: rdeodhar [email protected]

Signed-off-by: rdeodhar <[email protected]>

Implementation of piEventSetCallback with tests GlueEvent uses now the correct plugins The SYCL RT code for GlueEvent calls now the right plugin to create the event that triggers the dependency chain. Renamed variables to clarify the source code and avoid confusions between Context and Plugin Signed-off-by: Ruyman Reyes <[email protected]> Signed-off-by: Stuart Adams <[email protected]> Signed-off-by: Steffen Larsen <[email protected]>

Signed-off-by: Stuart Adams <[email protected]>

…1381) Signed-off-by: gejin <[email protected]>

NOTE: This flag is not exposed to the driver and not intended for users. It's added to make experiments and identify issues with optimizations. Signed-off-by: Alexey Bader <[email protected]>

By emitting the legacy variant of the LLVM IR alongside the newer representation of the attribute, backwards compatibility with any existing BE implementation is restored. A smooth transition period is thus achieved for the aforementiond BE - until it's able to consume the new LLVM IR, it has an option to simply ignore the unknown metadata. Signed-off-by: Artem Gindinson <[email protected]>

If found alloca command is not sub-buffer alloca, then it's parent alloca which has same context Signed-off-by: Ivan Karachun <[email protected]>

erichkeane · 2020-03-25T15:31:13Z

@romanovvlad As discussed previously via email, I want @rolandschulz to review the design/interface/IR output, so we're waiting on the submitter to provide the design documentation for this. Additionally, @elizabethandrews and @Fznamznon know this section of code better than me, so I'll delegate review to them.

Fznamznon

This patch is not so trivial.
Could you please describe the approach in commit message?
Could you please also add a CodeGen test?

Fznamznon · 2020-03-25T15:41:23Z

clang/test/SemaSYCL/array-kernel-param-run.cpp

@@ -0,0 +1,259 @@
+// This test checks array kernel parameters


It seems that this test checks some kernel execution. We don't put such tests in llvm/clang/test/* directories. Please put somewhere to llvm/sycl/test/ directory.
BTW, SYCL kernels can be run on different devices. Other tests in llvm/sycl/test/ directory has bunch of special RUN lines to run the test on particular device. Like // RUN: %GPU_RUN_PLACEHOLDER %t.out here:

llvm/sycl/test/basic_tests/access_to_subset.cpp

Line 4 in 5d1d716

// RUN: %GPU_RUN_PLACEHOLDER %t.out

. You will need to add them too, if you are doing to add test with kernel execution.

I will write a design document to explain the overall approach to supporting array arguments of kernels. I assume attaching documents here is accepted practice?
I will also address comments posted here in my next attempt.

Sure, we appreciate design docs. You can put your doc somewhere in this folder https://github.com/intel/llvm/tree/sycl/sycl/doc (or extend some existing doc here).

Fznamznon · 2020-03-25T15:42:57Z

clang/test/SemaSYCL/array-kernel-param-neg.cpp

+// This test checks if compiler reports compilation error on an attempt to pass
+// a struct with non-trivially copyable type as SYCL kernel parameter.


Do you mean array of non-trivially copyable objects, right?

Yes, comment changed.

Fznamznon · 2020-03-25T15:46:31Z

clang/lib/Sema/SemaSYCL.cpp

@@ -1095,6 +1225,32 @@ static void populateIntHeader(SYCLIntegrationHeader &H, const StringRef Name,
              populateHeaderForWrappedAccessors(FldType,
                                                Offset + OffsetInWrapper);
            }
+          } else if (FldType->isConstantArrayType()) {
+            auto CAT = cast<ConstantArrayType>(FldType);
+            auto ET = CAT->getElementType();


Please do not use auto when the type is not obvious from the context.
See https://llvm.org/docs/CodingStandards.html#id26
Please apply this comment to all uses of auto in your patch.

Fznamznon · 2020-03-25T15:52:17Z

clang/lib/Sema/SemaSYCL.cpp

+        if (Util::isSyclAccessorType(ET)) {
+          size_t NumElements =
+              static_cast<size_t>(CAT->getSize().getZExtValue());
+          DeclAccessPair FieldDAP = DeclAccessPair::make(Field, AS_none);
+          auto Lhs = MemberExpr::Create(
+              S.Context, KernelObjCloneRef, false, Loc,
+              NestedNameSpecifierLoc(), Loc, Field, FieldDAP,
+              DeclarationNameInfo(Field->getDeclName(), Loc), nullptr,
+              Field->getType(), VK_LValue, OK_Ordinary, NOUR_None);
+          // Treat each element of accessor array as a separate item
+          // Call its __init function
+          for (size_t I = 0; I < NumElements; ++I) {
+            ExprResult IndexExpr = S.ActOnIntegerConstant(Loc, I);
+            auto MA = S.CreateBuiltinArraySubscriptExpr(Lhs, Loc,
+                                                        IndexExpr.get(), Loc);
+            getExprForSpecialSYCLObj(ET, nullptr, MA.get(),
+                                     ET->getAsCXXRecordDecl(), nullptr,
+                                     InitMethodName, BodyStmts);
+            KernelFuncParam++;
+          }


Déjà vu...
The new code in getExprForWrappedAccessorInit lambda looks almost the same. Can we create a function for it?

These two blocks of code are superficially similar. The check for Accessor type and then iterating over the array elements is common. However, the actions in the two cases are very different. It is not practical to make a callable function out of these two blocks of code.

Fznamznon · 2020-03-25T15:57:08Z

clang/lib/Sema/SemaSYCL.cpp

          }
        }
+      } else if (FieldType->isConstantArrayType()) {
+        auto Loc = Field->getLocation();


I don't really get why you are applying source location of Kernel Object field to newly generated AST nodes, I think it can be tricky. Let's apply just empty source location since these AST nodes do not represent real code from source.

OK. This was done based on a previous reviewers suggestion. I have reverted to empty source location to conform to the approach in other cases in this function.

Fznamznon · 2020-03-25T16:01:51Z

clang/lib/Sema/SemaSYCL.cpp

+        };
+        RecordDecl *NewClass;
+        FieldDecl *Field;
+        wrapAnArray(ArgTy, NewClass, Field);


Did I get it right that to pass array of something, we wrap it by some struct (which seems to be generated by the compiler)? i.e. resulting kernel for each array field of kernel object has some wrapped_array struct parameter?

Yes, a design document will be posted.

Fznamznon · 2020-03-25T16:06:16Z

clang/lib/Sema/SemaSYCL.cpp

+      auto CAT = cast<ConstantArrayType>(ArgTy);
+      auto ET = CAT->getElementType();
+      if (Util::isSyclAccessorType(ET)) {
+        size_t NumElements = static_cast<size_t>(CAT->getSize().getZExtValue());
+        const auto *Accessor = ET->getAsCXXRecordDecl();
+        ASTContext &AccessorCtx = Accessor->getASTContext();
+        const ASTRecordLayout &AccessorLayout =
+            AccessorCtx.getASTRecordLayout(Accessor);
+        auto AccessorSize = AccessorCtx.toBits(AccessorLayout.getSize()) / 8;
+        // Start with offset of array, i.e., offset of first element of array
+        uint64_t OffsetInArray = Offset;
+        // Treat each element of accessor array as a separate item
+        // Create a parameter descriptor for it
+        for (size_t I = 0; I < NumElements; ++I) {
+          populateHeaderForAccessor(ET, OffsetInArray);
+          OffsetInArray += AccessorSize;


After lines 1231-1253 it's déjà vu again

In this case, there was somewhat more similarity in the actions done in the two cases, and this code has been factored out into a separate function, called twice.

…1344) Signed-off-by: Michael Kinsner <[email protected]>

elizabethandrews · 2020-03-25T19:27:43Z

From what I undestand you've handled 3 types of kernel parameters here -

Accessor arrays
Arrays of other types
Class type with an accessor array as a member.

For 1. and 3., you added each element as a separate parameter in generated OpenCL Kernel and called init functions accordingly.

For 2. You wrapped the array inside a new class 'wrapped _array' and added this as the parameter.

Is this a gist of what you did? Am I missing anything? It would be helpful having this information in the commit message.

elizabethandrews · 2020-03-25T19:36:45Z

Please add a Sema test showing AST generated.

Signed-off-by: Alexey Sachkov <[email protected]>

Enable -fdeclare-spirv-builtins for SYCL device compilation mode For device compilation, SPIR-V builtins are now looked up by the device compiler. They now longer need to be forward declared. [SYCL-PTX] Revert manual mangling of some SPIR-V builtins [SYCL-PTX] Add fmod builtin [SYCL-PTX] Update Atomic mangling Signed-off-by: Victor Lomuller <[email protected]>

…<dir> (#1346) When using /Fo<dir> the improper dependency file name was generated, causing the bundle step to not be able to locate the dependency file when compiling to object Signed-off-by: Michael D Toguchi <[email protected]>

This patch introduces the following loop attributes: - loop_coalesce: Indicates that the loop nest should be coalesced into a single loop without affecting functionality - speculated_iterations: Specifies the number of concurrent speculated iterations that will be in flight for a loop invocation - disable_loop_pipelining: Disables pipelining of the loop data path, causing the loop to be executed serially - max_interleaving: Places a maximum limit N on the number of interleaved invocations of an inner loop by an outer loop Signed-off-by: Viktoria Maksimova <[email protected]>

Fixed the buffer constructor called with a pair of iterators. The current implementation has a problem due to ambiguous spec. The buffer should never write back data unless there is a call to set_final_data(), but the current implementation does it. I corrected the spec in KhronosGroup/SYCL-Docs#76. So, now we can change the buffer implementation according to the clarified spec. The test case buffer.cpp also needed change because of this change. The user should not expect the automatic write-back of data upon destruction of buffer. Signed-off-by: Byoungro So <[email protected]> Co-authored-by: Ronan Keryell <[email protected]>

A simple library which allows to construct and serialize/deserialize a sequence of typed property sets, where each property is a <name,typed value> pair. To be used in offload tools. Signed-off-by: Konstantin S Bobrovsky <[email protected]>

The library allows to create, serialize/deserialize tables of strings, insert/delete/replace/rename columns, add rows. To be used in offload tools. Signed-off-by: Konstantin S Bobrovsky <[email protected]>

Signed-off-by: rdeodhar <[email protected]>

rdeodhar · 2020-03-27T02:14:53Z

Yes, that is the gist of the changes. I have uploaded a document that describes this change. From: elizabethandrews <[email protected]> Sent: Wednesday, March 25, 2020 12:28 PM To: intel/llvm <[email protected]> Cc: Deodhar, Rajiv <[email protected]>; Author <[email protected]> Subject: Re: [intel/llvm] [SYCL] Support for arrays as kernel parameters (#1387) From what I undestand you've handled 3 types of kernel parameters here - 1. Accessor arrays 2. Arrays of other types 3. Class type with an accessor array as a member. For 1. and 3., you added each element as a separate parameter in generated OpenCL Kernel and called init functions accordingly. For 2. You wrapped the array inside a new class 'wrapped _array' and added this as the parameter. Is this a gist of what you did? Am I missing anything? It would be helpful having this information in the commit message. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub<#1387 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AFRFAQ4CTSCRSITU25ZWDJ3RJJLL3ANCNFSM4LTADDZA>.

erichkeane · 2020-03-27T02:18:30Z

From what I undestand you've handled 3 types of kernel parameters here -
1. Accessor arrays

2. Arrays of other types

3. Class type with an accessor array as a member.
For 1. and 3., you added each element as a separate parameter in generated OpenCL Kernel and called init functions accordingly.

For 2. You wrapped the array inside a new class 'wrapped _array' and added this as the parameter.

Is this a gist of what you did? Am I missing anything? It would be helpful having this information in the commit message.

Hmm.. I cannot help but wonder what happens when multiple arrays are passed? Are they in separate types? Are those types necessary? I think LLVM-IR supports array-level parameters (despite C/C++ not ), so I wonder if we can do away with the type itself to avoid the obvious trouble that comes with passing it through a struct.

rdeodhar · 2020-03-27T03:07:38Z

Design document attached.
Array_Kernel_Parameters.docx

This reverts commit d357add. Signed-off-by: Vladimir Lazarev <[email protected]>

Signed-off-by: Alexander Batashev <[email protected]>

…1359) Signed-off-by: Konstantin S Bobrovsky <[email protected]>

…for (#1348) The kernel callable being invoked from an nd_range parallel_for is accepting an id argument, while it should be nd_item. After my analysis, I found we check arguments' type for kernel_parallel_for instead of parallel_for. But that check is useless, because the compiler can still find a candidate for kernel_parallel_for with nd_range and id which is a wrong combination. In my solution, parallel_for with nd_range calls kernel_parallel_for_nd_range(...) which is only available for nd_item. Signed-off-by: Bing1 Yu <[email protected]>

Implements a few code simplification/unification for LowerWGScope. Signed-off-by: Victor Lomuller <[email protected]>

) For NVPTX target address space inference for kernel arguments and allocas is happening in the backend (NVPTXLowerArgs and NVPTXLowerAlloca passes). After frontend these pointers are in LLVM default address space 0 which is the generic address space for NVPTX target. Perform address space cast of a pointer to the shadow global variable from the local to the generic address space before replacing all usages of a byval argument. Signed-off-by: Artur Gainullin <[email protected]>

- Adds static members to sub_group class. - sub_group member functions marked deprecated, to be removed later. - SPIR-V helpers expanded to convert SYCL group to SPIR-V scope. - Add workaround for half types Signed-off-by: John Pennycook <[email protected]>

Whereas it is not possible to generate vector of bools in FE, we have to change return type for corresponding instructions in SPIRV translator to vector of bools. SPIRV translator already did this for some instructions, this patch extends this behaviour to handle more instructions.

Adding doxygen documentation to PI CUDA backend. Some code is re-ordered in the file to help sorting the doxygen. Co-Authored-By: Alexey Bader <[email protected]> Co-Authored-By: Alexander Batashev <[email protected]> Co-Authored-By: Romanov Vlad <[email protected]> Signed-off-by: Ruyman Reyes <[email protected]>

Based on https://github.com/codeplaysoftware/standards-proposals/blob/master/spec-constant/index.md * [SYCL] PI changes: 1. Add specialization constant API to the SYCL RT Plugin Interface. New PI API added: pi_result piProgramSetSpecializationConstant(pi_program prog, pi_uint32 spec_id, size_t spec_size, const void *spec_value); 2. Add property set fields to the binary image descriptor, bump PI version. This change breaks backward binary compatibility of device binary image descriptors. 3. Add convenience C++ wrappers for PI binary image hierarchy objects. * [SYCL] Support device binary properties and file tables in the offload wrapper. 1. New option - "-properties=<file>". <file> must be a property set registry file, as defined by llvm/Support/PropertySetIO.h. The wrapper will add the property sets to the binary image descriptor and the them available to the runtime. 2. New options - "-batch". With this option the only input can be a file table, as defined by llvm/Support/SimpleTable.h. Column names are a part of interface between this tool and the sycl-post-link, which produces the file table. 3. Binary image descriptor LLVM type updated to resemble changes in Plugin Interface v1.2. * [SYCL] Specialization constants support in the Front End. 1. Detect kernel lambda object captures corresponding to specialization constants and (a) don't create kernel arguments for them (b) generate specializations of the SpecConstantInfo structure into the integration header. 2. Recognize the __unique_stable_name intrinsic and replace it with a string literal uniquely identifying the type of the typename template parameter to this intrinsic. 3. FE-related changes in the runtime: - new SpecConstantInfo templated struct for type->name translation for specialization constants used by integration header - define the __sycl_fe_getStableUniqueTypeName intrinsic * [SYCL] Add specialization constant support in SYCL runtime. 1. Define SYCL API (sycl/include/CL/sycl/experimental/spec_constant.hpp) 2. Add convenience C++ wrappers for PI device binary structures and refactor runtime to use the wrappers. Get rid of custom deleters for binary images. 3. Implement SYCL spec constant APIs in program an program manager. * [SYCL] Use file-table-tform in SYCL offload processing in clang driver. Clang driver's design can't handily model (1) multiple inputs/outputs in the action graph. Because of that, for example, sycl-post-link tool is invoked twice - once to to split the code and produce multiple bitcode files, and secondly - to generate symbol files for the split modules. (2) "Clusters" of inputs/outputs, when subsets of inputs/outputs are associated and describe different aspects of the same data. Example of such clustering is the split module + its symbol file above. Clustering would require support both in the driver and the tools invoked in response to actions. This commit moves SYCL offload processing to the "file table concept." sycl-post-link instead of (1) being invoked n times, once per each output type requested (once for device split and once for symbol file generation) (2) outputting multiple file lists each listing outputs from the corresponding invocation above is now invoked once and produces single file table output. E.g. [Code|Symbols|Properties] a_0.bc|a_0.sym|a_0.props a_1.bc|a_1.sym|a_1.props This solves both problems - multiple input/output and clustering. Combined with the file-table-tform tool, this allows for efficent handling of multiple clusters of files (each represented as a row in the table file) in the clang driver infrastructure. For example, there is a real offload processing problem: step1. sycl-post-link outputs N clusters of files step2. "Code" file of each cluster resuilting from step1 ({a_0.bc, a_1.bc} in the example above) must undergo further transformations - translation to SPIRV and optional ahead-of-time compilation. step3. In each cluster resulting from step1 the "Code" file needs to be replaced with the result of step2 step4. All the clusters are processed by the ClangOffloadWrapper tool, which needs to know how files are distributed into clusters and what is the roles of each file in a cluster - whether it is "Code", "Symbol" or "Properties". To solve this, the following action graph is constructed in the clang driver: column:"Code" t1 -> [file-table-tform:extract column] -> t1a -> [for-each:] -> t1b llvm-spirv aot-comp t1 \ column:"Code" [file-table-tform:replace column] -> t2 -> [ClangOffloadWrapper] / t1b where t1b is ["Code"] and t2 is [Code|Symbols|Properties] a_0.bin a_0.bin|a_0.sym|a_0.props a_1.bin a_1.bin|a_1.sym|a_1.props Note that the graph does not change with growing number of clusters, neither it changes when more files are added to each cluster (e.g. a "Manifest" file). * [SYCL] Process specialization constants in sycl-post-link tool. Add a spec constant lowering pass to sycl-post-link tool. Support file table output format. * [SYCL] Temporarily disable spec_const_hw.cpp on CPU. CPU OpenCL Runtime on build machines is not updated yet. Signed-off-by: Konstantin S Bobrovsky <[email protected]>

Signed-off-by: rdeodhar <[email protected]>

AlexeySachkov · 2020-03-30T10:35:28Z

Seems to be moved to #1423

rdeodhar and others added 9 commits March 24, 2020 14:32

[SYCL] Support for arrays as kernel parameters

6f524c9

Signed-off-by: rdeodhar <[email protected]>

[SYCL] Support for array kernel parameters

1945657

Signed-off-by: rdeodhar <[email protected]>

Conflict resolved

462cae0

[SYCL] Default work-group sizes based on max (#952)

b48f08f

Signed-off-by: Stuart Adams <[email protected]>

[SYCL] Remove unexpected double in complex SPIR-V for float support (#…

b51bb87

…1381) Signed-off-by: gejin <[email protected]>

[SYCL] Add experimental flag to enable front-end optimizations (#1376)

4eab55e

NOTE: This flag is not exposed to the driver and not intended for users. It's added to make experiments and identify issues with optimizations. Signed-off-by: Alexey Bader <[email protected]>

[SYCL] Fixed sub-buffer alloca search (#1385)

5d1d716

If found alloca command is not sub-buffer alloca, then it's parent alloca which has same context Signed-off-by: Ivan Karachun <[email protected]>

romanovvlad assigned Fznamznon Mar 25, 2020

romanovvlad requested a review from erichkeane March 25, 2020 15:28

Fznamznon reviewed Mar 25, 2020

View reviewed changes

[SYCL][Doc] First revision of standard layout relaxation extension (#…

ce53521

…1344) Signed-off-by: Michael Kinsner <[email protected]>

AlexeySachkov and others added 8 commits March 26, 2020 10:15

[SYCL][NFC] Unify setting kernel arguments (#1379)

899600b

Signed-off-by: Alexey Sachkov <[email protected]>

[Support] Implement a simple tabular data management library (#1358)

96de11d

The library allows to create, serialize/deserialize tables of strings, insert/delete/replace/rename columns, add rows. To be used in offload tools. Signed-off-by: Konstantin S Bobrovsky <[email protected]>

[SYCL] updated kernel array parameter support

09a4b44

Signed-off-by: rdeodhar <[email protected]>

rdeodhar requested a review from elizabethandrews March 27, 2020 02:15

rdeodhar requested a review from rolandschulz March 27, 2020 03:08

Revert "[SYCL] workaround for memory consumption issue"

568e0ad

This reverts commit d357add. Signed-off-by: Vladimir Lazarev <[email protected]>

Alexander Batashev and others added 11 commits March 27, 2020 12:28

[SYCL][Doc] Deploy Sphinx documentation (#1406)

59272d8

Signed-off-by: Alexander Batashev <[email protected]>

Add a file-table-tform tool: manipulating tabular string data files. (#…

6b44ebb

…1359) Signed-off-by: Konstantin S Bobrovsky <[email protected]>

[SYCL] Simplify handing of builtins in LowerWGScope (#1399)

dc7d851

Implements a few code simplification/unification for LowerWGScope. Signed-off-by: Victor Lomuller <[email protected]>

[SYCL] Support for array kernel parameters

c0e498b

Signed-off-by: rdeodhar <[email protected]>

[SYCL] Support for array kernel parameters

a4ee1f6

Signed-off-by: rdeodhar <[email protected]>

AlexeySachkov closed this Mar 30, 2020

		@@ -0,0 +1,259 @@
		// This test checks array kernel parameters

		// This test checks if compiler reports compilation error on an attempt to pass
		// a struct with non-trivially copyable type as SYCL kernel parameter.

[SYCL] Support for arrays as kernel parameters #1387

[SYCL] Support for arrays as kernel parameters #1387

Uh oh!

Conversation

rdeodhar commented Mar 24, 2020

Uh oh!

erichkeane commented Mar 25, 2020

Uh oh!

Fznamznon left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

elizabethandrews commented Mar 25, 2020

Uh oh!

elizabethandrews commented Mar 25, 2020

Uh oh!

rdeodhar commented Mar 27, 2020 via email

Uh oh!

erichkeane commented Mar 27, 2020 • edited by Fznamznon Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rdeodhar commented Mar 27, 2020

Uh oh!

AlexeySachkov commented Mar 30, 2020

Uh oh!

Uh oh!

erichkeane commented Mar 27, 2020 •

edited by Fznamznon

Loading