Skip to content

[SYCL] Add offload wrapping for SYCL kind. #2

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

maksimsab
Copy link
Owner

No description provided.

@@ -54,7 +54,7 @@ __attribute__((visibility("protected"), used)) int x;
// RUN: clang-offload-packager -o %t.out \
// RUN: --image=file=%t.spirv.bc,kind=sycl,triple=spirv64-unknown-unknown,arch=generic
// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o -fembed-offload-object=%t.out
// RUN: not clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run \
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asudarsa
I wonder, why has it been not'ed?

case OFK_SYCL:
return EmptyRes;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asudarsa
Do I get it right, that we don't need any bundling for SYCL?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that we do the same as OpenMP here. Just put each image inside a MemoryBuffer and then we can read from these MemoryBuffers inside wrapDeviceImages. We do not want to change the signature of wrapDeviceImages.

@@ -47,6 +47,7 @@ enum ImageKind : uint16_t {
IMG_Cubin,
IMG_Fatbinary,
IMG_PTX,
IMG_SPIRV,
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asudarsa.
ImageKind and OffloadKind are being encoded in the OffloadWrapper and this is going to be a part of ABI.
I see that you have changed OffloadKind to powers of 2. We need to settle some consistency here.

Copy link

@jzc jzc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, just a couple comments

Comment on lines 337 to 338
auto *PropertiesConstant = addStringToModule(
OI.StringData.lookup("properties"), "__sycl_properties");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was imagining that the the JSON properties would be deserialized here and constructed in the same way as the downstream wrapper (i.e. I don't think it's a good idea to delegate JSON decoding to SYCL runtime). Can this to nullptr for now?

return OffloadBinaryOrErr.takeError();

OffloadBinaries.emplace_back(std::move(*OffloadBinaryOrErr));
Images.emplace_back(OffloadBinaries.back()->getOffloadingImage());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asudarsa Does there need to be one more layer of unwrapping, or am I misunderstanding how the process works? My understanding is that clang-sycl-linker outputs an a list of OffloadingImages concatenated together. Then, clang-linker-wrapper takes the output from the device linker and then puts it in an OffloadingImage. So in the SYCL case, this functinon receives a list of OffloadingImages, with each of these OffloadingImages holding a list of OffloadingImages concatenated together?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. My latest change addresses that.

maksimsab pushed a commit that referenced this pull request Jun 10, 2025
Fixes llvm#123300

What is seen 
```
clang-repl> int x = 42;
clang-repl> auto capture = [&]() { return x * 2; };
In file included from <<< inputs >>>:1:
input_line_4:1:17: error: non-local lambda expression cannot have a capture-default
    1 | auto capture = [&]() { return x * 2; };
      |                 ^
zsh: segmentation fault  clang-repl --Xcc="-v"

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
  * frame #0: 0x0000000107b4f8b8 libclang-cpp.19.1.dylib`clang::IncrementalParser::CleanUpPTU(clang::PartialTranslationUnit&) + 988
    frame #1: 0x0000000107b4f1b4 libclang-cpp.19.1.dylib`clang::IncrementalParser::ParseOrWrapTopLevelDecl() + 416
    frame #2: 0x0000000107b4fb94 libclang-cpp.19.1.dylib`clang::IncrementalParser::Parse(llvm::StringRef) + 612
    frame llvm#3: 0x0000000107b52fec libclang-cpp.19.1.dylib`clang::Interpreter::ParseAndExecute(llvm::StringRef, clang::Value*) + 180
    frame llvm#4: 0x0000000100003498 clang-repl`main + 3560
    frame llvm#5: 0x000000018d39a0e0 dyld`start + 2360
```

Though the error is justified, we shouldn't be interested in exiting
through a segfault in such cases.

The issue is that empty named decls weren't being taken care of
resulting into this assert


https://github.com/llvm/llvm-project/blob/c1a229252617ed58f943bf3f4698bd8204ee0f04/clang/include/clang/AST/DeclarationName.h#L503

Can also be seen when the example is attempted through xeus-cpp-lite.


![image](https://github.com/user-attachments/assets/9b0e6ead-138e-4b06-9ad9-fcb9f8d5bf6e)
maksimsab pushed a commit that referenced this pull request Jun 10, 2025
…142952)

This was removed in llvm#135343 in
favour of making it a format variable, which we do here. This follows
the precedent of the `[opt]` and `[artificial]` markers.

Before:
```
 thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
 * frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3
   frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
   frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43
   frame llvm#3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
   frame llvm#4: 0x0000000186345be4 dyld`start + 7040
```

After (note the `[inlined]` markers):
```
thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
* frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3 [inlined]
  frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
  frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43 [inlined]
  frame llvm#3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
  frame llvm#4: 0x0000000186345be4 dyld`start + 7040
```

rdar://152642178
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants