-
Notifications
You must be signed in to change notification settings - Fork 795
[SYCL][ESIMD][EMU] Queue/Event handling change for esimd_cpu #4201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][ESIMD][EMU] Queue/Event handling change for esimd_cpu #4201
Conversation
- Including esimd_cpu in work-around for revised queue::wait() logic same as level-zero, introduced in PR#4044 - Returning 'false' from ExecCGCommand:: producesPiEvent() predicate for esimd_cpu as no effective event is created during esimd_cpu kernel execution
Could you please clarify why this is necessary? What is the problem with "dummy" events if all operations are completed during Tagging @sergey-semenov |
Some ESIMD samples from intel/llvm-test-suite fails without this change - e.g. vadd_1d.
'DummyEvents' are still observed by Meanwhile, I have used ESIMD samples from intel/llvm-test-suite for enabling ESIMD_CPU support. So far, none of those samples has invoked |
Still not sure I understand. Can we return "dummy" events from piEnqueue* APIs and then just ignore them in all PI APIs that take events? |
|
I think we need to find the root cause before committing the patch. |
I think I found out solution. It has to be applied on top of the change set from PR#4011. As of August/19/2021, the PR is approved, but buildbot jobs fail due to missing packages required for online-building libva which is pre-requisite for CM. As soon as PR#4011 is merged into |
Could you please tell if we can create a dummy event during esimd_cpu kernel execution? |
Dummy event is already applied for llvm/sycl/plugins/esimd_cpu/pi_esimd_cpu.cpp Line 1084 in bf7eb4c
llvm/sycl/plugins/esimd_cpu/pi_esimd_cpu.cpp Line 1214 in bf7eb4c
llvm/sycl/plugins/esimd_cpu/pi_esimd_cpu.cpp Line 1309 in bf7eb4c
|
So, why we need to return |
- Event creation is imported from pi_cuda.cpp - Special handling for esimd_cpu in ExecCGCommand::producesPiEvent() (commands.cpp) is removed
@romanovvlad, your suggestion worked with my fully working branch for ESIMD_CPU. After suggestion applied, only one file is changed. Due to dependency on PR#4430, buildbots are failing during merge. I'll merge change sets from the PR to this PR when it is merged to |
Can I apply change from this PR to PR#4430 (#4430)? As this change set is limited to one file, and by applying the change, we could prevent possible failures in PR#4430 caused by not having changes from this PR. |
Just curious - why not merge this PR first? I'm OK with uniting the two PRs. |
It still has dependency on PR#4430 (4011 previously) as it changes some PI_API implementation. |
@@ -729,7 +736,9 @@ pi_result piQueueRelease(pi_queue Queue) { | |||
} | |||
|
|||
pi_result piQueueFinish(pi_queue) { | |||
DIE_NO_IMPLEMENTATION; | |||
// TODO/FIXME : Only in-order queue is called for this API. Support |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
piQueueFinish
must guarantee that all previously enqueued commands had completed. The fact that the queue is in-order does not guarantee that. In-order queue just means enqueued commands will execute in the enqueue order. Still, they can can execute asynchronously with the thread which enqueues the commands.
As far as I understand your current implementation, it is safe for piQueueFinish
to be no-op just because all the commands like kernel execution or memory copy are blocking - enqueue of any command with ESIMD EMU plugin does not return until the command is actually finished.
If so, please change the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the comment.
std::unique_ptr<_pi_event> RetEv{nullptr}; | ||
if (Event) { | ||
RetEv = std::unique_ptr<_pi_event>(new _pi_event()); | ||
RetEv->IsDummyEvent = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand your current implementation, with all commands being blocking, all events returned from enqueue operations are effectively in "completed" state once control is returned back to the caller of the enqueue operation.
- Given that I don't see the need in IsDummyEvent field.
- Second, the above code is not needed, instead, the line 1218 should be '*Event = new _pi_event();' plus maybe setting the event to 'completed' state.
The same applies to other places where events are created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@romanovvlad suggested this implementation from CUDA PI implementation from recent phone call. Also, he was fine with this 'DummyEvent' approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@romanovvlad suggested this implementation from CUDA PI implementation from recent phone call.
OK, now I see the point - unique_ptr is useful to automatically release the memory allocated by new _pi_even in case buf->CmBufferPtr->ReadSurface throws an exception. Disregard this my comment then.
Also, he was fine with this 'DummyEvent' approach.
DummyEvent
seems is redundant, as I explained above, even though harmless. So I don't mind leaving it as is for now.
As PR#4430 is almost complete without change from this PR, I'll keep this active and merge later. |
…eue_event_handling
Uh oh!
There was an error while loading. Please reload this page.