Skip to content

[SYCL][ESIMD][EMU] Queue/Event handling change for esimd_cpu #4201

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Sep 3, 2021
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
74 changes: 45 additions & 29 deletions sycl/plugins/esimd_cpu/pi_esimd_cpu.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -351,7 +351,8 @@ extern "C" {
std::cerr << "Warning : Not Implemented : " << __FUNCTION__ \
<< " - File : " << __FILE__; \
std::cerr << " / Line : " << __LINE__ << std::endl; \
}
} \
return PI_SUCCESS;

pi_result piPlatformsGet(pi_uint32 NumEntries, pi_platform *Platforms,
pi_uint32 *NumPlatforms) {
Expand Down Expand Up @@ -684,6 +685,12 @@ pi_result piContextRelease(pi_context Context) {

pi_result piQueueCreate(pi_context Context, pi_device Device,
pi_queue_properties Properties, pi_queue *Queue) {
if (Properties & PI_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE) {
// TODO : Support Out-of-order Queue
*Queue = nullptr;
return PI_INVALID_QUEUE_PROPERTIES;
}

cm_support::CmQueue *CmQueue;

int Result = Context->Device->CmDevicePtr->CreateQueue(CmQueue);
Expand Down Expand Up @@ -729,7 +736,10 @@ pi_result piQueueRelease(pi_queue Queue) {
}

pi_result piQueueFinish(pi_queue) {
DIE_NO_IMPLEMENTATION;
// No-op as enqueued commands with ESIMD_CPU plugin are blocking
// ones that do not return until their completion - kernel execution
// and memory read.
CONTINUE_NO_IMPLEMENTATION;
}

pi_result piextQueueGetNativeHandle(pi_queue, pi_native_handle *) {
Expand Down Expand Up @@ -1190,6 +1200,12 @@ pi_result piEnqueueMemBufferRead(pi_queue Queue, pi_mem Src,

_pi_buffer *buf = static_cast<_pi_buffer *>(Src);

std::unique_ptr<_pi_event> RetEv{nullptr};
if (Event) {
RetEv = std::unique_ptr<_pi_event>(new _pi_event());
RetEv->IsDummyEvent = true;
Copy link
Contributor

@kbobrovs kbobrovs Aug 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand your current implementation, with all commands being blocking, all events returned from enqueue operations are effectively in "completed" state once control is returned back to the caller of the enqueue operation.

  • Given that I don't see the need in IsDummyEvent field.
  • Second, the above code is not needed, instead, the line 1218 should be '*Event = new _pi_event();' plus maybe setting the event to 'completed' state.

The same applies to other places where events are created.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@romanovvlad suggested this implementation from CUDA PI implementation from recent phone call. Also, he was fine with this 'DummyEvent' approach.

Copy link
Contributor

@kbobrovs kbobrovs Sep 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@romanovvlad suggested this implementation from CUDA PI implementation from recent phone call.

OK, now I see the point - unique_ptr is useful to automatically release the memory allocated by new _pi_even in case buf->CmBufferPtr->ReadSurface throws an exception. Disregard this my comment then.

Also, he was fine with this 'DummyEvent' approach.

DummyEvent seems is redundant, as I explained above, even though harmless. So I don't mind leaving it as is for now.

}

int Status =
buf->CmBufferPtr->ReadSurface(reinterpret_cast<unsigned char *>(Dst),
nullptr, // event
Expand All @@ -1200,18 +1216,7 @@ pi_result piEnqueueMemBufferRead(pi_queue Queue, pi_mem Src,
}

if (Event) {
try {
*Event = new _pi_event();
} catch (const std::bad_alloc &) {
return PI_OUT_OF_HOST_MEMORY;
} catch (...) {
return PI_ERROR_UNKNOWN;
}

// At this point, CM already completed buffer-read (ReadSurface)
// operation. Therefore, 'event' corresponding to this operation
// is marked as dummy one and ignored during events-waiting.
(*Event)->IsDummyEvent = true;
*Event = RetEv.release();
}

return PI_SUCCESS;
Expand Down Expand Up @@ -1286,6 +1291,14 @@ pi_result piEnqueueMemImageRead(pi_queue CommandQueue, pi_mem Image,
assert(false && "ESIMD_CPU does not support Blocking Read");
}
_pi_image *PiImg = static_cast<_pi_image *>(Image);

std::unique_ptr<_pi_event> RetEv{nullptr};

if (Event) {
RetEv = std::unique_ptr<_pi_event>(new _pi_event());
RetEv->IsDummyEvent = true;
}

int Status =
PiImg->CmSurfacePtr->ReadSurface(reinterpret_cast<unsigned char *>(Ptr),
nullptr, // event
Expand All @@ -1295,18 +1308,7 @@ pi_result piEnqueueMemImageRead(pi_queue CommandQueue, pi_mem Image,
}

if (Event) {
try {
*Event = new _pi_event();
} catch (const std::bad_alloc &) {
return PI_OUT_OF_HOST_MEMORY;
} catch (...) {
return PI_ERROR_UNKNOWN;
}

// At this point, CM already completed image-read (ReadSurface)
// operation. Therefore, 'event' corresponding to this operation
// is marked as dummy one and ignored during events-waiting.
(*Event)->IsDummyEvent = true;
*Event = RetEv.release();
}
return PI_SUCCESS;
}
Expand Down Expand Up @@ -1360,25 +1362,39 @@ piEnqueueKernelLaunch(pi_queue Queue, pi_kernel Kernel, pi_uint32 WorkDim,
}
}

std::unique_ptr<_pi_event> RetEv{nullptr};

if (Event) {
RetEv = std::unique_ptr<_pi_event>(new _pi_event());
RetEv->IsDummyEvent = true;
}

switch (WorkDim) {
case 1:
InvokeImpl<1>::invoke(Kernel, GlobalWorkOffset, GlobalWorkSize,
LocalWorkSize);
return PI_SUCCESS;
break;

case 2:
InvokeImpl<2>::invoke(Kernel, GlobalWorkOffset, GlobalWorkSize,
LocalWorkSize);
return PI_SUCCESS;
break;

case 3:
InvokeImpl<3>::invoke(Kernel, GlobalWorkOffset, GlobalWorkSize,
LocalWorkSize);
return PI_SUCCESS;
break;

default:
DIE_NO_IMPLEMENTATION;
break;
}

if (Event) {
*Event = RetEv.release();
}

return PI_SUCCESS;
}

pi_result piextKernelCreateWithNativeHandle(pi_native_handle, pi_context, bool,
Expand Down