-
Notifications
You must be signed in to change notification settings - Fork 795
[SYCL][Graph] async_malloc use allocation size for zeVirtualMemQueryPageSize #19402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][Graph] async_malloc use allocation size for zeVirtualMemQueryPageSize #19402
Conversation
0a4f4c7
to
f9a4616
Compare
f9a4616
to
4befbf0
Compare
4befbf0
to
9475527
Compare
sycl/test-e2e/Graph/AsyncAlloc/Inputs/async_alloc_different_sizes.cpp
Outdated
Show resolved
Hide resolved
482ce66
to
2b622d3
Compare
2b622d3
to
889a26a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
sycl/test-e2e/Graph/AsyncAlloc/Inputs/async_alloc_different_sizes.cpp
Outdated
Show resolved
Hide resolved
sycl/test-e2e/Graph/AsyncAlloc/Explicit/async_alloc_different_sizes.cpp
Outdated
Show resolved
Hide resolved
af670a0
to
6dee5d6
Compare
@intel/llvm-reviewers-runtime ping |
sycl/source/detail/virtual_mem.hpp
Outdated
size_t get_mem_granularity_for_allocation_size(const device &SyclDevice, | ||
const context &SyclContext, | ||
granularity_mode Mode, | ||
size_t AllocationSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the one hand, I'd like us to move to a future where sycl::<object>
aren't used inside source/detail
and instead code there operates on <object>_impl
s only. On the other hand, implementation isn't under source/detail
(why?) and get_ur_handles
is designed to work on sycl::<object>
so changing signature would probably make the code uglier.
As such, I think the "why?" above is what really matters here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not feel competent enough to answer. I've kept implementation in the place where it existed so far. I see no clear criteria why some .cpp files are in sycl/source and other in sycl/source/detail
@aelovikov-intel , do you want us to work on this further in this PR or can we merge without going deep into this? If we can merge, do you have permissions to do so or shall I ask someone else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's change this new overload to accept device_impl &
and context_impl &
. You'd need to add another overload to helpers at
llvm/sycl/source/detail/context_impl.hpp
Lines 364 to 384 in 004f38e
// We're under sycl/source and these won't be exported but it's way more | |
// convenient to be able to reference them without extra `detail::`. | |
inline auto get_ur_handles(sycl::detail::context_impl &Ctx) { | |
ur_context_handle_t urCtx = Ctx.getHandleRef(); | |
return std::tuple{urCtx, &Ctx.getAdapter()}; | |
} | |
inline auto get_ur_handles(const sycl::context &syclContext) { | |
return get_ur_handles(*sycl::detail::getSyclObjImpl(syclContext)); | |
} | |
inline auto get_ur_handles(const sycl::device &syclDevice, | |
const sycl::context &syclContext) { | |
auto [urCtx, Adapter] = get_ur_handles(syclContext); | |
ur_device_handle_t urDevice = | |
sycl::detail::getSyclObjImpl(syclDevice)->getHandleRef(); | |
return std::tuple{urDevice, urCtx, Adapter}; | |
} | |
inline auto get_ur_handles(const sycl::device &syclDevice) { | |
auto &implDevice = *sycl::detail::getSyclObjImpl(syclDevice); | |
ur_device_handle_t urDevice = implDevice.getHandleRef(); | |
return std::tuple{urDevice, &implDevice.getAdapter()}; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I've applied. Please check the last commit if this is what you wanted.
@intel/llvm-gatekeepers , could you please merge? There are all approvals and the only one failed job is validate_email, which is due to faulty CI change from yesterday (I've checked with @uditagarwal97 - author of the checker change that there is no problem with my email). Please merge. |
In L0 we need to call zeVirtualMemQueryPageSize with the actual allocation size for the virtual/physical allocations to align correctly.
Right now we check alignment without passing any size: https://github.com/intel/llvm/blob/sycl/sycl/source/detail/graph/memory_pool.cpp#L45
This ends up translating to 1 byte in the call to L0: https://github.com/oneapi-src/unified-runtime/blob/de05f984aa19458a4993d2a2709e3b79d82f1a37/source/adapters/level_zero/virtual_mem.cpp#L32-L37 and for large allocations a wrong alignment is used and L0 reports ZE_RESULT_ERROR_UNSUPPORTED_SIZE upon zePhysicalMemCreate call (UR fails with UR_RESULT_ERROR_INVALID_VALUE then).
The UR API should change to accept a size.
This PR exposes this issue in a unittest and fixes it.