-
Notifications
You must be signed in to change notification settings - Fork 125
[UR][L0] Add support for passing device list to urProgramBuild/Link/Compile #934
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[UR][L0] Add support for passing device list to urProgramBuild/Link/Compile #934
Conversation
…Link/Compile pre-commit check of oneapi-src/unified-runtime#934 piProgramBuild receives a list of devices, while urProgramBuild does not. This produces a series of issues when a UR program needs to be created for a specific device. So define a new API, called urProgramBuildExp to pass this list. Signed-off-by: Spruit, Neil R <[email protected]>
6e8d894
to
5f8335c
Compare
5f8335c
to
272406e
Compare
thanks @nrspruit . Changes look good. Please check failures. |
see #924 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CUDA/HIP 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll need to bring in the spec changes for this first, then add these entry points to the loader dispatch tables.
cca8e98
to
ca1829b
Compare
Hello @kbenzie , With #924 (comment) 924 merged, we need to merge this change, then update the PR intel/llvm#11464 with the updated unified runtime version to merge the fix to the latest version. |
ze_context_handle_t ZeContext = Program->Context->ZeContext; | ||
std::ignore = Context; | ||
std::ignore = numDevices; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd remove Context
and numDevices
in function definition, i.e. to have argument type but not argument name. If it's not used in the function body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @mmoadeli . I would prefer to leave them. That way, urProgramBuild
and urProgramBuildExp
are as compatible as possible, allowing us to see if in the future we can merge the two into a single API, or find other use cases.
For instance, if in urProgramBuildExp
numDevices is 0, then the program could be built for all devices in the Context instead, which is kind of falling back again to urProgramBuild.
what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @mmoadeli . I would prefer to leave them. That way,
urProgramBuild
andurProgramBuildExp
are as compatible as possible, allowing us to see if in the future we can merge the two into a single API, or find other use cases.For instance, if in
urProgramBuildExp
numDevices is 0, then the program could be built for all devices in the Context instead, which is kind of falling back again to urProgramBuild.
what do you think?
I did not mean to change the API, only changing it to below in this case, to have the type for consistency and not to have the name to avoid using std::ignore
.
UR_APIEXPORT ur_result_t UR_APICALL urProgramBuildExp(
ur_context_handle_t, ///< [in] handle of the context instance.
ur_program_handle_t Program, ///< [in] Handle of the program to build.
uint32_t, ur_device_handle_t *phDevices,
const char *Options ///< [in][optional] pointer to build options
///< null-terminated string.
)
The reason this can't be merged yet is because the spec changes to add these entry point don't exist in the adapters branch. We'd like to get those changes into the adapters branch with #1072 but it depends on #1014 and #1016. Both of these are pulling in lots of extra stuff which is prooving to be a real pain to integrate, especially the Windows E2E test failures in intel/llvm#11771 (to pull in the changes in #1014). Any help you can provide getting these to merge faster will mean this can be merged sooner. |
Since we've had trouble with getting the spec changes from main into the adapters branch we are going with plan B which is to cherry-pick the spec changes needed for this PR to the adapters branch in #1083 instead. |
I'm going to pull #1083 into this branch and also add another commit to include these entry points in the UR loader DDI table. Having these in one place will make it easier to pull into intel/llvm once we get a chance to do so (currently there are issues in CI). FYI this will involve force pushing to your branch. |
piProgramBuild receives a list of devices, while urProgramBuild does not. This produces a series of issues when a UR program needs to be created for a specific device. So define a new API, called urProgramBuildExp to pass this list. Authored-by: [email protected]
Expand upon the introduction of `urProgramBuildExp` and include `urProgramCompileExp` and `urProgramLinkExp` which include a device-list in place of a context. These more closely align with the PI/OpenCL analogues but only to introduce device-lists, not all extant arguments from those entry-points. This patch also moves the `urProgramBuildExp` definition into an experimental feature file and introduces a brief document containing motivation.
…ompile piProgramBuild receives a list of devices, while urProgramBuild does not. This produces a series of issues when a UR program needs to be created for a specific device. So define a new API, called urProgramBuildExp to pass this list. Signed-off-by: Spruit, Neil R <[email protected]> Co-authored-by: Jaime Arteaga <[email protected]>
Add the `urProgramBuildExp`, `urProgramCompileExp`, and `urProgramLinkExp` to the loader `ur_program_exp_dditable_t`. Also add `"ur_exp_multi_device_compile"` to the list of extensions supported by the L0 adapter, enables the SYCL RT to query support.
11ad98c
to
26b9829
Compare
Align the `urProgramLinkExp` spec and implementation argument orders to fix Windows link error.
So I'm learning that these implementations have different arguments to the entry-points defined in the spec. |
@nrspruit : could you check? |
0f29db7
to
9c17051
Compare
Apologies, yes the implementation was for the earlier version of the API, I have fixed it now and the build is passing. Please confirm that I did not miss anything. |
@@ -196,6 +196,9 @@ class ur_function_v(IntEnum): | |||
ADAPTER_RETAIN = 179 ## Enumerator for ::urAdapterRetain | |||
ADAPTER_GET_LAST_ERROR = 180 ## Enumerator for ::urAdapterGetLastError | |||
ADAPTER_GET_INFO = 181 ## Enumerator for ::urAdapterGetInfo | |||
PROGRAM_BUILD_EXP = 197 ## Enumerator for ::urProgramBuildExp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont we have all the changes in spec already merged in #924? I would expect this PR to only need to modify code in the L0 adapter, as the spec part has been already taken care of. Or is that this is not rebased?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#924 was merged to main not adapters, we created #1083 to cherry-pick the spec changes into adapters since we were having trouble syncing adapters and main. I then made the decision to take the cherry-pick from #1083 into this PR to reduce the number of intel/llvm PR's required to get this feature merged in time for the next release.
9c17051
to
20e8ee6
Compare
…nition to match spec Signed-off-by: Spruit, Neil R <[email protected]>
20e8ee6
to
0790bf8
Compare
…ompile piProgramBuild receives a list of devices, while urProgramBuild does not. This produces a series of issues when a UR program needs to be created for a specific device. So define a new API, called urProgramBuildExp to pass this list. Requires related patch in Unified Runtime Adapters here: oneapi-src/unified-runtime#934 Signed-off-by: Spruit, Neil R <[email protected]>
…ompile (#11464) piProgramBuild receives a list of devices, while urProgramBuild does not. This produces a series of issues when a UR program needs to be created for a specific device. So define a new API, called urProgramBuildExp to pass this list. Requires related patch in Unified Runtime Adapters here: oneapi-src/unified-runtime#934 --------- Signed-off-by: Spruit, Neil R <[email protected]> Co-authored-by: Kenneth Benzie (Benie) <[email protected]>
…ompile (intel#11464) piProgramBuild receives a list of devices, while urProgramBuild does not. This produces a series of issues when a UR program needs to be created for a specific device. So define a new API, called urProgramBuildExp to pass this list. Requires related patch in Unified Runtime Adapters here: oneapi-src/unified-runtime#934 --------- Signed-off-by: Spruit, Neil R <[email protected]> Co-authored-by: Kenneth Benzie (Benie) <[email protected]>
piProgramBuild receives a list of devices, while urProgramBuild does not. This produces a series of issues when a UR program needs to be created for a specific device.
So define a new API, called urProgramBuildExp to pass this list.