[SYCL][DOC] Initial commit of oneapi extension proposal for adding P2P #6104
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,158 @@ | ||
= sycl_ext_oneapi_peer_access | ||
|
||
:source-highlighter: coderay | ||
:coderay-linenums-mode: table | ||
|
||
// This section needs to be after the document title. | ||
:doctype: book | ||
:toc2: | ||
:toc: left | ||
:encoding: utf-8 | ||
:lang: en | ||
:dpcpp: pass:[DPC++] | ||
|
||
// Set the default source code type in this document to C++, | ||
// for syntax highlighting purposes. This is needed because | ||
// docbook uses c++ and html5 uses cpp. | ||
:language: {basebackend@docbook:c++:cpp} | ||
|
||
|
||
== Notice | ||
|
||
[%hardbreaks] | ||
Copyright (C) 2022-2022 Intel Corporation. All rights reserved. | ||
|
||
Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks | ||
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by | ||
permission by Khronos. | ||
|
||
|
||
== Contact | ||
|
||
To report problems with this extension, please open a new issue at: | ||
|
||
https://github.com/intel/llvm/issues | ||
|
||
|
||
== Dependencies | ||
|
||
This extension is written against the SYCL 2020 revision 5 specification. All | ||
references below to the "core SYCL specification" or to section numbers in the | ||
SYCL specification refer to that revision. | ||
|
||
== Status | ||
|
||
This is a proposed extension specification, intended to gather community | ||
feedback. Interfaces defined in this specification may not be implemented yet | ||
or may be in a preliminary state. The specification itself may also change in | ||
incompatible ways before it is finalized. *Shipping software products should | ||
not rely on APIs defined in this specification.* | ||
|
||
|
||
== Overview | ||
|
||
This extension adds support for mechanisms to query and enable support for | ||
memory access between peer devices in a system. | ||
In particular, this allows one device to access USM Device allocations | ||
for a peer device. This extension does not apply to USM Shared allocations. | ||
Peer to peer capabilities are useful as they can provide | ||
access to a peer device's memory inside a compute kernel and optimized memory | ||
copies between peer devices. | ||
|
||
== Specification | ||
|
||
=== Feature test macro | ||
|
||
This extension provides a feature-test macro as described in the core SYCL | ||
specification. An implementation supporting this extension must predefine the | ||
macro `SYCL_EXT_ONEAPI_PEER_ACCESS` to one of the values defined in the table | ||
below. Applications can test for the existence of this macro to determine if | ||
the implementation supports this feature, or applications can test the macro's | ||
value to determine which of the extension's features the implementation | ||
supports. | ||
|
||
[%header,cols="1,5"] | ||
|=== | ||
|Value | ||
|Description | ||
|
||
|1 | ||
|Initial version of this extension. | ||
|=== | ||
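
As an illustration of how an application might test the macro, here is a minimal sketch. The helper names `peer_access_extension_version` and `has_peer_access_extension` are hypothetical and not part of the proposal; in real code the SYCL header would be included first so that the implementation has a chance to define the macro.

```cpp
// Hypothetical helper: returns the value of SYCL_EXT_ONEAPI_PEER_ACCESS,
// or 0 when the implementation does not define the macro.
constexpr int peer_access_extension_version() {
#ifdef SYCL_EXT_ONEAPI_PEER_ACCESS
  return SYCL_EXT_ONEAPI_PEER_ACCESS;
#else
  return 0;
#endif
}

// Hypothetical helper: true when at least version 1 (the initial version
// listed in the table above) is available.
constexpr bool has_peer_access_extension() {
  return peer_access_extension_version() >= 1;
}
```

Code that calls the peer access member functions could then be guarded with `#ifdef SYCL_EXT_ONEAPI_PEER_ACCESS` so that it still compiles on implementations without the extension.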
|
||
|
||
=== Peer to Peer (P2P) Memory Access APIs | ||
|
||
This extension adds support for mechanisms to query and enable support for | ||
direct memory access between peer devices in a system. | ||
In particular, this allows one device to directly access USM Device | ||
allocations for a peer device in the same context. | ||
Review comment: If two devices with P2P capabilities are placed in the same context, shouldn't this be implicitly enabled?

Reply: There has been a lot of discussion about what a context means. I think our current consensus is that it does not provide any guarantee about P2P access between devices. Therefore, placing two devices in the same context does not provide any guarantee that USM memory allocated for one of those devices is accessible from another device in that same context. See the discussion in internal Khronos issue 563.
||
Peer to peer capabilities are useful as they can provide access to a peer | ||
device's memory inside a compute kernel and also optimized memory copies between | ||
peer devices. | ||
|
||
This extension adds the `peer_access` enumeration and the following new member | |
functions to the `device` class, as described below. | |
|
||
[source,c++] | ||
---- | ||
namespace sycl { | ||
namespace ext { | ||
namespace oneapi { | ||
enum class peer_access { | ||
access_supported, | ||
access_enabled, | ||
atomics_supported, | ||
}; | ||
} // namespace oneapi | ||
} // namespace ext | ||
|
||
class device { | ||
public: | ||
bool ext_oneapi_can_access_peer(const device &peer, | ||
ext::oneapi::peer_access value = | ||
ext::oneapi::peer_access::access_supported); | ||
void ext_oneapi_enable_peer_access(const device &peer); | ||
void ext_oneapi_disable_peer_access(const device &peer); | ||
}; | ||
|
||
} // namespace sycl | ||
---- | ||
|
||
The semantics of the new functions are: | ||
|
||
|=== | ||
|Member Function |Description | ||
|
||
|bool ext_oneapi_can_access_peer(const device &peer, | ||
ext::oneapi::peer_access value = | ||
ext::oneapi::peer_access::access_supported) | ||
a|Queries the peer access status between this device and `peer` according to | ||
the query `value`: | ||
|
||
* `ext::oneapi::peer_access::access_supported`: Returns true only if it is | ||
possible for this device to enable peer access to USM device memory allocations | ||
located on the `peer` device. | ||
|
||
* `ext::oneapi::peer_access::access_enabled`: Returns true only if peer access is | ||
currently enabled from this device to the `peer` device. | ||
|
||
* `ext::oneapi::peer_access::atomics_supported`: When this query returns true, | ||
it indicates that this device may perform atomic operations on USM device memory | ||
allocations located on the `peer` device when peer access is enabled to that | ||
device. If the query returns false, attempting to perform atomic operations on | ||
`peer` memory will have undefined behavior. | ||
Review comment: The core SYCL spec makes a distinction between "atomic operations" and "concurrent access". The Level Zero driver has separate queries for these two concepts. We need to clarify what [...] This is an area we are debating in general, though, so we may end up making two different queries for these concepts.

Reply: I don't think concurrent access comes into play here - I think it's only (pseudocode) atomicAdd(ptr, val)

Reply: Atomic operations only make sense if two things can access the memory concurrently. I guess there are two possible interpretations for what [...] I was originally thinking the query meant (1), but your comment makes me think that maybe you intend (2)?

Reply: Another thing that we should pay attention to here is the concept of memory scope. If the device and [...] If the device is only accessing [...] I don't know whether it's better to use the atomics & concurrent distinction or to work in some concept of scope, but I agree with Greg that this needs to clarify exactly what is guaranteed.

Reply: Would it help to add a new extended memory scope like [...]
||
|
||
|void ext_oneapi_enable_peer_access(const device &peer) | |
|Enables this device to access USM device allocations located on the peer | ||
device. This does not permit the peer device to access this device's memory. | ||
This device must be in the same context as the allocations being accessed. | ||
|
||
Throws an exception if access cannot be enabled or if access is already | ||
enabled. | ||
|
||
|
||
|void ext_oneapi_disable_peer_access(const device &peer) | |
|Disables access to the peer device's memory from this device. Throws an | ||
exception if access cannot be disabled or if access is not enabled. | ||
|
||
|
||
|=== | ||
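
The semantics in the table above can be modeled without a SYCL runtime. The sketch below is an assumption-laden illustration, not the implementation: `mock_device` is a hypothetical stand-in for `sycl::device`, `mock_set_capability` is an invented test hook, and `std::runtime_error` stands in for whatever exception type the extension ends up specifying. It captures three points from the table: access is directional, enabling fails if access is unsupported or already enabled, and disabling fails if access is not enabled.

```cpp
#include <set>
#include <stdexcept>

// Mirrors the query names in the proposal (declared locally for the sketch).
enum class peer_access { access_supported, access_enabled, atomics_supported };

// Hypothetical model of a device. A real implementation would consult the
// backend driver; here the capabilities are configured by hand.
class mock_device {
public:
  explicit mock_device(int id) : id_(id) {}

  // Test hook, not part of the proposal: declare that this device is able
  // to access `peer`, optionally with atomics.
  void mock_set_capability(const mock_device &peer, bool atomics) {
    supported_.insert(peer.id_);
    if (atomics)
      atomics_.insert(peer.id_);
  }

  bool ext_oneapi_can_access_peer(
      const mock_device &peer,
      peer_access value = peer_access::access_supported) const {
    switch (value) {
    case peer_access::access_supported:
      return supported_.count(peer.id_) != 0;
    case peer_access::access_enabled:
      return enabled_.count(peer.id_) != 0;
    case peer_access::atomics_supported:
      return atomics_.count(peer.id_) != 0;
    }
    return false;
  }

  // Throws if access cannot be enabled or is already enabled.
  void ext_oneapi_enable_peer_access(const mock_device &peer) {
    if (supported_.count(peer.id_) == 0)
      throw std::runtime_error("peer access cannot be enabled");
    if (!enabled_.insert(peer.id_).second)
      throw std::runtime_error("peer access is already enabled");
  }

  // Throws if access is not currently enabled.
  void ext_oneapi_disable_peer_access(const mock_device &peer) {
    if (enabled_.erase(peer.id_) == 0)
      throw std::runtime_error("peer access is not enabled");
  }

private:
  int id_;
  std::set<int> supported_; // peers this device could access
  std::set<int> atomics_;   // peers on which atomics are supported
  std::set<int> enabled_;   // peers for which access is currently enabled
};
```

A caller following this model would query `access_supported` first, enable access once, and treat a second enable or an unmatched disable as an error. Enabling access from device A to device B leaves the reverse direction untouched.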
|