diff --git a/sycl/doc/extensions/ExtensionMechanism/SYCL_INTEL_extension_api.asciidoc b/sycl/doc/extensions/ExtensionMechanism/SYCL_INTEL_extension_api.asciidoc new file mode 100644 index 000000000000..efd631502731 --- /dev/null +++ b/sycl/doc/extensions/ExtensionMechanism/SYCL_INTEL_extension_api.asciidoc @@ -0,0 +1,620 @@ += SYCL_INTEL_extension_api +:source-highlighter: coderay +:coderay-linenums-mode: table + +// This section needs to be after the document title. +:doctype: book +:toc2: +:toc: left +:encoding: utf-8 +:lang: en + +:blank: pass:[ +] + +// Set the default source code type in this document to C++, +// for syntax highlighting purposes. This is needed because +// docbook uses c++ and html5 uses cpp. +:language: {basebackend@docbook:c++:cpp} + +== Introduction +IMPORTANT: This specification is a draft. + +NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are +trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. +used by permission by Khronos. + +NOTE: This document is better viewed when rendered as html with asciidoctor. +GitHub does not render image icons. + +This document describes a formal way for the Khronos organization and vendors +to add extensions to the SYCL language. + + +== Name Strings + ++SYCL_INTEL_extension_api+ + +== Notice + +Copyright (c) 2020 Intel Corporation. All rights reserved. + +== Status + +Working Draft + +This is a preview extension specification, intended to provide early access to +a feature for review and community feedback. When the feature matures, this +specification may be released as a formal extension. + +Because the interfaces defined by this specification are not final and are +subject to change they are not intended to be used by shipping software +products. + +== Version + +Built On: {docdate} + +Revision: 1 + +== Contact + +Greg Lueck, Intel (gregory 'dot' m 'dot' lueck 'at' intel 'dot' com) + +== Dependencies + +This extension is written against the SYCL 1.2.1 specification revision 6, +and it assumes some changes from the +https://github.com/KhronosGroup/SYCL-Shared/blob/master/proposals/sycl_generalization.md[ +Generalization of the SYCL specification] proposal (such as the new `sycl` +namespace. However, see the section below for some suggested changes to that +proposal. + + +== Description and Motivation + +This proposal attempts to solve several related problems. The SYCL standard +needs a defined way for the Khronos SYCL group to add extensions that are not +part of the core specification (aka KHRs). A vendor would not need to +implement such an extension to be compliant with the SYCL specification. +Vendors also need a way to add their own extensions. In addition, we need a +way to define extensions (or features in general) that are implemented only +on certain devices. Finally, application developers need a way to test whether +an extension is available, and they need a way to safely use extensions that +are device-specific. + +Vendors may choose to define extensions in order to expose custom features or +to gather feedback on an API that is not yet ready for inclusion in the core +specification. Since the APIs for extensions may change as feedback is +gathered, the extension mechanism includes a way for application developers to +test for the API version of each extension. Once a vendor extension has +stabilized, vendors are encouraged to promote it to a future version of the +core SYCL specification. Thus, extensions can be viewed as a pipeline of +features for consideration in future SYCL versions. + +=== Documentation for Extensions + +This proposal describes the _mechanism_ for releasing extensions. It does not +describe any particular extension. The proposal for the mechanism is expected +to be included as part of the core SYCL specification, however separate +documents will govern the documentation of specific extensions. + +If the Khronos SYCL group ratifies an extension, it will be published in a +document that is separate from the core SYCL specification, allowing more +frequent updates to extensions than to the core specification. Vendors are +responsible for documenting their own extensions, and they can do so without +coordination with the Khronos SYCL group. + +=== Definition of an Extension + +An extension can be implemented by adding new types or free functions in a +specific namespace, by adding functionality to an existing class that is +defined in the core SYCL specification, or through a combination of the two. + +New types or free functions for Khronos ratified extensions are defined in the +namespace `sycl::khr::`. For example, `sycl::khr::fancy` could +be the namespace for a Khronos extension named "fancy" + +If a vendor specific extension adds new types or free functions, the vendor is +encouraged to define them in the namespace `sycl::ext::` and they +are encouraged to add another namespace layer according to the name of the +extension. For example, `sycl::ext::acme::fancier` could be the namespace for +an extension from the Acme vendor. However, vendors may also choose to define +new types and free functions in another top-level namespace that is outside of +`sycl`. This might be more appropriate, for example, when an extension +integrates features from an existing non-SYCL API. A vendor may not define new +types or free functions underneath `sycl`, unless they are in +`sycl::ext::`. + +[NOTE] +==== +Vendors are discouraged from defining top level namespaces that start with the +word "sycl" because we believe that application developers may want to use +namespaces like this as namespace aliases. +==== + +Extensions may only add functionality to existing SYCL classes in a limited +way. When a Khronos ratified extension needs to add functionality to an +existing class, it does so by adding a method named `khr()` to that class. For +example, an extension on the `device` class would add a method like this: + +[source,c++,NoName,linenums] +---- +class device { + // ... + sycl::khr::device khr(); +}; +---- + +The `khr()` method returns an object, and that object provides methods that +are part of the extension. + +Likewise, a vendor specific extension may add functionality to an existing +SYCL class by adding a method named `ext_()` (e.g. `ext_acme()` +for the Acme vendor) like this: + +[source,c++,NoName,linenums] +---- +class device { + // ... + sycl::ext::acme::device ext_acme(); +}; +---- + +One motivation for this pattern is to reduce verbosity of application code +that uses an extension and to facilitate application migration when an +extension is promoted to the core SYCL specification. Consider the following +application code: + +[source,c++,NoName,linenums] +---- +void foo(sycl::device dev) { + dev.ext_acme().fancy(); +}; +---- + +If the extension "fancy" is later promoted to the core SYCL specification, the +application need only remove the call to `ext_acme()` in order to migrate the +application. + +If an extension requires {cpp} attributes, those attributes must be defined in an +attribute namespace that corresponds to the regular namespace rules described +above. Thus, the attribute namespace `sycl` is reserved for {cpp} attributes +used in the core SYCL specification, `sycl::khr` is reserved for Khronos +ratified extensions, and `sycl::ext::` is reserved for vendor +extensions. + +Applications must include a special header file in order to get declarations +for the types and free functions of an extension. Each Khronos ratified +extension has an associated header named `SYCL/khr/.hpp`. + +The include path `SYCL/ext/` is reserved for vendor extensions. +Vendors can choose to provide a single header for all extensions or to provide +separate headers for each extension. For example, the Acme vendor could +provide the header `SYCL/ext/acme/extensions.hpp` for access to all of its +extensions. As with namespaces, vendors are encouraged to define header files +in `SYCL/ext/`, but a vendor may also define header files in +another file system path that is outside of the `SYCL` directory. Vendors may +not define header files in the `SYCL` path unless they are underneath +`SYCL/ext/`. + +=== Predefined Macros + +Each Khronos ratified extension has a corresponding feature test macro of the +form `SYCL_KHR_` whose value follows the {cpp}20 +https://isocpp.org/std/standing-documents/sd-6-sg10-feature-test-recommendations[ +language feature test macros] specification, where the value is a number with +6 decimal digits (YYYYMM) identifying the year and month the extension was +first adopted or the date the extension was last updated. A vendor must +predefine this macro only if it implements the extension, so applications can +use the macro in order to determine if it is available. + +If an implementation provides a vendor specific extension, it should also +predefine a feature test macro of the form +`SYCL_EXT__` (e.g. `SYCL_EXT_ACME_FANCY`). The +value of the macro must be an integer that monotonically increases for each +version of the extension, and vendors are encouraged to use the same YYYYMM +format described above. + +Implementations that provide a vendor specific extension must also predefine a +macro of the form `SYCL_EXT_` (e.g. `SYCL_EXT_ACME`), which +applications can use to determine whether they are being compiled by that +vendor's toolchain. + +=== Global vs. Device Specific Extensions + +Extensions fall into one of two categories: global or device specific. If a +global extension is implemented, it is available regardless of which devices +the application runs on. These include extensions that are available in host +code as well as extensions that are available across all devices. Applications +can test for the presence of a global extension at compile time by using its +feature test macro. + +Device specific extensions can only be used inside kernels and can only be used +if the kernel runs on a device that supports them. There are two query +mechanisms for device specific extensions: a compile time query and a run time +query. Applications can test whether an extension is "active" at compile time +by using its feature test macro. A device specific extension is active if the +compiled application could run on at least one device that provides it. The +set of active extensions may change depending on how the compiler is invoked. +For example, if the compiler is invoked with flags that disable code generation +for FPGA devices, then extensions specific to FPGA devices would not be active +and their feature test macros would not be defined. + +If an application kernel uses a device specific extension, the application must +also perform a run time query to ensure that the kernel is only submitted to a +device that supports the extension. The mechanism for doing this query is +defined below. If an application incorrectly submits a kernel to a device that +does not support an extension it uses, the implementation must raise an error. + +Note that the feature test macros are defined uniformly for all code in the +SYCL application. If an implementation uses SMCP (single-source multiple +compiler-passes), all compiler-passes must define the feature test macros the +same way. Thus, feature test macros do not provide a way for applications to +determine which features are supported on a device. Instead, applications must +use the run time query mechanism that is defined below. + +=== Device Aspects + +Applications don't directly query a device asking if it supports a particular +extension. Instead, applications query whether a device supports a particular +"aspect". The documentation for each device specific extension tells which +device aspect(s) are required, so an application can use the device aspects to +tell whether a device supports a certain extension. + +Various relationships between device aspects and extensions are possible. For +example, a single aspect like "fpga" could enable a whole set of extensions. +In other cases, there could be a 1:1 relationship between an extension and an +aspect, in which case the implementation could use the same name for both the +aspect and the extension (e.g. the aspect name "fancy" could correspond to an +extension that is also named "fancy"). Finally, some extensions could be +enabled with any of several device aspects. For example, an extension named +"fancier" could be supported on devices with aspect "cpu" or "host", but not on +other devices. + +Device aspects are defined as enumerated constants of type `sycl::aspect`. +Aspects can be defined in the core SYCL specification, in Khronos ratified +extensions, or in vendor specific extensions. Aspect names defined in +extensions are contained in the extension's namespace, however all aspects +have the same `sycl::aspect` type. Therefore, every aspect enumerated value in +an implementation must be unique. + +Core SYCL aspects are defined using a scoped enumeration like this, which also +defines the type `sycl::aspect`: + +[source,c++,NoName,linenums] +---- +namespace sycl { + +enum class aspect { + host, + cpu, + gpu + // etc. +}; + +} // namespace sycl +---- + +If a Khronos ratified extension defines an aspect, it is contained in the +`sycl::khr::aspect` namespace. One possible implementation is like so: + +[source,c++,NoName,linenums] +---- +namespace sycl { +namespace khr { +namespace aspect { + +static constexpr sycl::aspect foo = static_cast(1000); + +} // namespace aspect +} // namespace khr +} // namespace sycl +---- + +Likewise, if a vendor extension needs to define device aspects, it should +define them in the `sycl::ext::::aspect` namespace. A similar +implementation is possible: + +[source,c++,NoName,linenums] +---- +namespace sycl { +namespace ext { +namespace acme { +namespace aspect { + +static constexpr sycl::aspect bar = static_cast(-1); + +} // namespace aspect +} // namespace acme +} // namespace ext +} // namespace sycl +---- + +In the examples above, the vendor has decided to implement aspects from Khronos +ratified extensions starting at 1000 and to implement vendor specific aspects +as negative integers. However, these are implementation details. The SYCL +specification does not prescribe the numerical value of any aspect. + +=== Runtime Device Aspect Query + +Applications can query a device's aspects from host code at run time with the +new `has()` method on the `device` and `platform` classes: + +[source,c++,NoName,linenums] +---- +class device { + // ... + bool has(aspect asp) const; +}; + +class platform { + // ... + bool has(aspect asp) const; +}; +---- + +As with the old `has_extension()` method, `platform::has()` only returns true +if all devices in the platform have the given aspect. + +Applications can also get a list of all aspects supported by a device with +the new device information descriptor `info::device::aspects`: + +.Device Information Descriptor +|=== +|Device descriptor |Return type |Description + +|`info::device::aspects` +|`vector_class` +| Returns a `vector_class` of aspects supported by this SYCL device. + +|=== + +=== Device Aspect Traits + +Applications can also query device aspects at compile time via the +`sycl::is_aspect_active` trait. Performing such a query cannot tell you +anything about a specific device, so applications must still use the run time +query. Rather, the `sycl::is_aspect_active` trait tells whether any device in +the current compilation environment has the given aspect. As with the device +specific feature test macros, the set of active device aspects could be +affected by the way the compiler is invoked. For example, if the compiler is +invoked with flags that disable code generation for FPGA devices, aspects +specific to FPGA devices would no longer be active. + +In some ways, the feature test macros provide similar information to the device +aspect traits. We provide them both because the traits can be used in some +contexts where feature test macros are not as convenient (e.g. template +specialization). + +Following is one possible implementation for the device aspect traits. Notice +how the trait definitions can be split across headers so that the traits are +defined in the same header as their associated aspects. In the example below, +we assume that the compiler driver defines internal macros `_SYCL_CPU` and +`_SYCL_GPU` only when support for those devices is active. + +[source,c++,NoName,linenums] +---- +// In "SYCL/sycl.hpp" +namespace sycl { + +enum class aspect { + host, + cpu, + gpu + // etc. +}; + +template struct is_aspect_active : std::false_type {}; +template<> struct is_aspect_active : std::true_type {}; + +#ifdef _SYCL_CPU +template<> struct is_aspect_active : std::true_type {}; +#endif + +#ifdef _SYCL_GPU +template<> struct is_aspect_active : std::true_type {}; +#endif + +} // namespace sycl +---- + +[source,c++,NoName,linenums] +---- +// In "SYCL/ext/acme/extensions.hpp" +namespace sycl { +namespace ext { +namespace acme { +namespace aspect { + +static constexpr sycl::aspect bar = static_cast(-1); + +} // namespace aspect +} // namespace acme +} // namespace ext + +template<> struct is_aspect_active : std::true_type {}; + +} // namespace sycl +---- + +The trait `sycl::is_aspect_active::value` must be defined as either true +or false for all core SYCL aspects, all aspects from Khronos ratified +extensions, and all of a vendor's own extension aspects. + +Note also that the `aspect` traits must be defined uniformly for all code in +the SYCL application. If an implementation uses SMCP, all compiler-passes must +define the `aspect` traits the same way. Thus, the traits do not provide a way +for applications to determine which features are supported on a device. +Instead, applications must use the run time query mechanism for this. + +=== List of Core SYCL Aspects + +The following table lists the standard aspects that are defined in the core +SYCL specification. + +.Core SYCL Aspects +|=== +|Aspect Name |Meaning + +|`host` +|Tells if a device is the host device. + +|`cpu` +|Tells if a device is a CPU device. + +|`gpu` +|Tells if a device is a GPU device. + +|`accelerator` +|Tells if a device is an accelerator device. + +|=== + +=== Device Specific Extensions and Compilation Errors + +SYCL applications are allowed to contain kernels for disparate devices and +those kernels, of course, are allowed to use device specific extensions. +Applications are responsible for ensuring that kernels using such an extension +are never submitted to a device that does not support the extension and never +compiled for a device that does not support the extension (e.g. via +`program::compile_with_kernel_type()` or `program::build_with_kernel_type()`). +If an application fails to adhere to this requirement, the implementation is +required to report an error. + +[NOTE] +==== +If a vendor defines a compiler flag that causes some kernels to be pre-compiled +for some devices, the vendor is responsible for defining the semantics about +when errors are reported for kernels that use device specific extensions. +==== + +An implementation may not raise a spurious error as a result of speculative +compilation of a kernel for a device when the application did not specifically +ask to submit the kernel to that device or to compile that kernel for that +device. To clarify, consider the following example. An application with +kernels K1 and K2 runs on devices D1 and D2. Kernel K1 uses extensions +specific to D1, and kernel K2 uses extensions specific to D2. The application +is coded to ensure that K1 is only submitted to D1 and that K2 is only +submitted to D2. An implementation may not raise errors due to speculative +compilation of K1 for device D2 or for compilation of K2 for device D1. + +An implementation is allowed, however, to raise compilation errors for a kernel +that is not valid for any device. Therefore an implementation may raise +compilation errors for a kernel K that is invalid for all devices, even if the +application is coded such that kernel K is never submitted to any device. + +=== Redundant and Deprecated APIs + +With the addition of the `aspect` concept, the following SYCL 1.2.1 APIs are +now redundant, and we will consider deprecating them. + +* `platform::has_extension()` +* `device::has_extension()` +* `device::is_cpu()` +* `device::is_gpu()` +* `device::is_accelerator()` + +In addition, the following information descriptors are also redundant: + +* `info::device::extensions` +* `info::platform::extensions` + + +== Impact on the SYCL Generalization Proposal + +Since there is some overlap between this proposal and +https://github.com/KhronosGroup/SYCL-Shared/blob/master/proposals/sycl_generalization.md[ +Generalization of the SYCL specification], this proposal suggests the following +changes to "Generalization of the SYCL specification". + +* We suggest that the name of the `sycl::is_active` trait be changed to + `sycl::is_backend_active`. This helps distinguish that trait from + `sycl::is_aspect_active` which is defined in this proposal. + +* We suggest that references to extensions be removed from "Generalization of + the SYCL specification" since that is covered in this proposal. In + particular, the following statement should be removed: ++ +---- +The path SYCL/vendor/vendor_name/extension.hpp, and the namespace +sycl::vendor::vendor_name are reserved for vendor-specific extensions. +Generic SYCL extensions are exposed via individual headers in +SYCL/ext/ext_name.hpp. +---- + +* We suggest that all references to the OpenCL extension names (e.g. + `cl_khr_fp16`) be removed from the core SYCL specification and added to the + OpenCL backend document. The concept of OpenCL extensions could be exposed + as "aspects" (e.g. `dev.has(sycl::opencl::aspect::cl_khr_fp16)`), but the + aspect names should be defined in `SYCL/backend/opencl.hpp` and explained in + the OpenCL backend document. + +* We suggest that naming for vendor specific backends change to be in alignment + with the naming of vendor specific extensions that are defined in this + proposal. In particular, we think all vendor specific additions (both + extensions and backends) should be defined in a common namespace and their + headers should reside in a common subdirectory. Specific changes we suggest + are: + + - The header location for a vendor's backend interoperability API header + should be changed to `SYCL/ext//backend/.hpp`. + + - The namespace for a vendor's backend interoperability API should be changed + to `sycl::ext::::`. + + - The macro name for a vendor's backend should be renamed to + `SYCL_EXT__BACKEND_`, where `` and + `` are in all capitals. + +* We suggest that the name of the enumerated constant for a vendor specific + backend change to be in alignment with the style we propose for vendor + specific aspects. For example, backend names approved by the SYCL group + would still be defined like this: ++ +[source,c++,NoName,linenums] +---- +namespace sycl { + +enum class backend { + host, + opencl // When implementation provides OpenCL backend +}; + +} // namespace sycl +---- ++ +But vendor specific backend names would be defined like this: ++ +[source,c++,NoName,linenums] +---- +namespace sycl { +namespace ext { +namespace acme { +namespace backend { + +static constexpr sycl::backend foo = static_cast(-1); + +} // namespace backend +} // namespace acme +} // namespace ext +} // namespace sycl +---- + + +== Revision History + +[cols="5,15,15,70"] +[grid="rows"] +[options="header"] +|======================================== +|Rev|Date|Author|Changes +|1|2020-04-28|Greg Lueck|*Initial public working draft* +|======================================== + +//************************************************************************ +//Other formatting suggestions: +// +//* Use *bold* text for host APIs, or [source] syntax highlighting. +//* Use +mono+ text for device APIs, or [source] syntax highlighting. +//* Use +mono+ text for extension names, types, or enum values. +//* Use _italics_ for parameters. +//************************************************************************