diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 9aa8539299a31..afe6039c57552 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,7 +1,7 @@ # Contributing ## License -Intel Project for LLVM* technology is licensed under the terms of the +Intel Project for LLVM\* technology is licensed under the terms of the Apache-2.0 with LLVM-exception license ([LICENSE.txt](llvm/LICENSE.TXT)) to ensure our ability to contribute this project to the LLVM project under the same license. @@ -70,11 +70,11 @@ commit automatically with `git commit -s`. ### Development - Create a personal fork of the project on GitHub -- Use **sycl** branch as baseline for your changes + - For the DPC++ Compiler project, use **sycl** branch as baseline for your + changes. See [Get Started Guide](sycl/doc/GetStartedGuide.md). - Prepare your patch (follow [LLVM coding standards](https://llvm.org/docs/CodingStandards.html)) -- Build the project and run all tests (see -[GetStartedWithSYCLCompiler.md](sycl/doc/GetStartedWithSYCLCompiler.md)) +- Build the project and run all tests ### Review and acceptance testing @@ -94,4 +94,8 @@ Project maintainers merge pull requests using one of the following options: - [Squash and merge] Used when there are multiple commits in the PR - Squashing is done to make sure that the project is buildable on any commit - [Create a merge commit] Used for LLVM pull-down PRs to preserve hashes of the -commits pulled from the LLVM community repository \ No newline at end of file +commits pulled from the LLVM community repository + + +*Other names and brands may be claimed as the property of others. + diff --git a/README.md b/README.md index fdf6c8d663cfb..5abac7ca238d9 100644 --- a/README.md +++ b/README.md @@ -1,24 +1,28 @@ -# Intel Project for LLVM* technology +# Intel Project for LLVM\* technology ## Introduction Intel staging area for llvm.org contribution. 
Home for Intel LLVM-based projects: - - SYCL* Compiler and Runtimes - compiler and runtime libraries for SYCL ([https://www.khronos.org/sycl/](https://www.khronos.org/sycl/)). See **sycl** branch. + - oneAPI Data Parallel C++ compiler - see **sycl** branch. More information on + oneAPI and DPC++ is available at +([https://www.oneapi.com/](https://www.oneapi.com/)) ## License See [LICENSE.txt](sycl/LICENSE.TXT) for details. - ## Contributing See [CONTRIBUTING.md](CONTRIBUTING.md) for details. ## Sub-projects Documentation - - SYCL Compiler and Runtimes - See [GetStartedWithSYCLCompiler.md](sycl/doc/GetStartedWithSYCLCompiler.md) + - oneAPI Data Parallel C++ compiler - See + [GetStartedGuide.md](sycl/doc/GetStartedGuide.md) -*Other names and brands may be claimed as the property of others. +## DPC++ extensions -## SYCL Extension Proposal Documents +DPC++ is an open, cross-architecture language built upon the ISO C++ and Khronos +SYCL\* standards. DPC++ extends these standards with a number of extensions, +which can be found in [sycl/doc/extensions](sycl/doc/extensions) directory. -See [sycl/doc/extensions](sycl/doc/extensions) +\*Other names and brands may be claimed as the property of others. diff --git a/sycl/doc/SYCLCompilerAndRuntimeDesign.md b/sycl/doc/CompilerAndRuntimeDesign.md similarity index 94% rename from sycl/doc/SYCLCompilerAndRuntimeDesign.md rename to sycl/doc/CompilerAndRuntimeDesign.md index 701c8f60b978b..10148cab00ba0 100644 --- a/sycl/doc/SYCLCompilerAndRuntimeDesign.md +++ b/sycl/doc/CompilerAndRuntimeDesign.md @@ -1,18 +1,18 @@ -# SYCL\* Compiler and Runtime architecture design +# oneAPI DPC++ Compiler and Runtime architecture design ## Introduction -This document describes the architecture of the SYCL compiler and runtime -library. Base SYCL specification version is -[1.2.1](https://www.khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf). +This document describes the architecture of the DPC++ compiler and runtime +library. 
For DPC++ specification see +[spec](https://spec.oneapi.com/versions/latest/elements/dpcpp/source/index.html). -## SYCL Compiler architecture +## DPC++ Compiler architecture -SYCL application compilation flow: +DPC++ application compilation flow: -![High level component diagram for SYCL Compiler](Compiler-HLD.svg) +![High level component diagram for DPC++ Compiler](images/Compiler-HLD.svg) -SYCL compiler logically can be split into the host compiler and a number of +DPC++ compiler logically can be split into the host compiler and a number of device compilers—one per each supported target. Clang driver orchestrates the compilation process, it will invoke the device compiler once per each requested target, then it will invoke the host compiler to compile the host part of a @@ -31,7 +31,7 @@ applies additional restrictions on the device code (e.g. no exceptions or virtual calls), generates LLVM IR for the device code only and "integration header" which provides information like kernel name, parameters order and data type for the runtime library. -- **Middle-end** - transforms the initial LLVM IR* to get consumed by the +- **Middle-end** - transforms the initial LLVM IR to get consumed by the back-end. Today middle-end transformations include just a couple of passes: - Optionally: Address space inference pass - TBD: potentially the middle-end optimizer can run any LLVM IR @@ -78,7 +78,7 @@ Q.submit([&](handler& cgh) { ... ``` -In this example, the SYCL compiler needs to compile the lambda expression passed +In this example, the compiler needs to compile the lambda expression passed to the `cl::sycl::handler::parallel_for` method, as well as the function `foo` called from the lambda expression for the device. The compiler must also ignore the `bar` function when we compile the @@ -87,9 +87,9 @@ portion of the source code (the contents of the lambda expression passed to the `cl::sycl::handler::parallel_for` and any function called from this lambda expression). 
-The current approach is to use the SYCL kernel attribute in the SYCL runtime to +The current approach is to use the SYCL kernel attribute in the runtime to mark code passed to `cl::sycl::handler::parallel_for` as "kernel functions". -The SYCL runtime library can't mark foo as "device" code - this is a compiler +The runtime library can't mark foo as "device" code - this is a compiler job: to traverse all symbols accessible from kernel functions and add them to the "device part" of the code marking them with the new SYCL device attribute. @@ -160,8 +160,8 @@ must be passed to the clang driver: `-fsycl` -With this option specified, the driver will invoke the host SYCL compiler and a -number of device compilers for targets specified in the `-fsycl-targets` +With this option specified, the driver will invoke the host compiler and a +number of SYCL device compilers for targets specified in the `-fsycl-targets` option. If `-fsycl-targets` is not specified, then single SPIR-V target is assumed, and single device compiler for this target is invoked. @@ -188,7 +188,7 @@ a set of target architectures for which to compile device code. By default the compiler generates SPIR-V and OpenCL device JIT compiler produces native target binary. -There are existing options for OpenMP* offload: +There are existing options for OpenMP\* offload: `-fopenmp-targets=triple1,triple2` @@ -477,7 +477,7 @@ produced by OpenCL C front-end compiler. It's a regular function, which can conflict with user code produced from C++ source. 
-SYCL compiler uses modified solution developed for OpenCL C++ compiler +DPC++ compiler uses a modified solution developed for OpenCL C++ compiler prototype: - Compiler: https://github.com/KhronosGroup/SPIR/tree/spirv-1.1 @@ -546,17 +546,11 @@ compiler: ### Compiler/Runtime interface -## SYCL Runtime architecture +## DPC++ Runtime architecture *TBD* -## Supported extensions +## DPC++ Language extensions to SYCL -- [Intel subgroups](extensions/SubGroupNDRange/SubGroupNDRange.md) +The list of language extensions can be found at [extensions](extensions) -## Unsupported extensions/proposals - -- [Ordered queue](extensions/OrderedQueue/OrderedQueue.adoc) -- [Unified shared memory](extensions/USM/USM.adoc) - -\*Other names and brands may be claimed as the property of others. diff --git a/sycl/doc/SYCLEnvironmentVariables.md b/sycl/doc/EnvironmentVariables.md similarity index 78% rename from sycl/doc/SYCLEnvironmentVariables.md rename to sycl/doc/EnvironmentVariables.md index f7c538b27d856..02b0bd5074deb 100644 --- a/sycl/doc/SYCLEnvironmentVariables.md +++ b/sycl/doc/EnvironmentVariables.md @@ -1,32 +1,32 @@ # Overview -This file describes environment variables that are having effect on SYCL compiler and run-time. +This file describes environment variables that affect the DPC++ compiler and runtime. -# Controlling SYCL RT +# Controlling DPC++ RT **Warning:** the environment variables described in this document are used for -development and debugging of SYCL runtime and compiler. Their semantics are +development and debugging of DPC++ compiler and runtime. Their semantics are subject to change. Do not rely on these variables in production code. | Environment variable | Values | Description | | -------------------- | ------ | ----------- | -| SYCL_PI_TRACE | Any(*) | Force tracing of PI calls to stderr. | +| SYCL_PI_TRACE | Any(\*) | Force tracing of PI calls to stderr. | | SYCL_BE | PI_OPENCL, PI_OTHER | When SYCL RT is built with PI this controls which plugin to use. 
Default value is PI_OPENCL. | | SYCL_DEVICE_TYPE | One of: CPU, GPU, ACC, HOST | Force SYCL to use the specified device type. If unset, default selection rules are applied. If set to any unlisted value, this control has no effect. If the requested device type is not found, a `cl::sycl::runtime_error` exception is thrown. If a non-default device selector is used, a device must satisfy both the selector and this control to be chosen. This control only has effect on devices created with a selector. | | SYCL_PROGRAM_COMPILE_OPTIONS | String of valid OpenCL compile options | Override compile options for all programs. | | SYCL_PROGRAM_LINK_OPTIONS | String of valid OpenCL link options | Override link options for all programs. | | SYCL_USE_KERNEL_SPV | Path to the SPIR-V binary | Load device image from the specified file. If runtime is unable to read the file, `cl::sycl::runtime_error` exception is thrown.| -| SYCL_DUMP_IMAGES | Any(*) | Dump device image binaries to file. Control has no effect if SYCL_USE_KERNEL_SPV is set. | +| SYCL_DUMP_IMAGES | Any(\*) | Dump device image binaries to file. Control has no effect if SYCL_USE_KERNEL_SPV is set. | | SYCL_PRINT_EXECUTION_GRAPH | Described [below](#sycl_print_execution_graph-options) | Print execution graph to DOT text file. | -| SYCL_DISABLE_EXECUTION_GRAPH_CLEANUP | Any(*) | Disable cleanup of finished command nodes at host-device synchronization points. | -| SYCL_THROW_ON_BLOCK | Any(*) | Throw an exception on attempt to wait for a blocked command. | +| SYCL_DISABLE_EXECUTION_GRAPH_CLEANUP | Any(\*) | Disable cleanup of finished command nodes at host-device synchronization points. | +| SYCL_THROW_ON_BLOCK | Any(\*) | Throw an exception on attempt to wait for a blocked command. | | SYCL_DEVICELIB_INHIBIT_NATIVE | String of device library extensions (separated by a whitespace) | Do not rely on device native support for devicelib extensions listed in this option. 
| -| SYCL_DEVICE_ALLOWLIST | A list of devices and their minimum driver version following the pattern: DeviceName:{{XXX}},DriverVersion:{{X.Y.Z.W}}. Also may contain PlatformName and PlatformVersion | Filter out devices that do not match the pattern specified. Regular expression can be passed and the SYCL RT will select only those devices which satisfy the regex. | +| SYCL_DEVICE_ALLOWLIST | A list of devices and their minimum driver version following the pattern: DeviceName:{{XXX}},DriverVersion:{{X.Y.Z.W}}. Also may contain PlatformName and PlatformVersion | Filter out devices that do not match the pattern specified. Regular expression can be passed and the DPC++ runtime will select only those devices which satisfy the regex. | `(*) Note: Any means this environment variable is effective when set to any non-null value.` ## SYCL_PRINT_EXECUTION_GRAPH Options -SYCL_PRINT_EXECUTION_GRAPH can accept one or more comma separated values from table below +SYCL_PRINT_EXECUTION_GRAPH can accept one or more comma separated values from the table below | Option | Description | | ------ | ----------- | diff --git a/sycl/doc/FAQ.md b/sycl/doc/FAQ.md index 7c4d44452cbb1..6fc5c71213e93 100644 --- a/sycl/doc/FAQ.md +++ b/sycl/doc/FAQ.md @@ -2,51 +2,51 @@ **Table of contents** -1. [Developing with SYCL](#developing-with-sycl) -1. [Using applications built with SYCL](#using-applications-built-with-sycl) +1. [Developing with DPC++](#developing-with-dpc) +1. [Using applications built with DPC++](#using-applications-built-with-dpc) 1. [Common issues](#common-issues) 1. [Device specific questions and issues](#device-specific-questions-and-issues) -## Developing with SYCL +## Developing with DPC++ -### Q: What do I need to start developing with SYCL? -**A:** To get the full SYCL experience you need a SYCL-capable compiler. Intel -SYCL compiler provides you with both host and device side compilation. Another +### Q: What do I need to start developing with DPC++? 
+**A:** To get the full DPC++ experience you need oneAPI DPC++ compiler. DPC++ +compiler provides you with both host and device side compilation. Another requirement for code offloading to specialized devices is a compatible OpenCL -runtime. Our [Get Started Guide](GetStartedWithSYCLCompiler.md) will help you -set up a proper environment. To learn more about using the SYCL compiler, please -refer to [User Manual](SYCLCompilerUserManual.md). If using a special compiler +runtime. Our [Get Started Guide](GetStartedGuide.md) will help you +set up a proper environment. To learn more about using the DPC++ compiler, +please refer to [Users Manual](UsersManual.md). If using a special compiler is not an option for you and/or you would like to experiment without offloading -code to non-host devices, you can exploit SYCL's host device feature. This gives -you the ability to use any C++11 compiler. You will need to link your -application with the SYCL Runtime library and provide a path to the SYCL headers -directory. Please, refer to your compiler manual to learn about specific build -options. +code to non-host devices, you can exploit SYCL's host device feature. This +gives you the ability to use any C++11 compiler. You will need to link your +application with the DPC++ Runtime library and provide a path to the SYCL +headers directory. Please, refer to your compiler manual to learn about +specific build options. -### Q: How are SYCL compilation phases different from those of a usual C++ compiler? Can I customize this flow for my applications? +### Q: How are DPC++ compilation phases different from those of a usual C++ compiler? Can I customize this flow for my applications? **A:** Due to the fact that both host and device code need to be compiled and -linked into the final binary, the compilation steps sequence is more complicated -compared to the usual C++ flow. +linked into the final binary, the compilation steps sequence is more +complicated compared to the usual C++ flow. 
-In general, we encourage our users to rely on the SYCL Compiler for handling all -of the compilation phases "under the hood". However, thorough understanding of -the above-described steps may allow you to customize your compilation by invoking -different phases manually. As an example, you could: +In general, we encourage our users to rely on the DPC++ Compiler for handling +all of the compilation phases "under the hood". However, thorough understanding +of the above-described steps may allow you to customize your compilation by +invoking different phases manually. As an example, you could: 1. preprocess your host code with another C++-capable compiler; -2. turn to the SYCL compiler for generating the integration header and compiling -the device code for the needed target(s); +2. turn to the DPC++ compiler for generating the integration header and +compiling the device code for the needed target(s); 3. use your preferred host compiler from 1) to compile your preprocessed host code and the integration header into a host object file; 4. link the host object file and the device image(s) into the final executable. -To learn more about the concepts behind this flow, and the SYCL Compiler +To learn more about the concepts behind this flow, and the DPC++ Compiler internals as such, we welcome you to study our -[SYCL Compiler and Runtime architecture design](SYCLCompilerAndRuntimeDesign.md) +[DPC++ Compiler and Runtime architecture design](CompilerAndRuntimeDesign.md) document. -## Using applications built with SYCL +## Using applications built with DPC++ ### Q: What happens if I run my application on a machine without OpenCL? **A:** If you use the default SYCL device selector (or any other selector that @@ -56,7 +56,7 @@ Otherwise, an exception will be thrown. ## Common issues -### Q: SYCL application complains about missing libsycl.so (or sycl.dll) library. +### Q: DPC++ application complains about missing libsycl.so (or sycl.dll) library. 
Linux: ``` $ ./app @@ -66,15 +66,16 @@ Windows: ![Error screen](images/missing_sycl_dll.png) -*The code execution cannot proceed because sycl.dll was not found. Reinstalling the program may fix this problem.* +*The code execution cannot proceed because sycl.dll was not found. Reinstalling +the program may fix this problem.* -**A:** The SYCL Runtime library is required to run SYCL-enabled applications. +**A:** The DPC++ Runtime library is required to run DPC++ applications. While compiler driver is able to find the library and link against it, your -operating system may struggle. Make sure that the location of the SYCL Runtime +operating system may struggle. Make sure that the location of the DPC++ Runtime library is listed in the correct environment variable: `LD_LIBRARY_PATH` (for Linux) or `LIB` (for Windows). -### Q: SYCL fails to compile device code that uses STD functions. +### Q: DPC++ Compiler fails to compile device code that uses STD functions. Example error message: ``` In file included from example.cpp:1: @@ -119,10 +120,10 @@ specification. ## Device specific questions and issues -### Q: What devices are supported by Intel SYCL compiler? -**A:** By design, SYCL is closely connected to OpenCL, which is used to offload -code to specialized devices. Intel SYCL compiler currently makes use of SPIR-V, -a portable intermediate representation format. It is a core feature of +### Q: What devices are supported by DPC++ compiler? +**A:** By design, DPC++ and SYCL are closely connected to OpenCL, which is used +to offload code to specialized devices. DPC++ compiler currently makes use of +SPIR-V, a portable intermediate representation format. It is a core feature of OpenCL 2.1, so any device, capable of OpenCL 2.1, should be supported. Otherwise, your OpenCL device must support `cl_khr_il_program` extension. @@ -132,17 +133,17 @@ the offload target for kernel execution. 
Since the device code is also compiled for the host CPU and no JIT is required, you can easily use any classic C++ debugging tools of your choice for the host device code. -Furthermore, developers can extend capabilities of the SYCL Runtime to -non-OpenCL devices by writing correspondent plugins. To learn more, please check -out our [Plugin Interface Guide](SYCLPluginInterface.md). +Furthermore, developers can extend capabilities of the DPC++ Runtime to +non-OpenCL devices by writing correspondent plugins. To learn more, please +check out our [Plugin Interface Guide](PluginInterface.md). -### Q: SYCL applications hang on Intel GPUs while working well on other devices +### Q: DPC++ applications hang on Intel GPUs while working well on other devices **A:** One of the common reasons is Intel GPUs feature called "hang check". If your workload runs for more than a certain amount of time, it will be killed -by hardware. From the application point of view this looks like a hang. To allow -heavy kernels to be executed, disable hang check. **Please, note that other apps -on your system may contain bugs, and disabling "hang check" may lead to real -hangs.** +by hardware. From the application point of view this looks like a hang. To +allow heavy kernels to be executed, disable hang check. **Please, note that +other apps on your system may contain bugs, and disabling "hang check" may lead +to real hangs.** You can find out more about hang check and how to disable it on [this page](https://software.intel.com/en-us/articles/installation-guide-for-intel-oneapi-toolkits). 
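A note on the `libsycl.so` question in the FAQ above: the answer asks that the runtime library location be listed in `LD_LIBRARY_PATH` on Linux. A minimal sketch of how a setup script might check for that before prepending, assuming a POSIX-compatible shell. The helper names `path_contains` and `ensure_lib_path` are hypothetical and are not part of the DPC++ toolchain.

```bash
# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated search path $2 (exact match on a whole entry).
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: prepend directory $1 to LD_LIBRARY_PATH,
# unless it is already listed there.
ensure_lib_path() {
  if ! path_contains "$1" "${LD_LIBRARY_PATH:-}"; then
    LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
    export LD_LIBRARY_PATH
  fi
}
```

A script would call `ensure_lib_path` with the directory that holds `libsycl.so`. Matching on whole entries avoids treating `/usr` as present just because `/usr/lib` is listed, and the guard keeps repeated calls from growing the variable.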
diff --git a/sycl/doc/GetStartedWithSYCLCompiler.md b/sycl/doc/GetStartedGuide.md similarity index 72% rename from sycl/doc/GetStartedWithSYCLCompiler.md rename to sycl/doc/GetStartedGuide.md index 4b240eeabcac7..6a60594bbee24 100644 --- a/sycl/doc/GetStartedWithSYCLCompiler.md +++ b/sycl/doc/GetStartedGuide.md @@ -1,20 +1,19 @@ # Overview -The SYCL* Compiler compiles C++\-based SYCL source files with code for both CPU -and a wide range of compute accelerators. The compiler uses Khronos* -OpenCL™ API to offload computations to accelerators. +The DPC++ Compiler compiles C++ and SYCL\* source files with code for both CPU +and a wide range of compute accelerators such as GPU and FPGA. # Table of contents * [Prerequisites](#prerequisites) - * [Create SYCL workspace](#create-sycl-workspace) -* [Build SYCL toolchain](#build-sycl-toolchain) - * [Build SYCL toolchain with libc++ library](#build-sycl-toolchain-with-libc-library) - * [Build SYCL toolchain with support for NVIDIA CUDA](#build-sycl-toolchain-with-support-for-nvidia-cuda) -* [Use SYCL toolchain](#use-sycl-toolchain) + * [Create DPC++ workspace](#create-dpc-workspace) +* [Build DPC++ toolchain](#build-dpc-toolchain) + * [Build DPC++ toolchain with libc++ library](#build-dpc-toolchain-with-libc-library) + * [Build DPC++ toolchain with support for NVIDIA CUDA](#build-dpc-toolchain-with-support-for-nvidia-cuda) +* [Use DPC++ toolchain](#use-dpc-toolchain) * [Install low level runtime](#install-low-level-runtime) - * [Test SYCL toolchain](#test-sycl-toolchain) - * [Run simple SYCL application](#run-simple-sycl-application) + * [Test DPC++ toolchain](#test-dpc-toolchain) + * [Run simple DPC++ application](#run-simple-dpc-application) * [C++ standard](#c-standard) * [Known Issues and Limitations](#known-issues-and-limitations) * [CUDA backend limitations](#cuda-backend-limitations) @@ -31,82 +30,85 @@ OpenCL™ API to offload computations to accelerators. 
* Windows: `Visual Studio` version 15.7 preview 4 or later - https://visualstudio.microsoft.com/downloads/ -## Create SYCL workspace +## Create DPC++ workspace -Throughout this document `SYCL_HOME` denotes the path to the local directory -created as SYCL workspace. It might be useful to create an environment variable +Throughout this document `DPCPP_HOME` denotes the path to the local directory +created as DPC++ workspace. It might be useful to create an environment variable with the same name. **Linux** ```bash -export SYCL_HOME=/export/home/sycl_workspace -mkdir $SYCL_HOME +export DPCPP_HOME=/export/home/sycl_workspace +mkdir $DPCPP_HOME ``` **Windows (64-bit)** Open a developer command prompt using one of two methods: -- Click start menu and search for "**x64** Native Tools Command Prompt for VS XXXX", where - XXXX is a version of installed Visual Studio. +- Click start menu and search for "**x64** Native Tools Command Prompt for VS + XXXX", where XXXX is a version of installed Visual Studio. 
- Ctrl-R, write "cmd", click enter, then run `"C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvarsall.bat" x64` ```bat -set SYCL_HOME=%USERPROFILE%\sycl_workspace -mkdir %SYCL_HOME% +set DPCPP_HOME=%USERPROFILE%\sycl_workspace +mkdir %DPCPP_HOME% ``` -# Build SYCL toolchain +# Build DPC++ toolchain **Linux** ```bash -cd $SYCL_HOME +cd $DPCPP_HOME git clone https://github.com/intel/llvm -b sycl -mkdir $SYCL_HOME/build -cd $SYCL_HOME/build +mkdir $DPCPP_HOME/build +cd $DPCPP_HOME/build cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD="X86" \ -DLLVM_EXTERNAL_PROJECTS="llvm-spirv;sycl" \ -DLLVM_ENABLE_PROJECTS="clang;llvm-spirv;sycl" \ --DLLVM_EXTERNAL_SYCL_SOURCE_DIR=$SYCL_HOME/llvm/sycl \ --DLLVM_EXTERNAL_LLVM_SPIRV_SOURCE_DIR=$SYCL_HOME/llvm/llvm-spirv \ -$SYCL_HOME/llvm/llvm +-DLLVM_EXTERNAL_SYCL_SOURCE_DIR=$DPCPP_HOME/llvm/sycl \ +-DLLVM_EXTERNAL_LLVM_SPIRV_SOURCE_DIR=$DPCPP_HOME/llvm/llvm-spirv \ +$DPCPP_HOME/llvm/llvm make -j`nproc` sycl-toolchain ``` **Windows (64-bit)** ```bat -cd %SYCL_HOME% +cd %DPCPP_HOME% git clone https://github.com/intel/llvm -b sycl -mkdir %SYCL_HOME%\build -cd %SYCL_HOME%\build +mkdir %DPCPP_HOME%\build +cd %DPCPP_HOME%\build cmake -G "Ninja" -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD="X86" ^ -DLLVM_EXTERNAL_PROJECTS="llvm-spirv;sycl" ^ -DLLVM_ENABLE_PROJECTS="clang;llvm-spirv;sycl" ^ --DLLVM_EXTERNAL_SYCL_SOURCE_DIR="%SYCL_HOME%\llvm\sycl" ^ --DLLVM_EXTERNAL_LLVM_SPIRV_SOURCE_DIR="%SYCL_HOME%\llvm\llvm-spirv" ^ +-DLLVM_EXTERNAL_SYCL_SOURCE_DIR="%DPCPP_HOME%\llvm\sycl" ^ +-DLLVM_EXTERNAL_LLVM_SPIRV_SOURCE_DIR="%DPCPP_HOME%\llvm\llvm-spirv" ^ -DCMAKE_C_COMPILER=cl -DCMAKE_CXX_COMPILER=cl ^ -"%SYCL_HOME%\llvm\llvm" +"%DPCPP_HOME%\llvm\llvm" ninja sycl-toolchain ``` -To use ahead-of-time compilation for the Intel® processors, additionally build opencl-aot target: - -1. add ```opencl-aot``` to ```-DLLVM_EXTERNAL_PROJECTS``` and ```-DLLVM_ENABLE_PROJECTS``` variables above -2. 
add ```opencl-aot``` to ```make``` (for Linux) or ```ninja``` (for Windows) commands above +To use ahead-of-time compilation for the Intel® processors, additionally +build opencl-aot target: + +1. add ```opencl-aot``` to ```-DLLVM_EXTERNAL_PROJECTS``` and +```-DLLVM_ENABLE_PROJECTS``` variables above +2. add ```opencl-aot``` to +```make``` (for Linux) or ```ninja``` (for Windows) commands above For more, see [opencl-aot documentation](../../opencl-aot/README.md). -TODO: add instructions how to deploy built SYCL toolchain. +TODO: add instructions how to deploy built DPC++ toolchain. -## Build SYCL toolchain with libc++ library +## Build DPC++ toolchain with libc++ library -There is experimental support for building and linking SYCL runtime with +There is experimental support for building and linking DPC++ runtime with libc++ library instead of libstdc++. To enable it the following CMake options should be used. @@ -117,41 +119,41 @@ should be used. -DSYCL_LIBCXX_LIBRARY_PATH= ``` -## Build SYCL toolchain with support for NVIDIA CUDA +## Build DPC++ toolchain with support for NVIDIA CUDA -There is experimental support for SYCL for CUDA devices. +There is experimental support for DPC++ for CUDA devices. -To enable support for CUDA devices, the following arguments need to be added to -the CMake command when building the SYCL compiler. +To enable support for CUDA devices, the following arguments need to be added to +the CMake command when building the DPC++ compiler. 
``` -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda/ \ --DLLVM_ENABLE_PROJECTS="clang;llvm-spirv;sycl;libclc"\ --DSYCL_BUILD_PI_CUDA=ON\ --DLLVM_TARGETS_TO_BUILD="X86;NVPTX"\ +-DLLVM_ENABLE_PROJECTS="clang;llvm-spirv;sycl;libclc" \ +-DSYCL_BUILD_PI_CUDA=ON \ +-DLLVM_TARGETS_TO_BUILD="X86;NVPTX" \ -DLIBCLC_TARGETS_TO_BUILD="nvptx64--;nvptx64--nvidiacl" ``` -Enabling this flag requires an installation of +Enabling this flag requires an installation of [CUDA 10.1](https://developer.nvidia.com/cuda-10.1-download-archive-update2) on the system, -refer to +refer to [NVIDIA CUDA Installation Guide for Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html). Currently, the only combination tested is Ubuntu 18.04 with CUDA 10.2 using a Titan RTX GPU (SM 71), but it should work on any GPU compatible with SM 50 or above. -# Use SYCL toolchain +# Use DPC++ toolchain ## Install low level runtime -To run SYCL applications on OpenCL devices, OpenCL implementation(s) must be +To run DPC++ applications on OpenCL devices, OpenCL implementation(s) must be present in the system. Please, refer to [the Release Notes](../ReleaseNotes.md) for recommended Intel runtime versions. -The `GPU` runtime that is needed to run SYCL application on Intel `GPU` devices +The `GPU` runtime that is needed to run DPC++ application on Intel `GPU` devices can be downloaded from the following web pages: * Linux: [Intel® Graphics Compute Runtime for @@ -163,8 +165,8 @@ can be downloaded from the following web pages: To install Intel `CPU` runtime for OpenCL devices the corresponding runtime asset/archive should be downloaded from -[SYCL Compiler and Runtime updates](../ReleaseNotes.md) and installed using -the following procedure. +[DPC++ Compiler and Runtime updates](../ReleaseNotes.md) and installed following +procedure below. 
Intel `CPU` runtime for OpenCL depends on Threading Building Blocks library which should be downloaded from [Threading Building Blocks (TBB) @@ -202,7 +204,8 @@ cd /opt/intel/tbb_ tar -zxvf tbb*lin.tgz ``` -4) Copy files from or create symbolic links to TBB libraries in OpenCL RT folder: +4) Copy files from or create symbolic links to TBB libraries in OpenCL RT +folder: ```bash ln -s /opt/intel/tbb_/tbb/lib/intel64/gcc4.8/libtbb.so /opt/intel/oclcpuexp/x64/libtbb.so @@ -216,7 +219,7 @@ ln -s /opt/intel/tbb_/tbb/lib/intel64/gcc4.8/libtbbmalloc.so.2 5) Configure library paths ```bash -echo /opt/intel/oclcpuexp_/x64 > +echo /opt/intel/oclcpuexp_/x64 > /etc/ld.so.conf.d/libintelopenclexp.conf ldconfig -f /etc/ld.so.conf.d/libintelopenclexp.conf ``` @@ -225,8 +228,8 @@ ldconfig -f /etc/ld.so.conf.d/libintelopenclexp.conf installing `CPU` runtime as `GPU` runtime installer may re-write some important files or settings and make existing `CPU` runtime not working properly. -2) Extract the archive to some folder. For example, to `c:\oclcpu_rt_` -and `c:\tbb_`. +2) Extract the archive to some folder. For example, to +`c:\oclcpu_rt_` and `c:\tbb_`. 3) Run `Command Prompt` as `Administrator`. To do that click `Start` button, type `Command Prompt`, click the Right mouse button on it, then click @@ -240,11 +243,11 @@ command: c:\oclcpu_rt_\install.bat c:\tbb_\tbb\bin\intel64\vc14 ``` -## Test SYCL toolchain +## Test DPC++ toolchain ### Run regression tests -To verify that built SYCL toolchain is working correctly, run: +To verify that built DPC++ toolchain is working correctly, run: **Linux** ```bash @@ -259,27 +262,28 @@ ninja check-all If no OpenCL GPU/CPU runtimes are available, the corresponding tests are skipped. -### Run Khronos SYCL conformance test suite (optional) +### Run Khronos\* SYCL\* conformance test suite (optional) -Khronos SYCL conformance test suite (CTS) is intended to validate SYCL -implementation conformance to Khronos SYCL specification. 
+Khronos\* SYCL\* conformance test suite (CTS) is intended to validate +implementation conformance to Khronos\* SYCL\* specification. DPC++ compiler is +expected to pass a significant number of tests, and it keeps improving. -Follow Khronos SYCL-CTS instructions from +Follow Khronos\* SYCL\* CTS instructions from [README](https://github.com/KhronosGroup/SYCL-CTS#sycl-121-conformance-test-suite) file to obtain test sources and instructions how build and execute the tests. -To configure testing of "Intel SYCL" toochain set +To configure testing of DPC++ toolchain set `SYCL_IMPLEMENTATION=Intel_SYCL` and `Intel_SYCL_ROOT=` CMake variables. **Linux** ```bash -cmake -DIntel_SYCL_ROOT=$SYCL_HOME/deploy -DSYCL_IMPLEMENTATION=Intel_SYCL ... +cmake -DIntel_SYCL_ROOT=$DPCPP_HOME/deploy -DSYCL_IMPLEMENTATION=Intel_SYCL ... ``` **Windows (64-bit)** ```bat -cmake -DIntel_SYCL_ROOT=%SYCL_HOME%\deploy -DSYCL_IMPLEMENTATION=Intel_SYCL ... +cmake -DIntel_SYCL_ROOT=%DPCPP_HOME%\deploy -DSYCL_IMPLEMENTATION=Intel_SYCL ... ``` ### Build Doxygen documentation @@ -298,9 +302,9 @@ command: After CMake cache is generated, build the documentation with `doxygen-sycl` target. It will be put to `/path/to/build/tools/sycl/doc/html` directory. -## Run simple SYCL application +## Run simple DPC++ application -A simple SYCL program consists of following parts: +A simple DPC++ or SYCL\* program consists of the following parts: 1. Header section 2. Allocating buffer for data 3. Creating SYCL queue @@ -309,7 +313,7 @@ A simple SYCL program consists of following parts: 6. Use buffer accessor to retrieve the result on the device and verify the data 7. 
The end -Creating a file `simple-sycl-app.cpp` with the following C++ SYCL code in it: +Creating a file `simple-sycl-app.cpp` with the following C++/SYCL code: ```c++ #include @@ -364,14 +368,14 @@ To build simple-sycl-app put `bin` and `lib` to PATHs: **Linux** ```bash -export PATH=$SYCL_HOME/build/bin:$PATH -export LD_LIBRARY_PATH=$SYCL_HOME/build/lib:$LD_LIBRARY_PATH +export PATH=$DPCPP_HOME/build/bin:$PATH +export LD_LIBRARY_PATH=$DPCPP_HOME/build/lib:$LD_LIBRARY_PATH ``` **Windows (64-bit)** ```bat -set PATH=%SYCL_HOME%\build\bin;%PATH% -set LIB=%SYCL_HOME%\build\lib;%LIB% +set PATH=%DPCPP_HOME%\build\bin;%PATH% +set LIB=%DPCPP_HOME%\build\lib;%LIB% ``` and run following command: @@ -400,14 +404,14 @@ if clang was built with the cmake option `SYCL_BUILD_PI_CUDA=ON`. The results are correct! ``` **Note**: -Currently, when the application has been built with the CUDA target, the CUDA backend -must be selected at runtime using the `SYCL_BE` environment variable. +Currently, when the application has been built with the CUDA target, the CUDA +backend must be selected at runtime using the `SYCL_BE` environment variable. ```bash SYCL_BE=PI_CUDA ./simple-sycl-app-cuda.exe ``` -NOTE: SYCL developer can specify SYCL device for execution using device +NOTE: DPC++/SYCL developer can specify SYCL device for execution using device selectors (e.g. `cl::sycl::cpu_selector`, `cl::sycl::gpu_selector`, [Intel FPGA selector(s)](extensions/IntelFPGA/FPGASelector.md)) as explained in following section [Code the program for a specific @@ -479,26 +483,34 @@ class CUDASelector : public cl::sycl::device_selector { # C++ standard -- Minimally support C++ standard is c++11 on Linux and c++14 on Windows. +- Minimal supported C++ standard is C++11 on Linux and C++14 on Windows. # Known Issues and Limitations -- SYCL device compiler fails if the same kernel was used in different +- DPC++ device compiler fails if the same kernel was used in different translation units. 
- SYCL host device is not fully supported. - 32-bit host/target is not supported. -- SYCL works only with OpenCL implementations supporting out-of-order queues. -- On Windows linking SYCL applications with `/MTd` flag is known to cause crashes. +- DPC++ works only with OpenCL low level runtimes which support out-of-order + queues. +- On Windows linking DPC++ applications with `/MTd` flag is known to cause + crashes. ## CUDA back-end limitations -- Backend is only supported on Linux -- The only combination tested is Ubuntu 18.04 with CUDA 10.2 using -a Titan RTX GPU (SM 71), but it should work on any GPU compatible with SM 50 or -above -- The NVIDIA OpenCL headers conflict with the OpenCL headers required for this project -and may cause compilation issues on some platforms +- Backend is only supported on Linux +- The only combination tested is Ubuntu 18.04 with CUDA 10.2 using a Titan RTX + GPU (SM 71), but it should work on any GPU compatible with SM 50 or above +- The NVIDIA OpenCL headers conflict with the OpenCL headers required for this + project and may cause compilation issues on some platforms # Find More -SYCL 1.2.1 specification: [www.khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf](https://www.khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf) +DPC++ specification: +[https://spec.oneapi.com/versions/latest/elements/dpcpp/source/index.html](https://spec.oneapi.com/versions/latest/elements/dpcpp/source/index.html) +SYCL\* 1.2.1 specification: +[www.khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf](https://www.khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf) + + +\*Other names and brands may be claimed as the property of others. 
+ diff --git a/sycl/doc/SYCLPluginInterface.md b/sycl/doc/PluginInterface.md similarity index 81% rename from sycl/doc/SYCLPluginInterface.md rename to sycl/doc/PluginInterface.md index 8b0264a2d3e13..4c0f18982ea5b 100644 --- a/sycl/doc/SYCLPluginInterface.md +++ b/sycl/doc/PluginInterface.md @@ -1,24 +1,24 @@ -# The SYCL Runtime Plugin Interface. +# The DPC++ Runtime Plugin Interface. ## Overview -The SYCL Runtime Plugin Interface (PI) is the interface layer between -device-agnostic part of the SYCL runtime and the device-specific runtime layers +The DPC++ Runtime Plugin Interface (PI) is the interface layer between +device-agnostic part of the DPC++ runtime and the device-specific runtime layers which control execution on devices. It employs the “plugin” mechanism to bind to the device specific runtime layers similarly to what is used by libomptarget or OpenCL. -The picture below illustrates the placement of the PI within the overall SYCL +The picture below illustrates the placement of the PI within the overall DPC++ runtime stack. Dotted lines show components or paths which are not yet available in the runtime, but are likely to be developed. -![PI in SYCL runtime architecture](images/SYCL_RT_arch.svg) +![PI in DPC++ runtime architecture](images/RuntimeArchitecture.svg) The plugin interface and the discovery process behind it allows to dynamically plug in implementations based on OpenCL and “native” runtime for a particular device – such as OpenCL for FPGA devices or native runtimes for GPUs. Implementations of the PI are “plugins” - dynamic libraries or shared objects which expose a number of entry -points implementing the PI interface. The SYCL runtime collects those function +points implementing the PI interface. The DPC++ runtime collects those function pointers into a PI interface dispatch table - one per plugin - and uses this table to dispatch to the device(s) covered by the corresponding plugin. @@ -33,7 +33,7 @@ management. 
## Discovery and linkage of PI implementations -![PI implementation discovery](images/SYCL_plugin_discovery.svg) +![PI implementation discovery](images/PluginDiscovery.svg) Device discovery phase enumerates all available devices and their features by querying underlying plugins found in the system. This process is only performed @@ -42,15 +42,15 @@ once before any actual offload is attempted. ### Plugin discovery Plugins are physically dynamic libraries stored somewhere in the system where -the SYCL runtime runs. TBD - design and describe the process in details. +the DPC++ runtime runs. TBD - design and describe the process in details. #### Plugin binary interface TBD - list and describe all the symbols plugin must export in order to be picked -up by the SYCL runtime for offload. +up by the DPC++ runtime for offload. #### OpenCL plugin -OpenCL plugin is a usual plugin from SYCL runtime standpoint, but its loading +OpenCL plugin is a usual plugin from DPC++ runtime standpoint, but its loading and initialization involves a nested discovery process which finds out available OpenCL implementations. They can be installed either in the standard Khronos ICD-compatible way (e.g. listed in files under /etc/OpenCL/vendors on @@ -66,7 +66,7 @@ TBD ## PI API Specification PI interface is logically divided into few subsets: -- **Core API** which must be implemented by all plugins for SYCL runtime to be +- **Core API** which must be implemented by all plugins for DPC++ runtime to be able to operate on the corresponding device. The core API further breaks down into - **OpenCL-based** APIs which have OpenCL origin and semantics @@ -113,10 +113,11 @@ in a data section. ### The Interoperability PI APIs -These are APIs needed to implement SYCL runtime interoperability with underlying -"native" device runtimes such as OpenCL. Currently there are only OpenCL -interoperability APIs, which is to be implemented by the OpenCL PI plugin only. 
-These APIs match semantics of the corresponding OpenCL APIs exactly. +These are APIs needed to implement DPC++ runtime interoperability with +underlying "native" device runtimes such as OpenCL. Currently there are only +OpenCL interoperability APIs, which is to be implemented by the OpenCL PI +plugin only. These APIs match semantics of the corresponding OpenCL APIs +exactly. For example: ``` @@ -130,7 +131,7 @@ pi_result piclProgramCreateWithSource( ### PI Extension mechanism -TBD This section describes a mechanism for SYCL or other runtimes to detect +TBD This section describes a mechanism for DPC++ or other runtimes to detect availability of and obtain interfaces beyond those defined by the PI dispatch. TBD Add API to query PI version supported by plugin at runtime. diff --git a/sycl/doc/SYCLCompilerUserManual.md b/sycl/doc/UsersManual.md similarity index 98% rename from sycl/doc/SYCLCompilerUserManual.md rename to sycl/doc/UsersManual.md index cc48fb6661468..d1a9dfe073b11 100644 --- a/sycl/doc/SYCLCompilerUserManual.md +++ b/sycl/doc/UsersManual.md @@ -1,6 +1,6 @@ # Overview -The SYCL* Compiler contains many options to generate the desired binaries for +The DPC++ Compiler contains many options to generate the desired binaries for your application. ## SYCL specific command line options @@ -185,7 +185,7 @@ $ clang++ -fsycl-device-only -fno-sycl-use-bitcode sycl-app.cpp -o sycl-app.spv # Static archives with SYCL device code -The SYCL Compiler contains support to create and use static archives that +The DPC++ Compiler contains support to create and use static archives that contain device enabled fat objects. 
## Build your objects diff --git a/sycl/doc/Compiler-HLD.svg b/sycl/doc/images/Compiler-HLD.svg similarity index 100% rename from sycl/doc/Compiler-HLD.svg rename to sycl/doc/images/Compiler-HLD.svg diff --git a/sycl/doc/images/SYCL_plugin_discovery.svg b/sycl/doc/images/PluginDiscovery.svg similarity index 100% rename from sycl/doc/images/SYCL_plugin_discovery.svg rename to sycl/doc/images/PluginDiscovery.svg diff --git a/sycl/doc/images/SYCL_RT_arch.svg b/sycl/doc/images/RuntimeArchitecture.svg similarity index 99% rename from sycl/doc/images/SYCL_RT_arch.svg rename to sycl/doc/images/RuntimeArchitecture.svg index fd4c87ccf71ff..9c9e6ce472a8e 100644 --- a/sycl/doc/images/SYCL_RT_arch.svg +++ b/sycl/doc/images/RuntimeArchitecture.svg @@ -37,14 +37,14 @@ guidetolerance="10" inkscape:pageopacity="0" inkscape:pageshadow="2" - inkscape:window-width="1842" - inkscape:window-height="1177" + inkscape:window-width="1920" + inkscape:window-height="1137" id="namedview5075" showgrid="false" inkscape:zoom="0.97439131" inkscape:cx="349.3594" inkscape:cy="393.64385" - inkscape:window-x="1990" + inkscape:window-x="1912" inkscape:window-y="-8" inkscape:window-maximized="1" inkscape:current-layer="svg5073" /> @@ -81,7 +81,7 @@ id="tspan4647" style="stroke-width:1.31121552">SYCL + id="tspan4645">DPC++ SYCL + id="tspan4659">DPC++ SYCL runtime library + id="tspan4715">DPC++ runtime library SYCL Runtime Plugin Interface (PI) + id="tspan4903">DPC++ Runtime Plugin Interface (PI)