|
| 1 | +# March'20 release notes |
| 2 | + |
| 3 | +Release notes for the commit range e8f1f29..ba404be |
| 4 | + |
| 5 | +## New features |
| 6 | + - Initial CUDA backend support [7a9a425] |
| 7 | + - [SYCL][FPGA] Implement IO pipes interface [c900248] |
| 8 | + - Added the implementation of [GroupAlgorithms extension](doc/extensions/GroupAlgorithms/SYCL_INTEL_group_algorithms.asciidoc) |
| 9 | + [8bfa107] |
| 10 | + - Added a partial implementation of [sub group algorithms extension](doc/extensions/SubGroupAlgorithms/SYCL_INTEL_sub_group_algorithms.asciidoc) |
| 11 | + [017af4e] |
| 12 | + - New attributes for Intel FPGA devices: `intelfpga::force_pow2_depth`, |
| 13 | + `intelfpga::loop_coalesce`, `intelfpga::speculated_iterations`, |
| 14 | + `intelfpga::disable_loop_pipelining`, `intelfpga::max_interleaving` |
| 15 | + [73dd705][a5b9804] |
| 16 | + - Added support for `intel::reqd_work_group_size` attribute [8eb588d] |
| 17 | + - Added support for specialization constants feature which is based on |
| 18 | + [SYCL Specialization Constant proposal](https://github.com/codeplaysoftware/standards-proposals/blob/master/spec-constant/index.md) [29abe37] |
| 19 | + |
| 20 | +## Improvements |
| 21 | +### SYCL Frontend and driver changes |
| 22 | + - Added a diagnostic on attempt to declare or use non-const static variable |
| 23 | + inside device code [7743e86] [1853516] |
| 24 | + - Further relaxed the requirements for kernel types. By default they should |
| 25 | + now have a trivial copy constructor and a trivial destructor [17aac3c] |
| 26 | + - Changed `std::numeric_limits<sycl::half>` to constexpr functions [85d7a5e] |
| 27 | + - Added a diagnostic on attempt to use zero length arrays inside device code |
| 28 | + [e6ce614] |
| 29 | + - Added support for math functions `fabs` and `ceil` in device code [f41309b] |
| 30 | + - Added a diagnostic (warning) on attempt to append new device object to |
| 31 | + an archive which already contains an AOT-compiled device object [9d348eb] |
| 32 | + - Added a diagnostic on attempt to use functions which have no definition in |
| 33 | + the TU and are not marked with `SYCL_EXTERNAL` macro inside device code |
| 34 | + [a3b340b] |
| 35 | + - Added a diagnostic on attempt to use thread local storage inside device code |
| 36 | + [eb373c4] |
| 37 | + - Removed arch designator from the default output file name when compiling |
| 38 | + with `-fsycl-link` option. Now an output file has just a flat name based on |
| 39 | + the first input file [dc729a7] |
| 40 | + - The SYCL headers were moved from `lib/clang/11.0.0/include` to |
| 41 | + `include/sycl` to support mixed compilers [39501f6] |
| 42 | + - Added support for the GCC style inline assembly in the device code [6f4e007] |
| 43 | + - Improved fat static library support: the driver now considers for offloading |
| 44 | + static libraries which are passed on the command line as well as libraries |
| 45 | + passed as part of the linker options. This effectively removes the need to |
| 46 | + use the `-foffload-static-lib` and `-foffload-whole-static-lib` options, which |
| 47 | + are now deprecated. |
| 48 | + - The `SYCL_EXTERNAL` macro is now allowed to be used with class member |
| 49 | + functions [3baec18] |
| 50 | + - Set `aux-target-cpu` for the device compilation, which defines AVX and other |
| 51 | + necessary macros based on the target [f953fda] |
| 52 | + |
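The relaxed kernel-type requirement listed above (a trivial copy constructor and a trivial destructor) can be approximated in plain C++ with standard type traits. The following is only a sketch: `satisfies_relaxed_kernel_requirements`, `PlainFunctor`, and `OwningFunctor` are hypothetical names introduced for illustration, and the check performed by the compiler itself may differ.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper, not part of the SYCL headers: it mirrors the relaxed
// requirement from the release note using standard type traits only.
template <typename T>
constexpr bool satisfies_relaxed_kernel_requirements() {
    return std::is_trivially_copy_constructible<T>::value &&
           std::is_trivially_destructible<T>::value;
}

// A functor holding only scalar state: trivially copyable and destructible.
struct PlainFunctor {
    int scale;
    float offset;
    void operator()(int) const {}
};

// A functor owning a std::string: neither its copy constructor nor its
// destructor is trivial, so it fails the check.
struct OwningFunctor {
    std::string label;
    void operator()(int) const {}
};

static_assert(satisfies_relaxed_kernel_requirements<PlainFunctor>(),
              "scalar-only functor passes the sketched check");
static_assert(!satisfies_relaxed_kernel_requirements<OwningFunctor>(),
              "functor owning a std::string fails the sketched check");
```

A check like this can be placed next to a functor definition to catch an accidental non-trivial member before the type is used as a kernel.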
| 53 | +### SYCL headers and runtime |
| 54 | + - Changed `sycl::context` and `sycl::queue` constructors to be explicit to |
| 55 | + avoid unintended conversions [c220eb8][3b6799a] |
| 56 | + - Added a diagnostic on setting `SYCL_DEVICE_TYPE` environment variable to an |
| 57 | + incorrect value [0125496] |
| 58 | + - Improved error codes which are encoded in the SYCL exceptions [04ee17c] |
| 59 | + - Removed functions that use float type in the fallback library for fp64 |
| 60 | + complex [6ccd84a0] |
| 61 | + - Added support for the `RESTRICT_WRITE_ACCESS_TO_CONSTANT_PTR` macro, which |
| 62 | + enables a diagnostic on writing to a raw pointer obtained from a |
| 63 | + `sycl::constant_ptr` object [c9ed5b2] |
| 64 | + - Added support for USM extension for CUDA backend [498d56c] |
| 65 | + |
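The effect of making constructors explicit, as noted above for `sycl::context` and `sycl::queue`, can be shown without the SYCL headers. This sketch assumes a single-argument constructor taking a device; `device_stub`, `implicit_queue`, and `explicit_queue` are hypothetical stand-ins, not the real SYCL classes, and the release note does not say which overloads were changed.

```cpp
#include <type_traits>

// Hypothetical stand-in for a device handle.
struct device_stub {};

// Models a queue type whose constructor is NOT explicit: a device_stub
// converts to it silently, e.g. when passed to a function expecting a queue.
struct implicit_queue {
    implicit_queue(const device_stub &) {}
};

// Models a queue type whose constructor IS explicit: construction must be
// spelled out at the call site.
struct explicit_queue {
    explicit explicit_queue(const device_stub &) {}
};

// Implicit conversion is available only for the non-explicit constructor.
static_assert(std::is_convertible<device_stub, implicit_queue>::value,
              "non-explicit constructor permits implicit conversion");
static_assert(!std::is_convertible<device_stub, explicit_queue>::value,
              "explicit constructor blocks implicit conversion");

// Direct construction still works in both cases.
static_assert(std::is_constructible<explicit_queue, device_stub>::value,
              "explicit constructor still allows direct construction");
```

With the explicit form, code that previously passed a device where a queue was expected would stop compiling and would need to construct the queue directly.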
| 66 | +### Documentation |
| 67 | + - Refactored [USM specification](doc/extensions/USM/USM.adoc) [0438422] |
| 68 | + - Added [GroupAlgorithms extensions](doc/extensions/GroupAlgorithms/) |
| 69 | + as replacement of GroupCollectives extension [c181fdb][b18a566] |
| 70 | + - Doxygen documentation is now rendered to GitHub Pages. An initial |
| 71 | + implementation is available [online](https://intel.github.io/llvm-docs/doxygen/annotated.html) |
| 72 | + [29d9cc2] |
| 73 | + - More details have been added about the `-fintelfpga` option in the |
| 74 | + [Compiler User Manual](doc/SYCLCompilerUserManual.md) [4b03ddb] |
| 75 | + - Added [SYCL_INTEL_enqueue_barrier extension document](doc/extensions/EnqueueBarrier/enqueue_barrier.asciidoc) |
| 76 | + [6cfd2cb] |
| 77 | + - Added [standard layout relaxation extension](doc/extensions/RelaxStdLayout/SYCL_INTEL_relax_standard_layout.asciidoc) |
| 78 | + [ce53521] |
| 79 | + - Deprecated SubGroupNDRange extension [d9b178f] |
| 80 | + - Added extension for base sub-group class: |
| 81 | + [SubGroup](doc/extensions/SubGroup/SYCL_INTEL_sub_group.asciidoc) [d9b178f] |
| 82 | + - Added extension for functions operating on sub-groups: |
| 83 | + [SubGroupAlgorithms](doc/extensions/SubGroupAlgorithms/SYCL_INTEL_sub_group_algorithms.asciidoc) |
| 84 | + [d9b178f] |
| 85 | + - Added extension introducing group masks and ballot functionality: |
| 86 | + [GroupMask](doc/extensions/GroupMask/SYCL_INTEL_group_mask.asciidoc) |
| 87 | + [d9b178f] |
| 88 | + - The project has been renamed to "oneAPI DPC++ Compiler", all documentation |
| 89 | + has been fixed accordingly [7a2e75e] |
| 90 | + |
| 91 | +## Bug fixes |
| 92 | +### SYCL Frontend and driver changes |
| 93 | + - Fixed a problem with the compiler not being able to find a dependency file |
| 94 | + when compiling AOT to an object for FPGA [7b58b01] |
| 95 | + - Fixed a problem with the host object not being added to the partial link step |
| 96 | + when compiling from source and using the `-foffload-static-lib` option [1a951cb] |
| 97 | + - Reversed `reqd_work_group_size` attribute to match SYCL behavior [1da6fbe] |
| 98 | + - Fixed dependency output location when `/Fo<dir>` is given [2b6f4f4] |
| 99 | + - Fixed a crash which happened when no kernel name is passed to the |
| 100 | + `sycl::handler::parallel_for` [fadaa59] |
| 101 | + |
| 102 | +### SYCL headers and runtime |
| 103 | + - Fixed `sycl::queue::wait()`, which was not waiting for an event associated |
| 104 | + with a USM operation [850fb9f] |
| 105 | + - Fixed a problem with reporting the wrong error message on the second attempt |
| 106 | + to build a program if the first attempt failed [9a34a11] |
| 107 | + - Fixed an issue which could happen when `sycl::event::wait` is called from |
| 108 | + multiple threads [3da5473] |
| 109 | + - Aligned `sub_group::store` signature between host and device [b3a9426] |
| 110 | + - Fixed `sycl::program::get_compile_options` and |
| 111 | + `sycl::program::get_build_options` to return correct values [03326f7] |
| 112 | + - Fixed `sycl::multi_ptr`'s methods that were incorrectly enabled/disabled on |
| 113 | + device/host [401d174] |
| 114 | + - Fixed incorrect dependency handling when creating sub-buffers which could |
| 115 | + lead to data races [45e39bd] |
| 116 | + - Reversed reported max work-group size for a device to align with work-group |
| 117 | + size reversing before kernels launch [72b7dee] |
| 118 | + - Fixed incorrect handling of kernels that use hierarchical parallelism when |
| 119 | + `-O0` option is passed to the clang [fd8ae8a] |
| 120 | + - Changed names of SYCL internal variables to avoid conflict with commonly |
| 121 | + used macros: `SUCCESS`, `BLOCKED` and `FAILED` [0f7e361] |
| 122 | + - Fixed a bug where a host device was always included in the device list |
| 123 | + returned by `sycl::device::get_devices` [6cf590f] |
| 124 | + - Fixed a problem with passing `sycl::vec` object to |
| 125 | + `sycl::group::async_work_group_copy` [20aa83e] |
| 126 | + - Fixed the behavior of `sycl::malloc_shared` to return `nullptr` for an |
| 127 | + allocation size of zero bytes or less, and the behavior of the `sycl::free` |
| 128 | + functions to ignore a deallocation request for `nullptr` [d596593] |
| 129 | + - Fixed a possible problem with selecting a work-group size which is larger |
| 130 | + than the maximum allowed work-group size [b48f08f] |
| 131 | + - Fixed an issue which causes errors when using sub-buffers [5d1d716] |
| 132 | + - Changed the implementation of the buffer constructor from a pair of |
| 133 | + iterators. Now, data is not written back to the host on destruction of the |
| 134 | + buffer unless the buffer has a valid non-null pointer specified via the |
| 135 | + member function `set_final_data` [fb72758] |
| 136 | + - Fixed a problem with incorrect acceptance of a lambda which takes an |
| 137 | + argument of the `sycl::id` type in the `sycl::handler::parallel_for` version |
| 138 | + which takes a `sycl::ndrange` object [0408899] |
| 139 | + - Resolved circular dependency between `sycl::event` and `sycl::queue` |
| 140 | + [8c71dcb] |
| 141 | + |
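The allocation behavior described in the `sycl::malloc_shared` / `sycl::free` fix above (a zero-sized request returns `nullptr`, and freeing `nullptr` is ignored) can be modeled on the host heap. This is a sketch under stated assumptions: `shared_alloc_stub`, `free_stub`, and `exercise_stubs` are hypothetical names, they do not call into SYCL, and because the size here is unsigned only the zero case is modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a shared allocation routine. It is NOT
// sycl::malloc_shared; it only models the documented zero-size behavior.
void *shared_alloc_stub(std::size_t bytes) {
    if (bytes == 0) {
        return nullptr;  // zero-sized request yields nullptr
    }
    return std::malloc(bytes);
}

// Hypothetical stand-in for the matching free routine: a nullptr request
// is ignored rather than forwarded to the underlying deallocator.
void free_stub(void *ptr) {
    if (ptr == nullptr) {
        return;
    }
    std::free(ptr);
}

// Exercises both paths; returns true if the non-empty allocation succeeded.
bool exercise_stubs() {
    void *empty = shared_alloc_stub(0);
    void *block = shared_alloc_stub(64);
    assert(empty == nullptr);
    const bool got_block = (block != nullptr);
    free_stub(empty);  // no-op
    free_stub(block);  // releases the 64-byte block (no-op if malloc failed)
    return got_block;
}
```

Callers that rely on this behavior can pass a zero size or a null pointer without guarding each call site.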
| 142 | + |
| 143 | +## Known issues |
| 144 | + - The format of the object files produced by the compiler can change between |
| 145 | + versions. The workaround is to rebuild the application. |
| 146 | + - The SYCL library doesn't guarantee a stable API/ABI, so applications compiled |
| 147 | + with an older version of the SYCL library may not work with a newer one. |
| 148 | + The workaround is to rebuild the application. |
| 149 | + - Using the `cl::sycl::program` API to refer to a kernel defined in another |
| 150 | + translation unit leads to undefined behavior. |
| 151 | + - Linkage errors with the following message: |
| 152 | + `error LNK2005: "bool const std::_Is_integral<bool>" (??$_Is_integral@_N@std@@3_NB) already defined` |
| 153 | + can happen when a SYCL application is built using a version of MS Visual |
| 154 | + Studio 2019 below 16.3.0. |
| 155 | + The workaround is to enable `-std=c++17` for the failing MSVC version. |
| 156 | + |
| 157 | +## Prerequisites |
| 158 | +### Linux |
| 159 | + - Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL |
| 160 | + support from the release package https://github.com/intel/llvm/releases/ |
| 161 | + - The latest version of Intel(R) Graphics Compute Runtime for OpenCL(TM) from |
| 162 | + https://github.com/intel/compute-runtime/releases/ |
| 163 | +### Windows |
| 164 | + - Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL |
| 165 | + support from the release package https://github.com/intel/llvm/releases/ |
| 166 | + - The latest version of Intel(R) Graphics Compute Runtime for OpenCL(TM) from |
| 167 | + https://downloadcenter.intel.com/ |
| 168 | + |
| 169 | +Please see the [runtime installation guide](https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedWithSYCLCompiler.md#install-low-level-runtime). |
| 170 | + |
| 171 | + |
| 172 | + |
1 | 173 | # February'20 release notes
|
2 | 174 |
|
3 | 175 | Release notes for commit e8f1f29
|
|