Guard SYCL Graph implementation and fallback emulation #71

Merged 7 commits on Mar 30, 2023
74 changes: 21 additions & 53 deletions README.md
@@ -1,72 +1,44 @@
# Intel Project for LLVM\* technology
# SYCL Command Graph Extensions

This is the Intel staging area for llvm.org contributions and the home for
Intel LLVM-based projects:
This is the collaboration space for the oneAPI vendor Command Graph extension for SYCL 2020. It provides an API for defining a graph of operations and their dependencies once, then submitting that graph repeatedly for execution.

- [oneAPI DPC++ compiler](#oneapi-dpc-compiler)
- [Late-outline OpenMP and OpenMP Offload](#late-outline-openmp-and-openmp-offload)
### Specification

## oneAPI DPC++ compiler
A draft of our Command Graph extension proposal can be found here:
[https://github.com/intel/llvm/pull/5626](https://github.com/intel/llvm/pull/5626).

[![](https://spec.oneapi.io/oneapi-logo-white-scaled.jpg)](https://www.oneapi.io/)
### Implementation

[![SYCL Post Commit](https://github.com/intel/llvm/actions/workflows/sycl_post_commit.yml/badge.svg?branch=sycl)](https://github.com/intel/llvm/actions/workflows/sycl_post_commit.yml)
[![Generate Doxygen documentation](https://github.com/intel/llvm/actions/workflows/gh_pages.yml/badge.svg?branch=sycl)](https://github.com/intel/llvm/actions/workflows/gh_pages.yml)
Our current prototype implementation can be found here:
[https://github.com/reble/llvm/tree/sycl-graph-develop](https://github.com/reble/llvm/tree/sycl-graph-develop).

The DPC++ is a LLVM-based compiler project that implements compiler and runtime
support for the SYCL\* language. The project is hosted in the
[sycl](/../../tree/sycl) branch and is synced with the tip of the LLVM upstream
main branch on a regular basis (revisions delay is usually not more than 1-2
weeks). DPC++ compiler takes everything from LLVM upstream as is, however some
modules of LLVM might be not included in the default project build
configuration. Additional modules can be enabled by modifying build framework
settings.
Limitations include:
* Level Zero backend support only.
* Accessors and reductions are currently not supported.

The DPC++ goal is to support the latest SYCL\* standard and work on that is in
progress. DPC++ also implements a number of extensions to the SYCL\* standard,
which can be found in the [sycl/doc/extensions](/../sycl/sycl/doc/extensions)
directory.
### Other Material

The main purpose of this project is open source collaboration on the DPC++
compiler implementation in LLVM across a variety of architectures, prototyping
compiler and runtime library solutions, designing future extensions, and
conducting experiments. As the implementation becomes more mature, we try to
upstream as much DPC++ support to LLVM main branch as possible. See
[SYCL upstreaming working group notes](/../../wiki/SYCL-upstreaming-working-group-meeting-notes)
for more details.
This extension was presented at the oneAPI Technical Advisory Board (September 2022 meeting). Slides: [https://github.com/oneapi-src/oneAPI-tab/blob/main/language/presentations/2022-09-28-TAB-SYCL-Graph.pdf](https://github.com/oneapi-src/oneAPI-tab/blob/main/language/presentations/2022-09-28-TAB-SYCL-Graph.pdf).

Note that this project can be used as a technical foundation for some
proprietary compiler products, which may leverage implementations from this open
source project. One of the examples is
[Intel(R) oneAPI DPC++ Compiler](https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compiler.html)
Features parity between this project and downstream projects is not guaranteed.
## Intel Project for LLVM\* technology

Project documentation is available at:
[DPC++ Documentation](https://intel.github.io/llvm-docs/).
We target a contribution through the origin of this fork: [Intel staging area for llvm.org contributions](https://github.com/intel/llvm).

### How to use DPC++

#### Docker containers

See available containers with pre-built/pre-installed DPC++ compiler at:
[Containers](/../sycl/sycl/doc/developer/DockerBKMs.md#sycl-containers-overview)

#### Releases

Daily builds of the sycl branch on Linux are available at
[releases](/../../releases).
A few times a year, we publish [Release Notes](/../sycl/sycl/ReleaseNotes.md) to
highlight all important changes made in the project: features implemented and
issues addressed. The corresponding builds can be found using
[search](https://github.com/intel/llvm/releases?q=oneAPI+DPC%2B%2B+Compiler&expanded=true)
in daily releases. None of the branches in the project are stable or rigorously
tested for production quality control, so the quality of these releases is
expected to be similar to the daily releases.
TBD

#### Build from sources

See [Get Started Guide](/../sycl/sycl/doc/GetStartedGuide.md).

SYCL Graph support is enabled with either of:
* Configuration script: `configure.py --enable-sycl-graph`.
* CMake: `cmake -DSYCL_ENABLE_GRAPH=ON`.

Otherwise, a fallback emulation mode is used: the graph API remains available, but kernels are submitted eagerly.
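As a minimal sketch of how application code might distinguish the two build modes, the snippet below branches on the `SYCL_EXT_ONEAPI_GRAPH` feature-test macro that this PR defines. To stay self-contained it includes no SYCL header, so the macro is undefined unless passed on the compiler command line; `graph_mode()` is a hypothetical helper, not part of the runtime.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of the SYCL runtime. In a real application
// the macro would come from <sycl/feature_test.hpp>.
inline std::string graph_mode() {
#if defined(SYCL_EXT_ONEAPI_GRAPH) && SYCL_EXT_ONEAPI_GRAPH
  return "native";    // runtime was built with SYCL Graph support enabled
#else
  return "emulated";  // fallback: graph API present, kernels submitted eagerly
#endif
}
```

Compiling this with `-DSYCL_EXT_ONEAPI_GRAPH=1` selects the first branch; without it the helper reports the fallback mode.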
### Report a problem

Submit an [issue](/../../issues) or initiate a [discussion](/../../discussions).
@@ -75,10 +47,6 @@ Submit an [issue](/../../issues) or initiate a [discussion](/../../discussions).

See [ContributeToDPCPP](/../sycl/sycl/doc/developer/ContributeToDPCPP.md).

## Late-outline OpenMP\* and OpenMP\* Offload

See [openmp](/../../tree/openmp) branch.

# License

See [LICENSE](/../sycl/sycl/LICENSE.TXT) for details.
7 changes: 7 additions & 0 deletions buildbot/configure.py
@@ -44,6 +44,8 @@ def do_configure(args):
sycl_enable_xpti_tracing = 'ON'
xpti_enable_werror = 'OFF'

sycl_enable_graph = 'OFF'

# lld is needed on Windows or for the HIP plugin on AMD
if platform.system() == 'Windows' or (args.hip and args.hip_platform == 'AMD'):
llvm_enable_projects += ';lld'
@@ -94,6 +96,9 @@ def do_configure(args):

if args.use_lld:
llvm_enable_lld = 'ON'

if args.enable_sycl_graph:
sycl_enable_graph = 'ON'

# CI Default conditionally appends to options, keep it at the bottom of
# args handling
@@ -147,6 +152,7 @@ def do_configure(args):
"-DLLVM_ENABLE_SPHINX={}".format(llvm_enable_sphinx),
"-DBUILD_SHARED_LIBS={}".format(llvm_build_shared_libs),
"-DSYCL_ENABLE_XPTI_TRACING={}".format(sycl_enable_xpti_tracing),
"-DSYCL_ENABLE_GRAPH={}".format(sycl_enable_graph),
"-DLLVM_ENABLE_LLD={}".format(llvm_enable_lld),
"-DXPTI_ENABLE_WERROR={}".format(xpti_enable_werror),
"-DSYCL_CLANG_EXTRA_FLAGS={}".format(sycl_clang_extra_flags),
@@ -216,6 +222,7 @@ def main():
help="host LLVM target architecture, defaults to X86, multiple targets may be provided as a semi-colon separated string")
parser.add_argument("--enable-esimd-emulator", action='store_true', help="build with ESIMD emulation support")
parser.add_argument("--enable-all-llvm-targets", action='store_true', help="build compiler with all supported targets, it doesn't change runtime build")
parser.add_argument("--enable-sycl-graph", action='store_true', help="build with SYCL Graph support")
parser.add_argument("--no-assertions", action='store_true', help="build without assertions")
parser.add_argument("--docs", action='store_true', help="build Doxygen documentation")
parser.add_argument("--werror", action='store_true', help="Treat warnings as errors")
7 changes: 7 additions & 0 deletions sycl/CMakeLists.txt
@@ -63,6 +63,13 @@ endif()
# of the SYCL runtime and expect enabling
option(SYCL_ENABLE_XPTI_TRACING "Enable tracing of SYCL constructs" OFF)

# Create a soft option for enabling or disabling the experimental support
# for SYCL Graph
option(SYCL_ENABLE_GRAPH "Enable experimental SYCL Graph support" OFF)
if (SYCL_ENABLE_GRAPH)
set(SYCL_BUILD_SYCL_GRAPH ON)
endif()

if(MSVC)
set_property(GLOBAL PROPERTY USE_FOLDERS ON)
# Skip asynchronous C++ exceptions catching and assume "extern C" functions
1 change: 1 addition & 0 deletions sycl/doc/GetStartedGuide.md
@@ -124,6 +124,7 @@ flags can be found by launching the script with `--help`):
* `--enable-esimd-emulator` -> enable ESIMD CPU emulation (see [ESIMD CPU emulation](#build-dpc-toolchain-with-support-for-esimd-cpu))
* `--enable-all-llvm-targets` -> build compiler (but not a runtime) with all
supported targets
* `--enable-sycl-graph` -> build SYCL Graph support
* `--shared-libs` -> Build shared libraries
* `-t` -> Build type (Debug or Release)
* `-o` -> Path to build directory
4 changes: 4 additions & 0 deletions sycl/include/sycl/feature_test.hpp.in
@@ -84,6 +84,10 @@ __SYCL_INLINE_VER_NAMESPACE(_V1) {
#if SYCL_BUILD_PI_HIP
#define SYCL_EXT_ONEAPI_BACKEND_HIP 1
#endif
#cmakedefine01 SYCL_BUILD_SYCL_GRAPH
#if SYCL_BUILD_SYCL_GRAPH
#define SYCL_EXT_ONEAPI_GRAPH 1
#endif

} // __SYCL_INLINE_VER_NAMESPACE(_V1)
} // namespace sycl
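CMake's `configure_file` rewrites `#cmakedefine01 VAR` into `#define VAR 1` or `#define VAR 0`, so the macro is always defined and the guard has to use `#if` rather than `#ifdef`. The sketch below shows what the generated header would roughly look like when the option is left OFF. The `DEMO_`-prefixed names and the helper function are illustrative stand-ins, chosen to avoid clashing with the real SYCL macros.

```cpp
#include <cassert>

// What `#cmakedefine01 DEMO_BUILD_SYCL_GRAPH` expands to when the CMake
// variable is unset or OFF: the macro exists, with the value 0.
#define DEMO_BUILD_SYCL_GRAPH 0

#if DEMO_BUILD_SYCL_GRAPH
#define DEMO_EXT_ONEAPI_GRAPH 1
#endif

// Hypothetical helper reporting whether the extension macro was defined.
constexpr bool demo_graph_macro_defined() {
#ifdef DEMO_EXT_ONEAPI_GRAPH
  return true;
#else
  return false;
#endif
}

// With the build flag at 0, the extension macro must be absent.
static_assert(!demo_graph_macro_defined(),
              "extension macro should not be defined when the flag is 0");
```

Changing the first `#define` to `1` models a build configured with `-DSYCL_ENABLE_GRAPH=ON`; the `static_assert` would then fail, which shows the extension macro follows the build flag.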
11 changes: 10 additions & 1 deletion sycl/plugins/level_zero/pi_level_zero.cpp
@@ -21,6 +21,7 @@
#include <sstream>
#include <string>
#include <sycl/detail/spinlock.hpp>
#include <sycl/feature_test.hpp>
#include <thread>
#include <utility>

@@ -1306,7 +1307,8 @@ pi_result resetCommandLists(pi_queue Queue) {
pi_result _pi_context::getAvailableCommandList(
pi_queue Queue, pi_command_list_ptr_t &CommandList, bool UseCopyEngine,
bool AllowBatching, ze_command_queue_handle_t *ForcedCmdQueue) {


#if SYCL_EXT_ONEAPI_GRAPH
// This is a hack. TODO: Proper CommandList allocation per Executable Graph.
if( Queue->Properties & PI_EXT_ONEAPI_QUEUE_LAZY_EXECUTION ) {
// TODO: Create new Command List.
@@ -1346,6 +1348,7 @@ pi_result _pi_context::getAvailableCommandList(
}
return PI_SUCCESS;
}
#endif

// Immediate commandlists have been pre-allocated and are always available.
if (Queue->Device->useImmediateCommandLists()) {
@@ -1587,8 +1590,10 @@ pi_result _pi_queue::executeCommandList(pi_command_list_ptr_t CommandList,
bool OKToBatchCommand) {
// When executing a Graph, defer execution if this is a command
// which could be batched (i.e. likely a kernel submission)
#if SYCL_EXT_ONEAPI_GRAPH
if (this->Properties & PI_EXT_ONEAPI_QUEUE_LAZY_EXECUTION && OKToBatchCommand)
return PI_SUCCESS;
#endif

bool UseCopyEngine = CommandList->second.isCopy(this);

@@ -3830,6 +3835,7 @@ pi_result piQueueFinish(pi_queue Queue) {
// Flushing cross-queue dependencies is covered by createAndRetainPiZeEventList,
// so this can be left as a no-op.
pi_result piQueueFlush(pi_queue Queue) {
#if SYCL_EXT_ONEAPI_GRAPH
if( Queue->Properties & PI_EXT_ONEAPI_QUEUE_LAZY_EXECUTION ) {

pi_command_list_ptr_t CommandList{};
@@ -3838,6 +3844,9 @@ pi_result piQueueFlush(pi_queue Queue) {

Queue->executeCommandList(CommandList, false, false);
}
#else
(void)Queue;
#endif
return PI_SUCCESS;
}

5 changes: 5 additions & 0 deletions sycl/source/detail/graph_impl.cpp
@@ -10,6 +10,7 @@
#include <detail/queue_impl.hpp>
#include <detail/scheduler/commands.hpp>
#include <sycl/queue.hpp>
#include <sycl/feature_test.hpp>

namespace sycl {
__SYCL_INLINE_VER_NAMESPACE(_V1) {
@@ -35,10 +36,14 @@ void graph_impl::exec_and_wait(
if (!IsSubGraph) {
Queue->setIsGraphSubmitting(true);
}
#if SYCL_EXT_ONEAPI_GRAPH
if (MFirst) {
exec(Queue);
MFirst = false;
}
#else
exec(Queue);
#endif
if (!IsSubGraph) {
Queue->setIsGraphSubmitting(false);
Queue->wait();
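The guard in this hunk changes how often `exec` runs: with native graph support the nodes are submitted only on the first call (later submissions presumably replay the recorded command list when the queue is flushed), while the fallback resubmits them every time. A simplified sketch of that control flow follows, using a mock type rather than the runtime's `graph_impl`. Only the names `MFirst` and `exec_and_wait` mirror the diff; everything else, including the `NativeGraph` runtime flag standing in for the preprocessor guard, is hypothetical.

```cpp
#include <cassert>

// Mock stand-in for the runtime's graph_impl; not the real class.
struct MockGraph {
  bool MFirst = true;  // true until the graph has been submitted once
  int ExecCalls = 0;   // how many times the nodes were submitted

  void exec() { ++ExecCalls; }

  // NativeGraph == true models a build where SYCL_EXT_ONEAPI_GRAPH is 1.
  void exec_and_wait(bool NativeGraph) {
    if (NativeGraph) {
      // Submit the nodes once; later calls skip resubmission.
      if (MFirst) {
        exec();
        MFirst = false;
      }
    } else {
      // Fallback emulation: resubmit every node on every call.
      exec();
    }
  }
};

// Returns how many times exec() ran after `Submissions` graph submissions.
inline int execCallsAfter(int Submissions, bool NativeGraph) {
  assert(Submissions >= 0);
  MockGraph Graph;
  for (int I = 0; I < Submissions; ++I)
    Graph.exec_and_wait(NativeGraph);
  return Graph.ExecCalls;
}
```

Under these assumptions, three submissions call `exec` once in the native mode and three times in the fallback mode, which is the eager behaviour the README describes.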
3 changes: 3 additions & 0 deletions sycl/source/detail/queue_impl.cpp
@@ -12,6 +12,7 @@
#include <sycl/context.hpp>
#include <sycl/detail/pi.hpp>
#include <sycl/device.hpp>
#include <sycl/feature_test.hpp>

#include <cstring>
#include <utility>
@@ -278,11 +279,13 @@ void queue_impl::wait(const detail::code_location &CodeLoc) {
TelemetryEvent = instrumentationProlog(CodeLoc, Name, StreamID, IId);
#endif

#if SYCL_EXT_ONEAPI_GRAPH
if (has_property<ext::oneapi::property::queue::lazy_execution>()) {
const detail::plugin &Plugin = getPlugin();
if (Plugin.getBackend() == backend::ext_oneapi_level_zero)
Plugin.call<detail::PiApiKind::piQueueFlush>(getHandleRef());
}
#endif

std::vector<std::weak_ptr<event_impl>> WeakEvents;
std::vector<event> SharedEvents;