Skip to content

[SYCL][Fusion] Interface with kernel fusion JIT #7831

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 13, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 10 additions & 1 deletion buildbot/configure.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,11 @@ def do_configure(args):
libclc_amd_target_names = ';amdgcn--;amdgcn--amdhsa'
libclc_nvidia_target_names = ';nvptx64--;nvptx64--nvidiacl'

sycl_enable_fusion = "OFF"
if not args.disable_fusion:
llvm_external_projects += ";sycl-fusion"
sycl_enable_fusion = "ON"

if args.llvm_external_projects:
llvm_external_projects += ";" + args.llvm_external_projects.replace(",", ";")

Expand All @@ -32,6 +37,7 @@ def do_configure(args):
xpti_dir = os.path.join(abs_src_dir, "xpti")
xptifw_dir = os.path.join(abs_src_dir, "xptifw")
libdevice_dir = os.path.join(abs_src_dir, "libdevice")
fusion_dir = os.path.join(abs_src_dir, "sycl-fusion")
llvm_targets_to_build = args.host_target
llvm_enable_projects = 'clang;' + llvm_external_projects
libclc_targets_to_build = ''
Expand Down Expand Up @@ -144,6 +150,7 @@ def do_configure(args):
"-DXPTI_SOURCE_DIR={}".format(xpti_dir),
"-DLLVM_EXTERNAL_XPTIFW_SOURCE_DIR={}".format(xptifw_dir),
"-DLLVM_EXTERNAL_LIBDEVICE_SOURCE_DIR={}".format(libdevice_dir),
"-DLLVM_EXTERNAL_SYCL_FUSION_SOURCE_DIR={}".format(fusion_dir),
"-DLLVM_ENABLE_PROJECTS={}".format(llvm_enable_projects),
"-DLIBCLC_TARGETS_TO_BUILD={}".format(libclc_targets_to_build),
"-DLIBCLC_GENERATE_REMANGLED_VARIANTS={}".format(libclc_gen_remangled_variants),
Expand All @@ -159,7 +166,8 @@ def do_configure(args):
"-DLLVM_ENABLE_LLD={}".format(llvm_enable_lld),
"-DXPTI_ENABLE_WERROR={}".format(xpti_enable_werror),
"-DSYCL_CLANG_EXTRA_FLAGS={}".format(sycl_clang_extra_flags),
"-DSYCL_ENABLE_PLUGINS={}".format(';'.join(set(sycl_enabled_plugins)))
"-DSYCL_ENABLE_PLUGINS={}".format(';'.join(set(sycl_enabled_plugins))),
"-DSYCL_ENABLE_KERNEL_FUSION={}".format(sycl_enable_fusion)
]

if args.l0_headers and args.l0_loader:
Expand Down Expand Up @@ -238,6 +246,7 @@ def main():
parser.add_argument("--llvm-external-projects", help="Add external projects to build. Add as comma seperated list.")
parser.add_argument("--ci-defaults", action="store_true", help="Enable default CI parameters")
parser.add_argument("--enable-plugin", action='append', help="Enable SYCL plugin")
parser.add_argument("--disable-fusion", action="store_true", help="Disable the kernel fusion JIT compiler")
args = parser.parse_args()

print("args:{}".format(args))
Expand Down
10 changes: 7 additions & 3 deletions sycl-fusion/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,10 @@ set(SYCL_JIT_BASE_DIR ${CMAKE_CURRENT_SOURCE_DIR})
# directories, similar to how clang/CMakeLists.txt does it.
set(LLVM_SPIRV_INCLUDE_DIRS "${LLVM_MAIN_SRC_DIR}/../llvm-spirv/include")

add_subdirectory(jit-compiler)
add_subdirectory(passes)
add_subdirectory(test)
if(WIN32)
message(WARNING "Kernel fusion not yet supported on Windows")
else(WIN32)
add_subdirectory(jit-compiler)
add_subdirectory(passes)
add_subdirectory(test)
Comment on lines +12 to +17
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sommerlukas, unfortunately, compilation of jit-compiler with clang (XCode compiler to be precise) fails.

See https://github.com/intel/llvm/actions/runs/3910984924/jobs/6683827601.

/Users/runner/work/llvm/llvm/src/llvm/include/llvm/Support/Casting.h:280:45: error: unused parameter 'f' [-Werror,-Wunused-parameter]
  static inline bool isPossible(const From &f) { return true; }
                                            ^
/Users/runner/work/llvm/llvm/src/llvm/include/llvm/Support/Casting.h:604:41: error: unused parameter 't' [-Werror,-Wunused-parameter]
  static inline bool isPresent(const T &t) { return true; }
                                        ^
2 errors generated.

It's a bit surprising, but it looks like LLVM is less strict on enforcing "no warnings" policy.
I think we should avoid mis-compilations due to errors in external code like this.
To be honest, I'm sure how did we end-up with enabling -Werror for jit-compiler project. I know that we enabled it for sycl project by setting SYCL_ENABLE_WERROR CMake variable, but it should not impact jit-compiler. Could you investigate, please?

Copy link
Contributor Author

@sommerlukas sommerlukas Jan 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bader,

thanks for reporting this.

The file which causes this error is jit_compiler.cpp, which is not part of sycl-fusion (i.e., the JIT compiler), but sycl itself (lives in sycl/source/detail) and connects the SYCL runtime to the JIT compiler. As it is compiled as part of the SYCL runtime, I think it is expected that it is compiled with -Werror if SYCL_ENABLE_WERROR is set.

The warnings (which then become errors with -Werror) are caused by including LLVMContext.h in JITContext.h, which in turn is included by jit_compiler.cpp. I'll look into how we can avoid including LLVMContext.h here by using an opaque pointer or similar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bader: I've opened #8017 to address this issue.

With the fix from this PR, I'm able to compile with SYCL_ENABLE_WERROR=ON in my local build (Linux, Clang 10).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm able to compile with SYCL_ENABLE_WERROR=ON in my local build (Linux, Clang 10).

Were you able to reproduce the problem before the fix with your local build (i.e. Linux, Clang 10)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, with SYCL_ENABLE_WERROR=ON, my local build failed with the same error message before implementing this fix.

endif(WIN32)
9 changes: 9 additions & 0 deletions sycl-fusion/jit-compiler/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,15 @@ target_link_libraries(sycl-fusion
${CMAKE_THREAD_LIBS_INIT}
)

if(NOT MSVC AND NOT APPLE)
# Manage symbol visibility through the linker to make sure no LLVM symbols
# are exported and confuse the drivers.
set(linker_script "${CMAKE_CURRENT_SOURCE_DIR}/ld-version-script.txt")
target_link_libraries(
sycl-fusion PRIVATE "-Wl,--version-script=${linker_script}")
set_target_properties(sycl-fusion PROPERTIES LINK_DEPENDS ${linker_script})
endif()

install(TARGETS sycl-fusion
LIBRARY DESTINATION "lib${LLVM_LIBDIR_SUFFIX}" COMPONENT sycl-fusion
RUNTIME DESTINATION "bin" COMPONENT sycl-fusion)
8 changes: 8 additions & 0 deletions sycl-fusion/jit-compiler/ld-version-script.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
{
global:
/* Export everything from jit_compiler namespace */
_ZN12jit_compiler*;

local:
*;
};
6 changes: 3 additions & 3 deletions sycl-fusion/jit-compiler/lib/fusion/FusionPipeline.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,7 @@ FusionPipeline::runFusionPasses(Module &Mod, SYCLModuleInfo &InputInfo,
FPM.addPass(createFunctionToLoopPassAdaptor(IndVarSimplifyPass{}));
LoopUnrollOptions UnrollOptions;
FPM.addPass(LoopUnrollPass{UnrollOptions});
FPM.addPass(SROAPass{});
FPM.addPass(SROAPass{SROAOptions::ModifyCFG});
// Run the InferAddressSpace pass to remove as many address-space casts
// to/from generic address-space as possible, because these hinder
// internalization.
Expand All @@ -94,11 +94,11 @@ FusionPipeline::runFusionPasses(Module &Mod, SYCLModuleInfo &InputInfo,
// Run additional optimization passes after completing fusion.
{
FunctionPassManager FPM;
FPM.addPass(SROAPass{});
FPM.addPass(SROAPass{SROAOptions::ModifyCFG});
FPM.addPass(SCCPPass{});
FPM.addPass(InstCombinePass{});
FPM.addPass(SimplifyCFGPass{});
FPM.addPass(SROAPass{});
FPM.addPass(SROAPass{SROAOptions::ModifyCFG});
FPM.addPass(InstCombinePass{});
FPM.addPass(SimplifyCFGPass{});
FPM.addPass(ADCEPass{});
Expand Down
7 changes: 6 additions & 1 deletion sycl/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -145,7 +145,12 @@ install(DIRECTORY ${OpenCL_INCLUDE_DIR}/CL
COMPONENT OpenCL-Headers)

# Option to enable online kernel fusion via a JIT compiler
option(SYCL_ENABLE_KERNEL_FUSION "Enable kernel fusion via JIT compiler" OFF)
option(SYCL_ENABLE_KERNEL_FUSION "Enable kernel fusion via JIT compiler" ON)
if(SYCL_ENABLE_KERNEL_FUSION AND WIN32)
message(WARNING "Kernel fusion not yet supported on Windows")
set(SYCL_ENABLE_KERNEL_FUSION OFF CACHE
BOOL "Kernel fusion not yet supported on Windows" FORCE)
endif()

# Needed for feature_test.hpp
if ("cuda" IN_LIST SYCL_ENABLE_PLUGINS)
Expand Down
5 changes: 5 additions & 0 deletions sycl/cmake/modules/AddSYCLUnitTest.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,11 @@ macro(add_sycl_unittest test_dirname link_variant)
OpenCL-Headers
${SYCL_LINK_LIBS}
)

if(SYCL_ENABLE_KERNEL_FUSION)
target_link_libraries(${test_dirname} PRIVATE sycl-fusion)
endif(SYCL_ENABLE_KERNEL_FUSION)

target_include_directories(${test_dirname}
PRIVATE SYSTEM
${sycl_inc_dir}
Expand Down
11 changes: 11 additions & 0 deletions sycl/doc/GetStartedGuide.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ and a wide range of compute accelerators such as GPU and FPGA.
- [Build DPC++ toolchain with support for HIP AMD](#build-dpc-toolchain-with-support-for-hip-amd)
- [Build DPC++ toolchain with support for HIP NVIDIA](#build-dpc-toolchain-with-support-for-hip-nvidia)
- [Build DPC++ toolchain with support for ESIMD CPU Emulation](#build-dpc-toolchain-with-support-for-esimd-emulator)
- [Build DPC++ toolchain with support for runtime kernel fusion](#build-dpc-toolchain-with-support-for-runtime-kernel-fusion)
- [Build Doxygen documentation](#build-doxygen-documentation)
- [Deployment](#deployment)
- [Use DPC++ toolchain](#use-dpc-toolchain)
Expand Down Expand Up @@ -298,6 +299,16 @@ Enabling this flag requires following packages installed.
Currently, this feature was tested and verified on Ubuntu 20.04
environment.

### Build DPC++ toolchain with support for runtime kernel fusion

Support for the experimental SYCL extension for user-driven kernel fusion
at runtime is enabled by default.

To disable support for this feature, follow the instructions for the
Linux DPC++ toolchain, but add the `--disable-fusion` flag.

Kernel fusion is currently not yet supported on the Windows platform.

### Build Doxygen documentation

Building Doxygen documentation is similar to building the product itself. First,
Expand Down
4 changes: 4 additions & 0 deletions sycl/include/sycl/detail/cg.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -98,6 +98,10 @@ class CG {

CGTYPE getType() { return MType; }

std::vector<std::vector<char>> &getArgsStorage() { return MArgsStorage; }

std::vector<detail::AccessorImplPtr> &getAccStorage() { return MAccStorage; }

virtual ~CG() = default;

private:
Expand Down
9 changes: 9 additions & 0 deletions sycl/source/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,13 @@ function(add_sycl_rt_library LIB_NAME LIB_OBJ_NAME)
PRIVATE OpenCL-Headers
)

if(SYCL_ENABLE_KERNEL_FUSION)
target_link_libraries(${LIB_NAME} PRIVATE sycl-fusion)
target_link_libraries(${LIB_OBJ_NAME} PRIVATE sycl-fusion)
set_property(GLOBAL APPEND PROPERTY SYCL_TOOLCHAIN_INSTALL_COMPONENTS
sycl-fusion)
endif(SYCL_ENABLE_KERNEL_FUSION)

find_package(Threads REQUIRED)

target_link_libraries(${LIB_NAME}
Expand Down Expand Up @@ -139,6 +146,8 @@ set(SYCL_SOURCES
"detail/handler_proxy.cpp"
"detail/image_accessor_util.cpp"
"detail/image_impl.cpp"
"detail/jit_compiler.cpp"
"detail/jit_device_binaries.cpp"
"detail/kernel_impl.cpp"
"detail/kernel_program_cache.cpp"
"detail/memory_manager.cpp"
Expand Down
Loading