[SYCL] DPC++ reduction library incorrect event profiling timing #2820

@huanghua1994

Test file: https://github.com/huanghua1994/HPC_Playground/blob/master/SYCL/reduction_timing.cpp
Compiler version: git commit 140c0d0
Compiler configuration: buildbot/configure.py --cuda
Selected device: GTX 1070, CUDA version 11.0, driver version 455.38

Problem description:
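When the DPC++ reduction library is used for an add reduction on float data, the event profiling queries info::event_profiling::command_start and info::event_profiling::command_end return timings that are too small: in the float sample below, the reported runtime with reduction (64513 ns) is shorter than the runtime without reduction (175105 ns). For an add reduction on int data, the timings are correct.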

Sample output when using T = float:

$ ./reduction_timing.exe 1048576 128
n = 1048576, b = 128
Runtime with reduction    = 64513 ns
Runtime without reduction = 175105 ns

Sample output when using T = int:

$ ./reduction_timing.exe 1048576 128
n = 1048576, b = 128
Runtime with reduction    = 2096161 ns
Runtime without reduction = 175102 ns
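
For reference, a minimal sketch of how a runtime in nanoseconds can be derived from the two profiling timestamps, plus a simple sanity check on the result. This is not the code from the test file linked above. The helper names `elapsed_ns` and `reduction_faster_than_baseline` are hypothetical, and treating "runtime with reduction is shorter than runtime without reduction" as a sign of a bad measurement is an assumption made here for illustration.

```cpp
#include <cstdint>
#include <stdexcept>

// Elapsed device time in nanoseconds from two profiling timestamps.
// In SYCL the two values would typically come from
//   ev.get_profiling_info<sycl::info::event_profiling::command_start>() and
//   ev.get_profiling_info<sycl::info::event_profiling::command_end>()
// on an event returned by a queue created with
//   sycl::property::queue::enable_profiling.
// (Hypothetical helper, not part of any SYCL API.)
inline std::uint64_t elapsed_ns(std::uint64_t start_ns, std::uint64_t end_ns) {
    if (end_ns < start_ns) {
        throw std::invalid_argument("command_end precedes command_start");
    }
    return end_ns - start_ns;
}

// Assumed sanity check (hypothetical): returns true when the kernel that
// includes the reduction is reported as faster than the run without the
// reduction, which is treated here as a hint that the timing may be wrong.
inline bool reduction_faster_than_baseline(std::uint64_t with_reduction_ns,
                                           std::uint64_t without_reduction_ns) {
    return with_reduction_ns < without_reduction_ns;
}
```

With the numbers reported above, `reduction_faster_than_baseline(64513, 175105)` is true for T = float, while `reduction_faster_than_baseline(2096161, 175102)` is false for T = int.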
