Support std::memcpy or improve detail::memcpy

**Is your feature request related to a problem? Please describe**
As noted in https://github.com/intel/llvm/pull/3815, `sycl::detail::memcpy` performs differently from `std::memcpy`. In my tests, performance is up to 2x better with `std::memcpy`.

**Describe the solution you would like**
I think there are two options:
1. Support `std::memcpy` in device code.
This appears to work already, but the function isn't explicitly listed [here](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/C-CXX-StandardLibrary/C-CXX-StandardLibrary.rst). This is my preferred solution, because it would let the implementation call `std::memcpy` internally and let users call `std::memcpy` in their kernels.

2. Implement `sycl::detail::memcpy` the same way as `std::memcpy` where possible.
The implementation of `sycl::detail::memcpy` [here](https://github.com/intel/llvm/blob/sycl/sycl/include/CL/sycl/detail/helpers.hpp#L37) is just a simple loop, and the compiler doesn't seem to optimize this as aggressively as it does `std::memcpy`.  Making `sycl::detail::memcpy` faster wouldn't help user code, but would improve performance for those parts of the implementation currently relying on it.
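
To make the difference between the two options concrete, here is a minimal host-only sketch. `loop_memcpy` is a hypothetical stand-in for a byte-by-byte loop of the kind described above, not the actual source of `sycl::detail::memcpy`, and `lib_memcpy` is an illustrative wrapper that just forwards to `std::memcpy`. Both names are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a simple loop-based copy.
// This is NOT the actual sycl::detail::memcpy source, only the same general shape.
inline void loop_memcpy(void *dst, const void *src, std::size_t n) {
  auto *d = static_cast<unsigned char *>(dst);
  const auto *s = static_cast<const unsigned char *>(src);
  // Copy one byte per iteration; the compiler may or may not
  // recognize this pattern and turn it into a wider copy.
  for (std::size_t i = 0; i < n; ++i)
    d[i] = s[i];
}

// Illustrative wrapper: forward to the standard library, which compilers
// typically treat as a known function they can expand inline.
inline void lib_memcpy(void *dst, const void *src, std::size_t n) {
  std::memcpy(dst, src, n);
}
```

Both functions produce the same bytes in `dst`; the request here is only about how well each form is optimized in device code, which this host-side sketch does not measure.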

**Describe alternatives you have considered**
Calling `__builtin_memcpy` might also work, but adding a third variant of `memcpy` to the mix seems more confusing.
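
For reference, a rough sketch of what this alternative could look like. The helper name `copy_bytes` is hypothetical, and the `__has_builtin` guard is an assumption about how one might keep the code portable to compilers that lack the builtin. Whether this behaves well in device code is untested here.

```cpp
#include <cstddef>
#include <cstring>

// Some compilers do not provide __has_builtin; treat every builtin as absent there.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical helper: use the compiler builtin when it is available,
// otherwise fall back to std::memcpy.
inline void copy_bytes(void *dst, const void *src, std::size_t n) {
#if __has_builtin(__builtin_memcpy)
  __builtin_memcpy(dst, src, n);
#else
  std::memcpy(dst, src, n);
#endif
}
```

This illustrates the concern above: the codebase would then carry `std::memcpy`, `sycl::detail::memcpy`, and a builtin-based helper side by side.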

**Additional context**
I think there are [other headers](https://github.com/intel/llvm/blob/sycl/sycl/include/CL/sycl/ONEAPI/matrix/matrix-amx.hpp) that currently assume `std::memcpy` works in device kernels, and I wouldn't be surprised if there were also user code relying on this behavior.


Support std::memcpy or improve detail::memcpy #3816

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Support std::memcpy or improve detail::memcpy #3816

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions