Skip to content

[SYCL] Enable PI unit testing on multiple plugins. #1694

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
May 20, 2020
Merged
17 changes: 12 additions & 5 deletions sycl/include/CL/sycl/detail/pi.h
Original file line number Diff line number Diff line change
Expand Up @@ -132,9 +132,16 @@ typedef enum {
// make the translation to OpenCL transparent.
//
typedef enum : pi_uint64 {
PI_DEVICE_TYPE_CPU = CL_DEVICE_TYPE_CPU,
PI_DEVICE_TYPE_GPU = CL_DEVICE_TYPE_GPU,
PI_DEVICE_TYPE_ACC = CL_DEVICE_TYPE_ACCELERATOR
PI_DEVICE_TYPE_DEFAULT =
CL_DEVICE_TYPE_DEFAULT, ///< The default device available in the PI
///< plugin.
PI_DEVICE_TYPE_ALL =
CL_DEVICE_TYPE_ALL, ///< All devices available in the PI plugin.
PI_DEVICE_TYPE_CPU =
CL_DEVICE_TYPE_CPU, ///< A PI device that is the host processor.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's a 100% accurate. I assume it's possible to have CPU device != host processor.

Copy link
Contributor Author

@nyalloc nyalloc May 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the wording OpenCL uses for their equivalent enumerator. I guess what you're thinking of would be more along the lines of an accelerator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting... I always thought that OpenCL spec doesn't force the implementation to use the host processor on multi-processor systems. I think it's possible to configure Intel OpenCL CPU implementation to run on a "non-host" processor.

This might be just an typo in the spec...
@mkinsner, @bashbaug, do you have any thoughts regarding this wording?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my view, PI API is very close to OpenCL by design so it makes sense to me to keep the language here similar to the original wording. If it turns out to be a typo in the OpenCL spec, I'll happily update it with a follow-up PR. It is important that we re-enable PI API testing ASAP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very minor comment and it shouldn't block merging this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This same text is in the latest OpenCL spec also. It hasn't changed since v1.0:

https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_API.html#CL_DEVICE_TYPE_CPU

I'll get this tidied up in the spec. I agree it shouldn't hold up this PR.

PI_DEVICE_TYPE_GPU = CL_DEVICE_TYPE_GPU, ///< A PI device that is a GPU.
PI_DEVICE_TYPE_ACC = CL_DEVICE_TYPE_ACCELERATOR ///< A PI device that is a
///< dedicated accelerator.
} _pi_device_type;

typedef enum {
Expand Down Expand Up @@ -1422,9 +1429,9 @@ struct _pi_plugin {
// Some choices are:
// - Use of integers to keep major and minor version.
// - Keeping char* Versions.
const char PiVersion[4] = _PI_H_VERSION_STRING;
Copy link
Contributor Author

@nyalloc nyalloc May 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This const member was causing issues, as the compiler was no longer able to generate a default copy assignment operator, making it difficult to pass a vector of plugins to the algorithms I use in this PR. I could write one by hand, but copy constructors and other C++ language features have no place in this C API header.

Additionally, these in-class member initialisers are a C++ language feature. If I'm not mistaken, it generates a default constructor where these two members are initalized. Because of this, I moved the initialisation of these members to pi::initialize function in pi.cpp. _pi_plugin is now a POD because of these changes. There are a few C++ features in pi.h that would fail to compile with C, but these are the only things relevant to this PR.

char PiVersion[4];
// Plugin edits this.
char PluginVersion[4] = _PI_H_VERSION_STRING;
char PluginVersion[4];
char *Targets;
struct FunctionPointers {
#define _PI_API(api) decltype(::api) *api;
Expand Down
9 changes: 6 additions & 3 deletions sycl/plugins/cuda/pi_cuda.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -716,15 +716,18 @@ pi_result cuda_piDevicesGet(pi_platform platform, pi_device_type device_type,
pi_uint32 *num_devices) {

pi_result err = PI_SUCCESS;
const bool askingForGPU = (device_type & PI_DEVICE_TYPE_GPU);
size_t numDevices = askingForGPU ? platform->devices_.size() : 0;
const bool askingForDefault = device_type == PI_DEVICE_TYPE_DEFAULT;
const bool askingForGPU = device_type & PI_DEVICE_TYPE_GPU;
const bool returnDevices = askingForDefault || askingForGPU;

size_t numDevices = returnDevices ? platform->devices_.size() : 0;

try {
if (num_devices) {
*num_devices = numDevices;
}

if (askingForGPU && devices) {
if (returnDevices && devices) {
for (size_t i = 0; i < std::min(size_t(num_entries), numDevices); ++i) {
devices[i] = platform->devices_[i].get();
}
Expand Down
4 changes: 3 additions & 1 deletion sycl/source/detail/pi.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -205,7 +205,9 @@ vector_class<plugin> initialize() {
std::cerr << "SYCL_PI_TRACE[all]: "
<< "No Plugins Found." << std::endl;

PiPlugin PluginInformation;
PiPlugin PluginInformation{_PI_H_VERSION_STRING, _PI_H_VERSION_STRING,
nullptr};

for (unsigned int I = 0; I < PluginNames.size(); I++) {
void *Library = loadPlugin(PluginNames[I].first);

Expand Down
7 changes: 6 additions & 1 deletion sycl/source/detail/plugin.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,11 @@ class plugin {
plugin(RT::PiPlugin Plugin, backend UseBackend)
: MPlugin(Plugin), MBackend(UseBackend) {}

plugin &operator=(const plugin &) = default;
plugin(const plugin &) = default;
plugin &operator=(plugin &&other) noexcept = default;
plugin(plugin &&other) noexcept = default;
Comment on lines +30 to +33
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting... What made you explicitly default these things?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler was deleting a few of these implicitly because pi_plugin's constant members. I was trying to narrow in on the source of the error, as it prevented me from passing plugins to standard algorithms that required the type to be copy assignable. Because this type encapsulates pi_plugin object, the compiler decided to delete its copy assignment and copy constructor. Now that pi_plugin has been fixed, you could probably remove these explicit defaults to no ill effect, but I decided to leave them in. I prefer all special member functions to be explicitly defined if any of them are present, as I can reason about the class' behavior without having to guess at what the compiler will do.


~plugin() = default;

const RT::PiPlugin &getPiPlugin() const { return MPlugin; }
Expand Down Expand Up @@ -79,7 +84,7 @@ class plugin {

private:
RT::PiPlugin MPlugin;
const backend MBackend;
backend MBackend;
}; // class plugin
} // namespace detail
} // namespace sycl
Expand Down
23 changes: 23 additions & 0 deletions sycl/unittests/pi/BackendString.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The header seems to be a little different from what the rest of the project uses.

// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

#pragma once

#include <detail/plugin.hpp>

namespace pi {
inline const char *GetBackendString(cl::sycl::backend backend) {
switch (backend) {
#define PI_BACKEND_STR(backend_name) \
case cl::sycl::backend::backend_name: \
return #backend_name
PI_BACKEND_STR(cuda);
PI_BACKEND_STR(host);
PI_BACKEND_STR(opencl);
#undef PI_BACKEND_STR
default:
return "Unknown Plugin";
}
}
} // namespace pi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is screaming for a blank line.

2 changes: 1 addition & 1 deletion sycl/unittests/pi/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@ set(CMAKE_CXX_EXTENSIONS OFF)

# Enable exception handling for these unit tests
set(LLVM_REQUIRES_EH 1)
add_sycl_unittest(PiTests OBJECT
add_sycl_unittest(PiTests OBJECT
EnqueueMemTest.cpp
PiMock.cpp
PlatformTest.cpp
Expand Down
118 changes: 59 additions & 59 deletions sycl/unittests/pi/EnqueueMemTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -6,17 +6,16 @@
//
//===----------------------------------------------------------------------===//

#include "TestGetPlugin.hpp"
#include <CL/sycl/detail/pi.hpp>
#include <detail/plugin.hpp>
#include <gtest/gtest.h>

using namespace cl::sycl;

namespace {
class DISABLED_EnqueueMemTest : public ::testing::Test {
class EnqueueMemTest : public testing::TestWithParam<detail::plugin> {
protected:
std::vector<detail::plugin> Plugins;

constexpr static size_t _numElementsX = 8;
constexpr static size_t _numElementsY = 4;

Expand All @@ -25,124 +24,125 @@ class DISABLED_EnqueueMemTest : public ::testing::Test {
pi_queue _queue = nullptr;
pi_mem _mem = nullptr;

DISABLED_EnqueueMemTest() = default;
EnqueueMemTest() = default;

~DISABLED_EnqueueMemTest() = default;
~EnqueueMemTest() = default;

void SetUp() override {
Plugins = detail::pi::initialize();
ASSERT_FALSE(Plugins.empty());

detail::plugin plugin = GetParam();

pi_platform platform = nullptr;
ASSERT_EQ((Plugins[0].call_nocheck<detail::PiApiKind::piPlatformsGet>(
ASSERT_EQ((plugin.call_nocheck<detail::PiApiKind::piPlatformsGet>(
1, &platform, nullptr)),
PI_SUCCESS);

ASSERT_EQ((Plugins[0].call_nocheck<detail::PiApiKind::piDevicesGet>(
platform, PI_DEVICE_TYPE_GPU, 1, &_device, nullptr)),
ASSERT_EQ((plugin.call_nocheck<detail::PiApiKind::piDevicesGet>(
platform, PI_DEVICE_TYPE_DEFAULT, 1, &_device, nullptr)),
PI_SUCCESS);

pi_result result = PI_INVALID_VALUE;
result = Plugins[0].call_nocheck<detail::PiApiKind::piContextCreate>(
result = plugin.call_nocheck<detail::PiApiKind::piContextCreate>(
nullptr, 1u, &_device, nullptr, nullptr, &_context);
ASSERT_EQ(result, PI_SUCCESS);

ASSERT_EQ((Plugins[0].call_nocheck<detail::PiApiKind::piQueueCreate>(
ASSERT_EQ((plugin.call_nocheck<detail::PiApiKind::piQueueCreate>(
_context, _device, 0, &_queue)),
PI_SUCCESS);

ASSERT_EQ((Plugins[0].call_nocheck<detail::PiApiKind::piMemBufferCreate>(
ASSERT_EQ((plugin.call_nocheck<detail::PiApiKind::piMemBufferCreate>(
_context, 0, _numElementsX * _numElementsY * sizeof(pi_int32),
nullptr, &_mem)),
PI_SUCCESS);
}

void TearDown() override {
ASSERT_EQ((Plugins[0].call_nocheck<detail::PiApiKind::piMemRelease>(_mem)),

detail::plugin plugin = GetParam();

ASSERT_EQ((plugin.call_nocheck<detail::PiApiKind::piMemRelease>(_mem)),
PI_SUCCESS);
ASSERT_EQ((plugin.call_nocheck<detail::PiApiKind::piQueueRelease>(_queue)),
PI_SUCCESS);
ASSERT_EQ(
(Plugins[0].call_nocheck<detail::PiApiKind::piQueueRelease>(_queue)),
(plugin.call_nocheck<detail::PiApiKind::piContextRelease>(_context)),
PI_SUCCESS);
ASSERT_EQ((Plugins[0].call_nocheck<detail::PiApiKind::piContextRelease>(
_context)),
PI_SUCCESS);
}

template <typename T> void TestBufferFill(const T &pattern) {

detail::plugin plugin = GetParam();

T inValues[_numElementsX] = {};

for (size_t i = 0; i < _numElementsX; ++i) {
ASSERT_NE(pattern, inValues[i]);
}

ASSERT_EQ(
(Plugins[0].call_nocheck<detail::PiApiKind::piEnqueueMemBufferWrite>(
_queue, _mem, PI_TRUE, 0, _numElementsX * sizeof(T), inValues, 0,
nullptr, nullptr)),
PI_SUCCESS);
ASSERT_EQ((plugin.call_nocheck<detail::PiApiKind::piEnqueueMemBufferWrite>(
_queue, _mem, PI_TRUE, 0, _numElementsX * sizeof(T), inValues,
0, nullptr, nullptr)),
PI_SUCCESS);

ASSERT_EQ(
(Plugins[0].call_nocheck<detail::PiApiKind::piEnqueueMemBufferFill>(
_queue, _mem, &pattern, sizeof(T), 0, sizeof(inValues), 0, nullptr,
nullptr)),
PI_SUCCESS);
ASSERT_EQ((plugin.call_nocheck<detail::PiApiKind::piEnqueueMemBufferFill>(
_queue, _mem, &pattern, sizeof(T), 0, sizeof(inValues), 0,
nullptr, nullptr)),
PI_SUCCESS);

T outValues[_numElementsX] = {};
ASSERT_EQ(
(Plugins[0].call_nocheck<detail::PiApiKind::piEnqueueMemBufferRead>(
_queue, _mem, PI_TRUE, 0, _numElementsX * sizeof(T), outValues, 0,
nullptr, nullptr)),
PI_SUCCESS);
ASSERT_EQ((plugin.call_nocheck<detail::PiApiKind::piEnqueueMemBufferRead>(
_queue, _mem, PI_TRUE, 0, _numElementsX * sizeof(T),
outValues, 0, nullptr, nullptr)),
PI_SUCCESS);

for (size_t i = 0; i < _numElementsX; ++i) {
ASSERT_EQ(pattern, outValues[i]);
}
}
};

template<typename T>
struct vec4 {
static std::vector<detail::plugin> Plugins = pi::initializeAndRemoveInvalid();

INSTANTIATE_TEST_CASE_P(
EnqueueMemTestImpl, EnqueueMemTest, testing::ValuesIn(Plugins),
[](const testing::TestParamInfo<EnqueueMemTest::ParamType> &info) {
return pi::GetBackendString(info.param.getBackend());
});

template <typename T> struct vec4 {
T x, y, z, w;

bool operator==(const vec4 &rhs) const {
return x == rhs.x && y == rhs.y && z == rhs.z && w == rhs.w;
}

bool operator!=(const vec4 &rhs) const {
return !(*this == rhs);
}
bool operator!=(const vec4 &rhs) const { return !(*this == rhs); }
};

template<typename T>
struct vec2 {
template <typename T> struct vec2 {
T x, y;

bool operator==(const vec2 &rhs) const {
return x == rhs.x && y == rhs.y;
}
bool operator==(const vec2 &rhs) const { return x == rhs.x && y == rhs.y; }

bool operator!=(const vec2 &rhs) const {
return !(*this == rhs);
}
bool operator!=(const vec2 &rhs) const { return !(*this == rhs); }
};

TEST_F(DISABLED_EnqueueMemTest, piEnqueueMemBufferFill) {
TEST_P(EnqueueMemTest, piEnqueueMemBufferFill) {

TestBufferFill(float{1});
TestBufferFill(vec2<float>{1, 2});
TestBufferFill(vec4<float>{1, 2, 3, 4});
TestBufferFill(float{1});
TestBufferFill(vec2<float>{1, 2});
TestBufferFill(vec4<float>{1, 2, 3, 4});

TestBufferFill(uint8_t{1});
TestBufferFill(vec2<uint8_t>{1, 2});
TestBufferFill(vec4<uint8_t>{1, 2, 3, 4});
TestBufferFill(uint8_t{1});
TestBufferFill(vec2<uint8_t>{1, 2});
TestBufferFill(vec4<uint8_t>{1, 2, 3, 4});

TestBufferFill(uint16_t{1});
TestBufferFill(vec2<uint16_t>{1, 2});
TestBufferFill(vec4<uint16_t>{1, 2, 3, 4});
TestBufferFill(uint16_t{1});
TestBufferFill(vec2<uint16_t>{1, 2});
TestBufferFill(vec4<uint16_t>{1, 2, 3, 4});

TestBufferFill(uint32_t{1});
TestBufferFill(vec2<uint32_t>{1, 2});
TestBufferFill(vec4<uint32_t>{1, 2, 3, 4});
TestBufferFill(uint32_t{1});
TestBufferFill(vec2<uint32_t>{1, 2});
TestBufferFill(vec4<uint32_t>{1, 2, 3, 4});
}
} // namespace
Loading