-
Notifications
You must be signed in to change notification settings - Fork 794
[SYCL] Fix acquiring a mutex in _pi_context::finalize #3001
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I observed crashes in ESIMD tests at the exit of _pi_context::finalize function. Unlocking not owned mutex is Undefined Behavior. Crashes occured only on Debug builds on Windows platforms when using LevelZero. I saw the following error: "unlock of unowned mutex".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was working on this issue today.
And thanks for your confirmation.
@@ -459,7 +459,7 @@ pi_result _pi_context::finalize() { | |||
// There could be some memory that may have not been deallocated. | |||
// For example, zeEventPool could be still alive. | |||
std::lock_guard<std::mutex> NumEventsLiveInEventPoolGuard( | |||
NumEventsLiveInEventPoolMutex, std::adopt_lock); | |||
NumEventsLiveInEventPoolMutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This indeed looks like copy-paste error to me, and the fix looks good.
Still, I wonder how this worked before. Basically, all the tests, including non-ESIMD should have been failing the same way, as it does not seem any thread can come here holding the NumEventsLiveInEventPoolMutex
. Leaving to @smaslov-intel to review/approve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it wasn't noticed earlier because it only appears on debug builds on Windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* sycl: [sycl-post-link][NFC] Extracted the code into a subroutine (intel#3042) [SYCL][NFC] Remove commented out code (intel#3029) [CODEOWNERS] Fix ownership of DPC++ tools tests (intel#3047) [SYCL][NFC] Make tests insensitive to dso_local (intel#3037) [SYCL] Fix acquiring a mutex in _pi_context::finalize (intel#3001) [SYCL] Fix various compilation warnings in plugins (intel#2979)
* sycl: (378 commits) [sycl-post-link][NFC] Extracted the code into a subroutine (intel#3042) [SYCL][NFC] Remove commented out code (intel#3029) [CODEOWNERS] Fix ownership of DPC++ tools tests (intel#3047) [SYCL][NFC] Make tests insensitive to dso_local (intel#3037) [SYCL] Fix acquiring a mutex in _pi_context::finalize (intel#3001) [SYCL] Fix various compilation warnings in plugins (intel#2979) [SYCL][ESIMD] Add simd class conversion ctor and operator (intel#3028) [sycl-post-link][NFC] Use range-based for loop. (intel#3033) [SYCL][NFC] Fix warning in self-build (intel#3023) [NFC] Fix sycl-post-link tests to avoid potential race (intel#3031) [SYCL][CUDA] Add missing barrier to collectives (intel#2990) [SYCL] Make Intel attributes consistent with clang attributes. (intel#3022) [SYCL] Bump SYCL minor version (intel#3026) [SYCL][Doc] Added requirement on reference to test PR in commit message (intel#3010) [SYCL] Put constant initializer list data in non-generic addr space. (intel#3005) [SYCL][L0] Fix memory leak in PiDeviceCache and ZeCommandList (intel#2974) [SYCL] Fix detection of free function calls (intel#3003) [SYCL][NFC] Clean up the builder_dir argument description (intel#3021) [SYCL][ESIMD] Fix LowerESIMD crash on a scalar fptoui LLVM instruction (intel#2699) [NFC] Remove redundant call to getMainExecutable() (intel#3018) ...
I observed crashes in ESIMD tests at the exit of _pi_context::finalize
function. Unlocking not owned mutex is Undefined Behavior.
Crashes occured only on Debug builds on Windows platforms when using
LevelZero. I saw the following error: "unlock of unowned mutex".