Make dppl rt threadlocal #9

diptorupd · 2020-08-10T00:04:41Z

Stores the SYCL device queues in thread-local storage. The DpplRuntime object is no longer exposed; new global prebound functions expose the same API.

diptorupd · 2020-08-10T00:06:13Z

Also contains the changes from PR #7

setup.py

DrTodd13 · 2020-08-11T00:04:13Z

dppl/oneapi_interface.pyx

+
+# Initialize a thread local instance of _DpplRuntime
+_tls = threading_local()
+_tls._in_dppl_ctxt = False


Not sure we ever discussed this scenario. However, I think I agree that in a new thread, you go back to the default of not being in a context.

I am removing the flag. Instead, let me add a public getter that returns True/False depending on whether we are inside any context, determined by checking the size of the stack of contexts.

Made the change in 8318d44

DrTodd13 · 2020-08-11T00:06:52Z

dppl/oneapi_interface.pyx

-            print("Removing the context from the deque of active contexts")
-            runtime._deactivate_current_queue()
+            _tls._runtime._deactivate_current_queue()
+            _tls._in_dppl_ctxt = False


Shouldn't this be a virtual stack of contexts? I would think that on "finally" that you pop the top off the stack and then only set in_dppl_ctxt to false if the stack is empty. Even better, you don't have this in_dppl_ctxt at all and just check the size of the stack if you need to know if you're in a context or not.

The stack of contexts is maintained inside the _runtime.rt object. I am using a std::deque to store the list of currently activated queues. The first entry of the deque is always the global or default queue. Using an example:

    default_q = dppl.get_current_queue()
    with dppl.device_context(dppl.device_type.cpu, 0) as cpu_q:
        ...
        with dppl.device_context(dppl.device_type.gpu, 0) as gpu_0_q:
            ...
        with dppl.device_context(dppl.device_type.gpu, 1) as gpu_1_q:
            ...

Here the stack will be maintained inside dppl._tls._runtime as:

default_q (initial state)
cpu_q --> default_q (on entering first with context)
gpu_0_q --> cpu_q --> default_q (on entering the first nested with context)
cpu_q --> default_q (on exiting the first nested with context)
gpu_1_q --> cpu_q --> default_q (on entering the second nested with context)
cpu_q --> default_q (on exiting the second nested with context)
default_q (on exiting the top-level with context)

The _deactivate_current_queue() call pops the top entry from the deque. dppl.get_current_queue always returns the queue currently at the top of the stack.
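The push and pop sequence described above can be reproduced with a small pure-Python model. This is only a sketch: plain strings stand in for SYCL queues, and `_active_queues` and this `device_context` are hypothetical stand-ins for the deque kept inside the runtime and for `dppl.device_context`.

```python
from collections import deque
from contextlib import contextmanager

# The first entry is always the default queue; the last entry is the
# top of the stack, i.e. the currently active queue.
_active_queues = deque(["default_q"])


def get_current_queue():
    return _active_queues[-1]


@contextmanager
def device_context(queue_name):
    _active_queues.append(queue_name)  # activate on entry
    try:
        yield queue_name
    finally:
        _active_queues.pop()  # deactivate on exit, even after an error


trace = [list(_active_queues)]
with device_context("cpu_q"):
    trace.append(list(_active_queues))
    with device_context("gpu_0_q"):
        trace.append(list(_active_queues))
    trace.append(list(_active_queues))
    with device_context("gpu_1_q"):
        trace.append(list(_active_queues))
    trace.append(list(_active_queues))
trace.append(list(_active_queues))

# Print each state with the top of the stack first, as in the listing above.
for state in trace:
    print(" --> ".join(reversed(state)))
```

Doing the pop in a `finally` clause keeps the stack balanced when the body of a `with` block raises.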

On your second point about in_dppl_ctxt: yes, as currently implemented it returns the wrong result when contexts are nested. Let me fix it to return the size of the deque minus 1 (since the default queue is always inside the deque).

Made the change in 8318d44

-- Adds a new helper function to check the size of the deque to infer if inside a device_context. Works for nested contexts as well.
-- Add unit test cases.
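To illustrate why a size check handles nesting where a boolean flag does not, here is a hedged sketch. The class `QueueStack` and its methods `activate`, `deactivate_current`, and `num_active_contexts` are hypothetical; only the name `is_in_dppl_ctxt` and the "size of the deque minus 1" rule come from this thread.

```python
from collections import deque


class QueueStack:
    """Hypothetical model of the runtime's deque of activated queues."""

    def __init__(self, default_queue):
        # The default queue is always present as the first entry.
        self._queues = deque([default_queue])

    def activate(self, queue):
        self._queues.append(queue)

    def deactivate_current(self):
        if len(self._queues) == 1:
            raise RuntimeError("the default queue cannot be deactivated")
        self._queues.pop()

    def num_active_contexts(self):
        # Size of the deque minus the always-present default queue.
        return len(self._queues) - 1

    def is_in_dppl_ctxt(self):
        return self.num_active_contexts() > 0


stack = QueueStack("default_q")
stack.activate("cpu_q")
stack.activate("gpu_0_q")
stack.deactivate_current()  # leave the inner context only
# A flag cleared on every exit would report False here; the size check
# still reports that the outer context is active.
print(stack.is_in_dppl_ctxt(), stack.num_active_contexts())
```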

fschlimb · 2020-08-11T09:16:27Z

dppl/oneapi_interface.pyx

+_tls = threading_local()
+_tls._runtime = _DpplRuntime()
+
+
+################################################################################
+#--------------------------------- Public API ---------------------------------#
+################################################################################
+
+
+dump              = _tls._runtime.dump
+dump_queue_info   = _tls._runtime.dump_queue_info
+get_current_queue = _tls._runtime.get_current_queue
+get_num_platforms = _tls._runtime.get_num_platforms
+set_default_queue = _tls._runtime.set_default_queue
+is_in_dppl_ctxt   = _tls._runtime.is_in_dppl_ctxt


Why is the entire runtime thread-local? The only thing that is thread-local is the stack of contexts. Making shared data thread-local is asking for trouble.
I think the runtime class should encapsulate the thread-local business and return the thread-local values from its member functions such as get_current_queue. Maybe this needs to be done in C++ (e.g. make your std::deque thread-local).
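One way to read this suggestion, sketched in pure Python rather than C++: a single shared runtime object keeps the shared data as ordinary attributes and holds only the stack of contexts in a `threading.local` member. Prebound module-level functions then remain usable from any thread, because the per-thread lookup happens at call time rather than at import time. Everything here (`Runtime`, `_stack`, string queue names) is an assumption made for illustration and is not the PR's implementation.

```python
import threading
from collections import deque
from contextlib import contextmanager


class Runtime:
    """Hypothetical shared runtime with a per-thread stack of queues."""

    def __init__(self, default_queue="default_q"):
        self._default_queue = default_queue  # shared by all threads
        self._tls = threading.local()        # only the stack is per-thread

    def _stack(self):
        # Lazily create this thread's stack, seeded with the default queue.
        stack = getattr(self._tls, "stack", None)
        if stack is None:
            stack = self._tls.stack = deque([self._default_queue])
        return stack

    def get_current_queue(self):
        return self._stack()[-1]

    def is_in_dppl_ctxt(self):
        return len(self._stack()) > 1

    @contextmanager
    def device_context(self, queue):
        stack = self._stack()
        stack.append(queue)
        try:
            yield queue
        finally:
            stack.pop()


# One runtime for the whole process; prebinding its methods is safe here.
_runtime = Runtime()
get_current_queue = _runtime.get_current_queue
is_in_dppl_ctxt = _runtime.is_in_dppl_ctxt
device_context = _runtime.device_context

seen = {}


def worker():
    seen["worker_start"] = get_current_queue()
    with device_context("gpu_q"):
        seen["worker_inside"] = get_current_queue()


with device_context("cpu_q"):
    t = threading.Thread(target=worker)
    t.start()
    t.join()
    seen["main_inside"] = get_current_queue()

print(seen)
```

The worker starts from the default queue even though the main thread is inside a context, and the worker's nested context does not disturb the main thread's stack.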

diptorupd added 2 commits August 9, 2020 18:56

improve comment

5a5da04

Make the runtime thread local and expose the API using global prebound functions.

ec5c5d4

diptorupd requested review from DrTodd13 and PokhodenkoSA August 10, 2020 00:04

PokhodenkoSA reviewed Aug 10, 2020

setup.py

diptorupd added 3 commits August 10, 2020 17:21

Import setuptools before sython.

22041b7

Import setuptools before cython.

79508a2

Merge branch 'make_dppl_rt_threadlocal' of github.com:diptorupd/pydppl into make_dppl_rt_threadlocal

c726395

PokhodenkoSA approved these changes Aug 10, 2020

DrTodd13 suggested changes Aug 11, 2020

diptorupd added 2 commits August 11, 2020 01:26

Ignore PyDev project files.

fb468df

Removes the flag to report if inside a dppl device_context.

8318d44

-- Adds a new helper function to check the size of the deque to infer if inside a device_context. Works for nested contexts as well.
-- Add unit test cases.

diptorupd requested a review from DrTodd13 August 11, 2020 06:38

fschlimb requested changes Aug 11, 2020

diptorupd added 3 commits August 11, 2020 19:12

Merge branch 'master' of github.com:IntelPython/pydppl

0b69284

Merge branch 'master' of github.com:IntelPython/pydppl

c80e5ab

Merge branch 'master' into make_dppl_rt_threadlocal

364d3f0

diptorupd marked this pull request as draft August 13, 2020 23:07

diptorupd closed this Aug 18, 2020

diptorupd deleted the make_dppl_rt_threadlocal branch August 18, 2020 23:34
