|
| 1 | +.. _queues: |
| 2 | + |
| 3 | +##### |
| 4 | +Queue |
| 5 | +##### |
| 6 | + |
| 7 | +A queue is needed to schedule execution of any computation or data copying on a |
| 8 | +device. Queue construction requires specifying a device and a context targeting |
| 9 | +that device as well as additional properties, such as whether profiling |
| 10 | +information should be collected or whether submitted tasks are executed in the |
| 11 | +order in which they were submitted. |
| 12 | + |
| 13 | +The :class:`dpctl.SyclQueue` class represents a queue and abstracts the |
| 14 | +:sycl_queue:`sycl::queue <>` SYCL runtime class. |
| 15 | + |
| 16 | +Types of Queues |
| 17 | +--------------- |
| 18 | + |
| 19 | +SYCL has a task-based execution model. The order in which a SYCL runtime |
| 20 | +executes a task on a target device is specified by a sequence of events which |
| 21 | +must be complete before execution of the task is allowed. Submission of a task |
| 22 | +returns an event which can be used to further grow the graph of computational |
| 23 | +tasks. A SYCL queue stores the needed data to manage the scheduling operations. |
| 24 | + |
| 25 | +Unless specified otherwise during construction of a queue, a SYCL runtime |
| 26 | +executes tasks whose dependencies are met in an unspecified order, with the |
| 27 | +possibility that some of the tasks are executed concurrently. Such queues are |
| 28 | +said to be 'out-of-order'. |
| 29 | + |
| 30 | +A SYCL queue can be created so that the runtime must execute tasks in the |
| 31 | +order in which they were submitted. Tasks submitted to such a queue, called |
| 32 | +an 'in-order' queue, are never executed concurrently. |
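
The difference between the two queue types can be sketched with a toy model in
plain Python. This is an illustration only, not dpctl or SYCL API: the helper
names ``ready_tasks`` and ``add_in_order_edges`` and the task names are
hypothetical, and the model assumes that an in-order queue behaves as if every
task also depended on the task submitted just before it.

```python
def ready_tasks(deps, done):
    """Return tasks that have not run yet and whose dependencies are all complete."""
    return [t for t in deps if t not in done and all(d in done for d in deps[t])]


def add_in_order_edges(deps):
    """Toy model of an in-order queue: chain each task to the one submitted before it."""
    chained = {}
    previous = None
    for task, task_deps in deps.items():  # dicts preserve submission order
        extra = [previous] if previous is not None else []
        chained[task] = list(task_deps) + extra
        previous = task
    return chained


# Hypothetical submission order: copy_a, copy_b, add.
# Only `add` declares explicit dependencies (the events of both copies).
submitted = {"copy_a": [], "copy_b": [], "add": ["copy_a", "copy_b"]}

# Out-of-order model: both copies are ready at once and may run concurrently.
out_of_order_ready = ready_tasks(submitted, done=set())

# In-order model: only the first submitted task is ready.
in_order_ready = ready_tasks(add_in_order_edges(submitted), done=set())

print(out_of_order_ready)
print(in_order_ready)
```

In this model the out-of-order queue exposes two runnable tasks at the start,
while the in-order queue exposes only one at any time, which is why its tasks
never overlap.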
| 33 | + |
| 34 | +Creating a New Queue |
| 35 | +-------------------- |
| 36 | + |
| 37 | +:class:`dpctl.SyclQueue(ctx, dev, property=None)` creates a new queue instance |
| 38 | +for the given compatible context and device. Keyword parameter `property` can be |
| 39 | +set to `"in_order"` to create an 'in-order' queue and to `"enable_profiling"` to |
| 40 | +dynamically collect task execution statistics in the returned event once the |
| 41 | +associated task completes. |
| 42 | + |
| 43 | +.. _fig-constructing-queue-context-device-property: |
| 44 | + |
| 45 | +.. literalinclude:: ../../../../../examples/python/sycl_queue.py |
| 46 | +    :language: python |
| 47 | +    :lines: 17-19, 72-89 |
| 48 | +    :caption: Constructing SyclQueue from context and device |
| 49 | +    :linenos: |
| 50 | + |
| 51 | +A possible output for the example |
| 52 | +:ref:`fig-constructing-queue-context-device-property` may be: |
| 53 | + |
| 54 | +.. program-output:: python ../examples/python/sycl_queue.py -r create_queue_from_subdevice_multidevice_context |
| 55 | + |
| 56 | +When a context is not specified, the :sycl_queue:`sycl::queue <>` constructor |
| 57 | +from a device instance is called. Instead of an instance of |
| 58 | +:class:`dpctl.SyclDevice` the argument `dev` can be a valid filter selector |
| 59 | +string. In this case, the :sycl_queue:`sycl::queue <>` constructor with the |
| 60 | +corresponding :oneapi_filter_selection:`sycl::ext::oneapi::filter_selector <>` |
| 61 | +is called. |
| 62 | + |
| 63 | +.. _fig-constructing-queue-filter-selector: |
| 64 | + |
| 65 | +.. literalinclude:: ../../../../../examples/python/sycl_queue.py |
| 66 | +    :language: python |
| 67 | +    :lines: 17-19, 27-37 |
| 68 | +    :caption: Constructing SyclQueue from filter selector |
| 69 | +    :linenos: |
| 70 | + |
| 71 | +A possible output for the example :ref:`fig-constructing-queue-filter-selector` |
| 72 | +may be: |
| 73 | + |
| 74 | +.. program-output:: python ../examples/python/sycl_queue.py -r create_queue_from_filter_selector |
| 75 | + |
| 76 | + |
| 77 | +Profiling a Task Submitted to a Queue |
| 78 | +------------------------------------- |
| 79 | + |
| 80 | +The result of scheduling execution of a task on a queue is an event. An event |
| 81 | +has several uses: it can be queried for the status of the task execution, it can |
| 82 | +be used to order execution of future tasks after it is complete, it can be |
| 83 | +used to wait for execution to complete, and it can carry profiling information |
| 84 | +for the task execution. The profiling information is only populated if the queue |
| 85 | +used was created with the ``"enable_profiling"`` property and only becomes available |
| 86 | +after the task execution is complete. |
| 87 | + |
| 88 | +The class :class:`dpctl.SyclTimer` implements a Python context manager that can |
| 89 | +be used to collect cumulative profiling information for all the tasks submitted |
| 90 | +to the queue of interest by functions executed within the context: |
| 91 | + |
.. code-block:: python
    :caption: Example of timing execution

    import dpctl
    import dpctl.tensor as dpt

    # Profiling data is only recorded for queues created with this property
    q = dpctl.SyclQueue(property="enable_profiling")
    timer_ctx = dpctl.SyclTimer()
    with timer_ctx(q):
        X = dpt.arange(10**6, dtype=float, sycl_queue=q)

    # Elapsed host time and elapsed device time for the timed context
    host_dt, device_dt = timer_ctx.dt

| 103 | +The timer leverages the :oneapi_enqueue_barrier:`oneAPI enqueue_barrier SYCL |
| 104 | +extension <>`. It submits a barrier at context entry and another at context |
| 105 | +exit, and records the associated events. The elapsed device time is computed as |
| 106 | +``e_exit.profiling_info_start - e_enter.profiling_info_end``. |
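
The arithmetic behind the elapsed device time can be sketched with stand-in
objects. This is a minimal illustration, not the timer's implementation:
``FakeEvent`` and ``elapsed_device_ns`` are hypothetical names, the timestamp
values are made up, and the sketch assumes timestamps are reported in
nanoseconds, as SYCL profiling information is.

```python
from collections import namedtuple

# Hypothetical stand-in for an event; it mirrors only the two attributes
# named in the formula for the elapsed device time.
FakeEvent = namedtuple("FakeEvent", ["profiling_info_start", "profiling_info_end"])


def elapsed_device_ns(e_enter, e_exit):
    """Time between the end of the entry barrier and the start of the exit barrier."""
    return e_exit.profiling_info_start - e_enter.profiling_info_end


# Made-up timestamps, assumed to be in nanoseconds.
e_enter = FakeEvent(profiling_info_start=1_000, profiling_info_end=1_200)
e_exit = FakeEvent(profiling_info_start=5_200, profiling_info_end=5_300)

device_ns = elapsed_device_ns(e_enter, e_exit)  # 5200 - 1200
device_dt = device_ns * 1e-9  # convert nanoseconds to seconds

print(device_ns, device_dt)
```

Measuring from the end of the entry barrier to the start of the exit barrier
keeps the barriers' own execution time out of the reported interval.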