From beb986e015423a9c5b9a4feb03fb069d9187ba0e Mon Sep 17 00:00:00 2001
From: Diptorup Deb
Date: Fri, 7 Jan 2022 14:29:57 -0600
Subject: [PATCH 1/7] Add a stub for queues to the user manual.

---
 docs/docfiles/user_guides/UserManual.rst      |  3 +++
 .../user_guides/manual/dpctl/intro.rst        |  5 ++---
 .../user_guides/manual/dpctl/platforms.rst    |  2 +-
 .../user_guides/manual/dpctl/queues.rst       | 19 +++++++++++++++++++
 4 files changed, 25 insertions(+), 4 deletions(-)
 create mode 100644 docs/docfiles/user_guides/manual/dpctl/queues.rst

diff --git a/docs/docfiles/user_guides/UserManual.rst b/docs/docfiles/user_guides/UserManual.rst
index 9b955f1b0c..1fd2069fb4 100644
--- a/docs/docfiles/user_guides/UserManual.rst
+++ b/docs/docfiles/user_guides/UserManual.rst
@@ -4,6 +4,9 @@
 User Manual
 ###########

+Table of contents
++++++++++++++++++
+
 .. toctree::
     :maxdepth: 3

diff --git a/docs/docfiles/user_guides/manual/dpctl/intro.rst b/docs/docfiles/user_guides/manual/dpctl/intro.rst
index 327178919e..341a75f8e8 100644
--- a/docs/docfiles/user_guides/manual/dpctl/intro.rst
+++ b/docs/docfiles/user_guides/manual/dpctl/intro.rst
@@ -26,13 +26,12 @@ The user guide introduces the core features of dpctl and the underlying
 concepts. The guide is meant primarily for users of the Python package. Library
 and native extension developers should refer to the programmer's guide.

-Table of contents
-+++++++++++++++++
-
 .. toctree::
     :maxdepth: 2
+    :caption: Table of Contents

     basic_concepts
     device_selection
     platforms
     devices
+    queues
diff --git a/docs/docfiles/user_guides/manual/dpctl/platforms.rst b/docs/docfiles/user_guides/manual/dpctl/platforms.rst
index bf9c0ed981..c774847658 100644
--- a/docs/docfiles/user_guides/manual/dpctl/platforms.rst
+++ b/docs/docfiles/user_guides/manual/dpctl/platforms.rst
@@ -1,4 +1,4 @@
-.. _querying_platforms:
+.. _platforms:

 ########
 Platform
diff --git a/docs/docfiles/user_guides/manual/dpctl/queues.rst b/docs/docfiles/user_guides/manual/dpctl/queues.rst
new file mode 100644
index 0000000000..ef2a32c4ff
--- /dev/null
+++ b/docs/docfiles/user_guides/manual/dpctl/queues.rst
@@ -0,0 +1,19 @@
+.. _queues:
+
+#####
+Queue
+#####
+
+A queue is used to specify a device and a specific set of features of that
+device on which a task is scheduled. The :class:`dpctl.SyclQueue` class
+represents a queue and abstracts the :sycl_queue:`sycl::queue <>` SYCL runtime
+class.
+
+Types of Queues
+---------------
+
+Creating a New Queue
+--------------------
+
+Profiling a Task Submitted to a Queue
+-------------------------------------

From fecc96dd139f0641c85180ea33fd36e3989f3a0b Mon Sep 17 00:00:00 2001
From: Oleksandr Pavlyk
Date: Wed, 20 Apr 2022 19:19:22 -0500
Subject: [PATCH 2/7] Filled out some porition of the queue.rst

---
 .../user_guides/manual/dpctl/queues.rst       | 66 +++++++++++++++++--
 1 file changed, 62 insertions(+), 4 deletions(-)

diff --git a/docs/docfiles/user_guides/manual/dpctl/queues.rst b/docs/docfiles/user_guides/manual/dpctl/queues.rst
index ef2a32c4ff..0930d2ed20 100644
--- a/docs/docfiles/user_guides/manual/dpctl/queues.rst
+++ b/docs/docfiles/user_guides/manual/dpctl/queues.rst
@@ -4,16 +4,74 @@
 Queue
 #####

-A queue is used to specify a device and a specific set of features of that
-device on which a task is scheduled. The :class:`dpctl.SyclQueue` class
-represents a queue and abstracts the :sycl_queue:`sycl::queue <>` SYCL runtime
-class.
+A queue is needed to schedule execution of any computation, or data
+copying on the device. Queue construction requires specifying a device
+and a context targeting that device as well as additional properties,
+such as whether profiling information should be collected or whether
+submitted tasks are executed in the order in which they were submitted.
+
+The :class:`dpctl.SyclQueue` class represents a queue and abstracts
+the :sycl_queue:`sycl::queue <>` SYCL runtime class.

 Types of Queues
 ---------------

+In SYCL tasks are submitted for execution by the SYCL runtime. The order
+in which runtime executes them on the target device is specified by
+a sequence of events which must be complete before execution is allowed.
+Submission of a task returns an event which can be used to further grow
+the graph of computational tasks. SYCL queue stores data needed to manage
+this scheduling operations.
+
+Unless specified otherwise during constriction of a queue, SYCL runtime
+executes tasks whose dependencies were met in an unspecified order,
+with possibility for some of the tasks be execute concurrently. Such
+queues are said to be 'out-of-order'.
+
+SYCL queues can be specified to indicate that runtime must execute tasks
+in the linear order in which they were submitted. In this case tasks submitted
+to such a queue, called 'in-order' queues, are never executed concurrently.
+
 Creating a New Queue
 --------------------

+:class:`dpctl.SyclQueue(ctx, dev, property=None)` creates a new instance for
+the given compatible context and device. Keyword parameter `property`
+can be set to `"in_order"` to create an 'in-order' queue, to `"enable_profiling"`
+to dynamically collect task execution statistics in the returned event once
+the associated task completes.
+
+.. _fig-constructing-queue-context-device-property:
+
+.. literalinclude:: ../../../../../examples/python/sycl_queue.py
+    :language: python
+    :lines: 17-19, 67-79
+    :caption: Constructing SyclQueue from context and device
+    :linenos:
+
+A possible output for the example :ref:`fig-constructing-queue-context-device-property:` may be:
+
+.. program-output:: python ../examples/python/sycl_queue.py -r create_queue_from_subdevice_multidevice_context
+
+When context is not specified the :sycl_queue:`sycl::queue <>` constructor
+from a device instance is called. Instead of an instance of
+:class:`dpctl.SyclDevice` the argument `dev` can be a valid filter
+selector string. In this case the :sycl_queue:`sycl::queue <>` constructor
+with the corresponding :filter_selector:`sycl::ext::oneapi::filter_selector`
+is called.
+
+.. _fig-constructing-queue-filter-selector:
+
+.. literalinclude:: ../../../../../examples/python/sycl_queue.py
+    :language: python
+    :lines: 17-19, 27-37
+    :caption: Constructing SyclQueue from filter selector
+    :linenos:
+
+A possible output for the example :ref:`fig-constructing-queue-filter-selector:` may be:
+
+.. program-output:: python ../examples/python/sycl_queue.py -r create_queue_from_filter_selector
+
+
 Profiling a Task Submitted to a Queue
 -------------------------------------

From b6de48e0804cbb854aec7f7ee69bc0a23ae3ac8e Mon Sep 17 00:00:00 2001
From: Oleksandr Pavlyk
Date: Thu, 21 Apr 2022 08:03:17 -0500
Subject: [PATCH 3/7] fixed reference to filter selector

---
 docs/docfiles/intro.rst                           | 2 +-
 docs/docfiles/urls.json                           | 3 ++-
 docs/docfiles/user_guides/manual/dpctl/queues.rst | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/docfiles/intro.rst b/docs/docfiles/intro.rst
index 892e66af72..ae4494c5be 100644
--- a/docs/docfiles/intro.rst
+++ b/docs/docfiles/intro.rst
@@ -12,5 +12,5 @@ deallocators. Dpctl's Python API provides classes that implement
 using SYCL USM memory; making it possible to create Python objects that are
 backed by SYCL USM memory.

-Dpctl also supports the DPCPP ``ONEAPI::filter_selector`` extension and has
+Dpctl also supports the DPCPP ``oneapi::filter_selector`` extension and has
 experimental support for SYCL's ``kernel`` and ``program`` classes.
diff --git a/docs/docfiles/urls.json b/docs/docfiles/urls.json
index 3e0906fc41..9c4c0a14d4 100644
--- a/docs/docfiles/urls.json
+++ b/docs/docfiles/urls.json
@@ -2,7 +2,8 @@
     "dpcpp_envar": "https://github.com/intel/llvm/blob/sycl/sycl/doc/EnvironmentVariables.md",
     "numa_domain": "https://en.wikipedia.org/wiki/Non-uniform_memory_access",
     "oneapi": "https://www.oneapi.io/",
-    "oneapi_filter_selection": "https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/FilterSelector/FilterSelector.adoc",
+    "oneapi_filter_selection": "https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/supported/sycl_ext_oneapi_filter_selector.asciidoc",
+    "oneapi_default_context": "https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/supported/sycl_ext_oneapi_default_context.asciidoc",
     "sycl_aspects": "https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#table.device.aspect",
     "sycl_context": "https://sycl.readthedocs.io/en/latest/iface/context.html",
     "sycl_device": "https://sycl.readthedocs.io/en/latest/iface/device.html",
diff --git a/docs/docfiles/user_guides/manual/dpctl/queues.rst b/docs/docfiles/user_guides/manual/dpctl/queues.rst
index 0930d2ed20..fcb23a1afb 100644
--- a/docs/docfiles/user_guides/manual/dpctl/queues.rst
+++ b/docs/docfiles/user_guides/manual/dpctl/queues.rst
@@ -57,7 +57,7 @@ When context is not specified the :sycl_queue:`sycl::queue <>` constructor
 from a device instance is called. Instead of an instance of
 :class:`dpctl.SyclDevice` the argument `dev` can be a valid filter
 selector string. In this case the :sycl_queue:`sycl::queue <>` constructor
-with the corresponding :filter_selector:`sycl::ext::oneapi::filter_selector`
+with the corresponding :oneapi_filter_selection:`sycl::ext::oneapi::filter_selector <>`
 is called.

 .. _fig-constructing-queue-filter-selector:

From ce342fd073164a2913155149c36dc9b99c7a2cf9 Mon Sep 17 00:00:00 2001
From: Oleksandr Pavlyk
Date: Thu, 21 Apr 2022 08:51:40 -0500
Subject: [PATCH 4/7] fixed small blemishes

---
 docs/docfiles/intro.rst                           | 2 +-
 docs/docfiles/user_guides/manual/dpctl/queues.rst | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/docfiles/intro.rst b/docs/docfiles/intro.rst
index ae4494c5be..3c7d1b86b8 100644
--- a/docs/docfiles/intro.rst
+++ b/docs/docfiles/intro.rst
@@ -12,5 +12,5 @@ deallocators. Dpctl's Python API provides classes that implement
 using SYCL USM memory; making it possible to create Python objects that are
 backed by SYCL USM memory.

-Dpctl also supports the DPCPP ``oneapi::filter_selector`` extension and has
+Dpctl also supports the DPCPP :oneapi_filter_selection:`oneapi::filter_selector <>` extension and has
 experimental support for SYCL's ``kernel`` and ``program`` classes.
diff --git a/docs/docfiles/user_guides/manual/dpctl/queues.rst b/docs/docfiles/user_guides/manual/dpctl/queues.rst
index fcb23a1afb..fa47b68739 100644
--- a/docs/docfiles/user_guides/manual/dpctl/queues.rst
+++ b/docs/docfiles/user_guides/manual/dpctl/queues.rst
@@ -68,7 +68,7 @@ is called.
     :caption: Constructing SyclQueue from filter selector
     :linenos:

-A possible output for the example :ref:`fig-constructing-queue-filter-selector:` may be:
+A possible output for the example :ref:`fig-constructing-queue-filter-selector` may be:

 .. program-output:: python ../examples/python/sycl_queue.py -r create_queue_from_filter_selector


From 468070f71501758c916d7da08f3704fffec24441 Mon Sep 17 00:00:00 2001
From: Oleksandr Pavlyk
Date: Sat, 30 Apr 2022 14:09:03 -0500
Subject: [PATCH 5/7] corrected referenced line-nums in the example

---
 docs/docfiles/user_guides/manual/dpctl/queues.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/docfiles/user_guides/manual/dpctl/queues.rst b/docs/docfiles/user_guides/manual/dpctl/queues.rst
index fa47b68739..e1c841549a 100644
--- a/docs/docfiles/user_guides/manual/dpctl/queues.rst
+++ b/docs/docfiles/user_guides/manual/dpctl/queues.rst
@@ -45,7 +45,7 @@ the associated task completes.

 .. literalinclude:: ../../../../../examples/python/sycl_queue.py
     :language: python
-    :lines: 17-19, 67-79
+    :lines: 17-19, 72-89
     :caption: Constructing SyclQueue from context and device
     :linenos:


From 2d5dc58688f10e3c8839511698118c2cb09099f3 Mon Sep 17 00:00:00 2001
From: Oleksandr Pavlyk
Date: Fri, 29 Apr 2022 12:23:42 -0500
Subject: [PATCH 6/7] filled out profiling section

---
 docs/docfiles/urls.json                       |  1 +
 .../user_guides/manual/dpctl/queues.rst       | 27 +++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/docs/docfiles/urls.json b/docs/docfiles/urls.json
index 9c4c0a14d4..0e7fa81c40 100644
--- a/docs/docfiles/urls.json
+++ b/docs/docfiles/urls.json
@@ -4,6 +4,7 @@
     "oneapi": "https://www.oneapi.io/",
     "oneapi_filter_selection": "https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/supported/sycl_ext_oneapi_filter_selector.asciidoc",
     "oneapi_default_context": "https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/supported/sycl_ext_oneapi_default_context.asciidoc",
+    "oneapi_enqueue_barrier": "https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/supported/sycl_ext_oneapi_enqueue_barrier.asciidoc",
     "sycl_aspects": "https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#table.device.aspect",
     "sycl_context": "https://sycl.readthedocs.io/en/latest/iface/context.html",
     "sycl_device": "https://sycl.readthedocs.io/en/latest/iface/device.html",
diff --git a/docs/docfiles/user_guides/manual/dpctl/queues.rst b/docs/docfiles/user_guides/manual/dpctl/queues.rst
index e1c841549a..54b432ff3a 100644
--- a/docs/docfiles/user_guides/manual/dpctl/queues.rst
+++ b/docs/docfiles/user_guides/manual/dpctl/queues.rst
@@ -75,3 +75,30 @@ A possible output for the example :ref:`fig-constructing-queue-filter-selector`

 Profiling a Task Submitted to a Queue
 -------------------------------------
+
+The result of scheduling execution of a task on queue is an event which can queried for the status
+of the task execution, can be used to order execution of the future tasks after this one is complete,
+can be used to wait for the completion of the task, and can carry profiling information of the task
+execution. The profiling information is only populated if the queue used was created with
+"enable_profiling" property and only becomes available after the task execution is complete.
+
+The class :class:`dpctl.SyclTimer` implements a context manager that can be used to collect cumulative
+profiling information for all the tasks submitted to the queue of interest by functions executed
+within the context:
+
+.. code-block:: python
+    :caption: Example of timing execution
+
+    import dpctl
+    import dpctl.tensor as dpt
+
+    q = dpctl.SyclQueue(property="enable_profiling")
+    timer_ctx = dpctl.SyclTimer()
+    with timer_ctx(q):
+        X = dpt.arange(10**6, dtype=float, sycl_queue=q)
+
+    host_dt, device_dt = timer_ctx.dt
+
+The timer leverages :oneapi_enqueue_barrier:`oneAPI enqueue_barrier SYCL extension <>` and submits
+a barrier at context entrance and a barrier at context exit and records associated events. The elapsed
+device time is computed as ``e_exit.profiling_info_start - e_enter.profiling_info_end``.

From 9c32d8c2fffa5d776e096d8ed088bf22b4732112 Mon Sep 17 00:00:00 2001
From: Diptorup Deb
Date: Mon, 2 May 2022 14:24:20 -0500
Subject: [PATCH 7/7] Edits to the queue user guide.

---
 .../user_guides/manual/dpctl/queues.rst       | 88 ++++++++++---------
 1 file changed, 46 insertions(+), 42 deletions(-)

diff --git a/docs/docfiles/user_guides/manual/dpctl/queues.rst b/docs/docfiles/user_guides/manual/dpctl/queues.rst
index 54b432ff3a..6afce0ec71 100644
--- a/docs/docfiles/user_guides/manual/dpctl/queues.rst
+++ b/docs/docfiles/user_guides/manual/dpctl/queues.rst
@@ -4,42 +4,41 @@
 Queue
 #####

-A queue is needed to schedule execution of any computation, or data
-copying on the device. Queue construction requires specifying a device
-and a context targeting that device as well as additional properties,
-such as whether profiling information should be collected or whether
-submitted tasks are executed in the order in which they were submitted.
+A queue is needed to schedule execution of any computation or data copying on a
+device. Queue construction requires specifying a device and a context targeting
+that device as well as additional properties, such as whether profiling
+information should be collected or whether submitted tasks are executed in the
+order in which they were submitted.

-The :class:`dpctl.SyclQueue` class represents a queue and abstracts
-the :sycl_queue:`sycl::queue <>` SYCL runtime class.
+The :class:`dpctl.SyclQueue` class represents a queue and abstracts the
+:sycl_queue:`sycl::queue <>` SYCL runtime class.

 Types of Queues
 ---------------

-In SYCL tasks are submitted for execution by the SYCL runtime. The order
-in which runtime executes them on the target device is specified by
-a sequence of events which must be complete before execution is allowed.
-Submission of a task returns an event which can be used to further grow
-the graph of computational tasks. SYCL queue stores data needed to manage
-this scheduling operations.
+SYCL has a task-based execution model. The order in which a SYCL runtime
+executes a task on a target device is specified by a sequence of events which
+must be complete before execution of the task is allowed. Submission of a task
+returns an event which can be used to further grow the graph of computational
+tasks. A SYCL queue stores the data needed to manage the scheduling operations.

-Unless specified otherwise during constriction of a queue, SYCL runtime
-executes tasks whose dependencies were met in an unspecified order,
-with possibility for some of the tasks be execute concurrently. Such
-queues are said to be 'out-of-order'.
+Unless specified otherwise during construction of a queue, a SYCL runtime
+executes tasks whose dependencies were met in an unspecified order, with the
+possibility for some of the tasks to be executed concurrently. Such queues are
+said to be 'out-of-order'.

-SYCL queues can be specified to indicate that runtime must execute tasks
-in the linear order in which they were submitted. In this case tasks submitted
-to such a queue, called 'in-order' queues, are never executed concurrently.
+SYCL queues can be specified to indicate that the runtime must execute tasks in
+the order in which they were submitted. In this case, tasks submitted to such a
+queue, called an 'in-order' queue, are never executed concurrently.

 Creating a New Queue
 --------------------

-:class:`dpctl.SyclQueue(ctx, dev, property=None)` creates a new instance for
-the given compatible context and device. Keyword parameter `property`
-can be set to `"in_order"` to create an 'in-order' queue, to `"enable_profiling"`
-to dynamically collect task execution statistics in the returned event once
-the associated task completes.
+:class:`dpctl.SyclQueue(ctx, dev, property=None)` creates a new queue instance
+for the given compatible context and device. Keyword parameter `property` can be
+set to `"in_order"` to create an 'in-order' queue and to `"enable_profiling"` to
+dynamically collect task execution statistics in the returned event once the
+associated task completes.

 .. _fig-constructing-queue-context-device-property:

@@ -49,15 +48,16 @@ the associated task completes.
     :caption: Constructing SyclQueue from context and device
     :linenos:

-A possible output for the example :ref:`fig-constructing-queue-context-device-property:` may be:
+A possible output for the example
+:ref:`fig-constructing-queue-context-device-property` may be:

 .. program-output:: python ../examples/python/sycl_queue.py -r create_queue_from_subdevice_multidevice_context

-When context is not specified the :sycl_queue:`sycl::queue <>` constructor
+When a context is not specified, the :sycl_queue:`sycl::queue <>` constructor
 from a device instance is called. Instead of an instance of
-:class:`dpctl.SyclDevice` the argument `dev` can be a valid filter
-selector string. In this case the :sycl_queue:`sycl::queue <>` constructor
-with the corresponding :oneapi_filter_selection:`sycl::ext::oneapi::filter_selector <>`
+:class:`dpctl.SyclDevice` the argument `dev` can be a valid filter selector
+string. In this case, the :sycl_queue:`sycl::queue <>` constructor with the
+corresponding :oneapi_filter_selection:`sycl::ext::oneapi::filter_selector <>`
 is called.

 .. _fig-constructing-queue-filter-selector:
@@ -68,7 +68,8 @@ is called.
     :caption: Constructing SyclQueue from filter selector
     :linenos:

-A possible output for the example :ref:`fig-constructing-queue-filter-selector` may be:
+A possible output for the example :ref:`fig-constructing-queue-filter-selector`
+may be:

 .. program-output:: python ../examples/python/sycl_queue.py -r create_queue_from_filter_selector

@@ -76,29 +77,32 @@ A possible output for the example :ref:`fig-constructing-queue-filter-selector`
 Profiling a Task Submitted to a Queue
 -------------------------------------

-The result of scheduling execution of a task on queue is an event which can queried for the status
-of the task execution, can be used to order execution of the future tasks after this one is complete,
-can be used to wait for the completion of the task, and can carry profiling information of the task
-execution. The profiling information is only populated if the queue used was created with
-"enable_profiling" property and only becomes available after the task execution is complete.
+The result of scheduling execution of a task on a queue is an event. An event
+has several uses: it can be queried for the status of the task execution, it can
+be used to order execution of the future tasks after it is complete, it can be
+used to wait for execution to complete, and it can carry profiling information
+about the task execution. The profiling information is only populated if the
+queue used was created with the "enable_profiling" property and only becomes
+available after the task execution is complete.

-The class :class:`dpctl.SyclTimer` implements a context manager that can be used to collect cumulative
-profiling information for all the tasks submitted to the queue of interest by functions executed
-within the context:
+The class :class:`dpctl.SyclTimer` implements a Python context manager that can
+be used to collect cumulative profiling information for all the tasks submitted
+to the queue of interest by functions executed within the context:

 .. code-block:: python
     :caption: Example of timing execution

     import dpctl
     import dpctl.tensor as dpt

     q = dpctl.SyclQueue(property="enable_profiling")
     timer_ctx = dpctl.SyclTimer()
     with timer_ctx(q):
         X = dpt.arange(10**6, dtype=float, sycl_queue=q)

     host_dt, device_dt = timer_ctx.dt

-The timer leverages :oneapi_enqueue_barrier:`oneAPI enqueue_barrier SYCL extension <>` and submits
-a barrier at context entrance and a barrier at context exit and records associated events. The elapsed
-device time is computed as ``e_exit.profiling_info_start - e_enter.profiling_info_end``.
+The timer leverages :oneapi_enqueue_barrier:`oneAPI enqueue_barrier SYCL
+extension <>` and submits a barrier at context entrance and a barrier at context
+exit and records associated events. The elapsed device time is computed as
+``e_exit.profiling_info_start - e_enter.profiling_info_end``.
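
The elapsed-device-time formula in the guide's closing paragraph can be illustrated with a small toy model. The sketch below is plain Python and does not use dpctl: `ToyEvent` and `ToyInOrderQueue` are hypothetical stand-ins, the timestamps are made-up nanosecond values on a simulated clock, and barriers are assumed to take no device time. Only the attribute names `profiling_info_start` and `profiling_info_end` and the subtraction itself are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class ToyEvent:
    """Hypothetical stand-in for a profiled event (timestamps in ns)."""

    profiling_info_start: int
    profiling_info_end: int


class ToyInOrderQueue:
    """Toy in-order queue: each task starts when the previous one ends."""

    def __init__(self):
        self._clock_ns = 0  # simulated device clock, not a real timer

    def submit(self, duration_ns):
        # Run the task immediately after everything submitted so far.
        start = self._clock_ns
        self._clock_ns += duration_ns
        return ToyEvent(start, self._clock_ns)

    def submit_barrier(self):
        # Assumption for this sketch: a barrier takes zero device time.
        return self.submit(0)


def device_elapsed_ns(e_enter, e_exit):
    # Same arithmetic as the formula quoted in the guide.
    return e_exit.profiling_info_start - e_enter.profiling_info_end


q = ToyInOrderQueue()
q.submit(500)                  # work submitted before the timed region
e_enter = q.submit_barrier()   # barrier at context entrance
q.submit(1_000)                # timed task 1
q.submit(2_500)                # timed task 2
e_exit = q.submit_barrier()    # barrier at context exit

elapsed_ns = device_elapsed_ns(e_enter, e_exit)
print(elapsed_ns)  # 3500
```

Work submitted before the entry barrier (500 ns here) is excluded from the result; only the two tasks between the barriers (1000 ns + 2500 ns) are counted.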