Edits to the overview.rst
Diptorup Deb committed Jul 19, 2023
1 parent 13f26b7 commit e1cc8e2
Showing 3 changed files with 36 additions and 30 deletions.
1 change: 1 addition & 0 deletions docs/source/ext_links.txt
@@ -24,3 +24,4 @@
.. _Data Parallel Extensions for Python*: https://intelpython.github.io/DPEP/main/
.. _Intel VTune Profiler: https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html
.. _Intel Advisor: https://www.intel.com/content/www/us/en/developer/tools/oneapi/advisor.html
+.. _oneMKL: https://www.intel.com/content/www/us/en/docs/oneapi/programming-guide/2023-2/intel-oneapi-math-kernel-library-onemkl.html
52 changes: 27 additions & 25 deletions docs/source/overview.rst
@@ -15,23 +15,23 @@ implementation of `NumPy*`_'s API using the `SYCL*`_ language.
.. the same time automatically running such code parallelly on various types of
.. architecture.
-``numba-dpex`` is developed as part of `Intel AI Analytics Toolkit`_ and
-is distributed with the `Intel Distribution for Python*`_. The extension is
-available on Anaconda cloud and as a Docker image on GitHub. Please refer the
-:doc:`getting_started` page to learn more.
+``numba-dpex`` is an open-source project and can be installed as part of `Intel
+AI Analytics Toolkit`_ or the `Intel Distribution for Python*`_. The package is
+also available on Anaconda cloud and as a Docker image on GitHub. Please refer
+to the :doc:`getting_started` page to learn more.

Main Features
-------------

Portable Kernel Programming
~~~~~~~~~~~~~~~~~~~~~~~~~~~

-The ``numba-dpex`` kernel API has a design and API similar to Numba's
+The ``numba-dpex`` kernel programming API has a design similar to Numba's
``cuda.jit`` sub-module. The API is modeled after the `SYCL*`_ language and uses
the `DPC++`_ SYCL runtime. Currently, compilation of kernels is supported for
SPIR-V-based OpenCL and `oneAPI Level Zero`_ CPU and GPU devices. In the
-future, the API can be extended to other architectures that are supported by
-DPC++.
+future, compilation support for other types of hardware that are supported by
+DPC++ will be added.

The following example illustrates a vector addition kernel written with
the ``numba-dpex`` kernel API.
@@ -56,31 +56,33 @@ The following example illustrates a vector addition kernel written with
print(c)
In the above example, three arrays are allocated on a default ``gpu`` device
-using the ``dpnp`` library. These arrays are then passed as input arguments to
-the kernel function. The compilation target and the subsequent execution of the
-kernel is determined completely by the input arguments and follow the
+using the ``dpnp`` library. The arrays are then passed as input arguments to the
+kernel function. The compilation target and the subsequent execution of the
+kernel are determined by the input arguments and follow the
"compute-follows-data" programming model as specified in the `Python* Array API
Standard`_. To change the execution target to a CPU, the device keyword needs to
be changed to ``cpu`` when allocating the ``dpnp`` arrays. It is also possible
to leave the ``device`` keyword undefined and let the ``dpnp`` library select a
default device based on environment flag settings. Refer to the
:doc:`user_guide/kernel_programming/index` for further details.
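
The example code itself is collapsed in this view. A minimal sketch of the kind
of kernel the paragraph describes might look as follows; this is an
illustration rather than the exact code from the file, and it assumes the
``numba_dpex.kernel`` decorator, ``get_global_id``, and the ``Range`` launch
syntax:

.. code-block:: python

    import dpnp
    import numba_dpex as ndpx

    @ndpx.kernel
    def vecadd(a, b, c):
        # Each work item computes one element of the result.
        i = ndpx.get_global_id(0)
        c[i] = a[i] + b[i]

    # Allocating the inputs on a "gpu" device selects the compilation
    # target and execution queue (compute-follows-data).
    a = dpnp.ones(1024, device="gpu")
    b = dpnp.ones(1024, device="gpu")
    c = dpnp.empty_like(a)

    vecadd[ndpx.Range(1024)](a, b, c)
    print(c)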

-``dpnp`` compilation support
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-``numba-dpex`` extends Numba's type system and compilation pipeline to compile
-``dpnp`` functions and expressions in the same way as NumPy. Unlike Numba's
-NumPy compilation that is serial by default, ``numba-dpex`` always compiles
-``dpnp`` expressions into data-parallel kernels and executes them in parallel.
-The ``dpnp`` compilation feature is provided using a decorator ``dpjit`` that
-behaves identically to ``numba.njit(parallel=True)`` with the addition of
-``dpnp`` compilation and kernel offloading. Offloading by ``numba-dpex`` is not
-just restricted to CPUs and supports all devices that are presently supported by
-the kernel API. ``dpjit`` allows using NumPy and ``dpnp`` expressions in the
-same function. All NumPy compilation and parallelization is done via the default
-Numba code-generation pipeline, whereas ``dpnp`` expressions are compiled using
-the ``numba-dpex`` pipeline.
+``dpjit`` decorator
+~~~~~~~~~~~~~~~~~~~
+
+The ``numba-dpex`` package provides a new decorator ``dpjit`` that extends
+Numba's ``njit`` decorator. The new decorator is equivalent to
+``numba.njit(parallel=True)``, but additionally supports compiling ``dpnp``
+functions, ``prange`` loops, and array expressions that use ``dpnp.ndarray``
+objects.
+
+Unlike Numba's NumPy parallelization, which only supports CPUs, ``dpnp``
+expressions are first converted to data-parallel kernels and can then be
+`offloaded` to different types of devices. As ``dpnp`` implements the same API
+as NumPy*, an existing ``numba.njit`` decorated function that uses
+``numpy.ndarray`` may be refactored to use ``dpnp.ndarray`` and decorated with
+``dpjit``. Such a refactoring can allow the parallel regions to be offloaded
+to a supported GPU device, providing users an additional option to execute
+their code in parallel.

The vector addition example depicted using the kernel API can also be
expressed in several different ways using ``dpjit``.
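
The concrete variants are collapsed below. As a sketch of two such ways (an
illustration, not the exact code from the documentation; it assumes
``numba.prange`` and ``dpnp.empty_like`` are usable inside ``dpjit``-compiled
functions, as the section above states):

.. code-block:: python

    import dpnp
    from numba import prange
    from numba_dpex import dpjit

    @dpjit
    def vecadd_expr(a, b):
        # An array expression; numba-dpex compiles it into a data-parallel
        # kernel and offloads it to the device holding a and b.
        return a + b

    @dpjit
    def vecadd_prange(a, b):
        c = dpnp.empty_like(a)
        # An explicit parallel loop, analogous to numba.njit(parallel=True).
        for i in prange(a.shape[0]):
            c[i] = a[i] + b[i]
        return c

    a = dpnp.ones(1024, device="gpu")
    b = dpnp.ones(1024, device="gpu")
    print(vecadd_expr(a, b))
    print(vecadd_prange(a, b))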
13 changes: 8 additions & 5 deletions docs/source/user_guide/dpnp_offload.rst
@@ -3,11 +3,14 @@
Compiling and Offloading ``dpnp`` Functions
===========================================

-Data-Parallel Numeric Python (``dpnp``) is a drop-in ``NumPy*`` replacement library. The
-library is developed using SYCL and oneMKL. ``numba-dpex`` relies on ``dpnp`` to
-support offloading ``NumPy`` library functions to SYCL devices. For ``NumPy`` functions
-that are offloaded using ``dpnp``, ``numba-dpex`` generates library calls directly to
-``dpnp``'s `low-level API`_ inside the generated LLVM IR.
+Data Parallel Extension for NumPy* (``dpnp``) is a drop-in ``NumPy*``
+replacement library built on top of oneMKL.
+
+
+``numba-dpex`` relies on ``dpnp`` to
+support offloading ``NumPy`` library functions to SYCL devices. For ``NumPy``
+functions that are offloaded using ``dpnp``, ``numba-dpex`` generates library
+calls directly to ``dpnp``'s `low-level API`_ inside the generated LLVM IR.

.. _low-level API: https://github.com/IntelPython/dpnp/tree/master/dpnp/backend
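
As an illustration of the call pattern described above (a sketch, assuming
``dpnp.sum`` is among the functions that can be offloaded from a
``dpjit``-compiled function):

.. code-block:: python

    import dpnp
    from numba_dpex import dpjit

    @dpjit
    def total(a):
        # The dpnp.sum call is lowered to a direct call into dpnp's
        # native backend in the LLVM IR that numba-dpex generates.
        return dpnp.sum(a)

    a = dpnp.arange(10_000, device="gpu")
    print(total(a))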
