.. Commit ec660c3 (parent da5ed15) by Diptorup Deb, committed May 30, 2023
.. _overview:
.. include:: ./ext_links.txt

Overview
========

Data-Parallel Extensions for Numba* (`numba-dpex`_) is a standalone extension
to the `Numba`_ JIT compiler. The extension adds two new features to Numba: an
architecture-agnostic kernel programming API, and a backend extension that can
parallelize NumPy-style array expressions and function calls on different
types of data-parallel architectures.

Numba-dpex is part of the `Intel AI Analytics Toolkit`_ and is distributed
with the `Intel Distribution for Python*`_. The extension is also available on
Anaconda cloud and as a Docker image on GitHub. Please refer to the
:doc:`getting_started` page to learn more.

Main Features
-------------

- :doc:`user_manual/kernel_programming/index`

  The kernel API has a design similar to that of Numba's ``cuda.jit`` module.
  However, the API uses the `SYCL`_ language runtime and as such is extensible
  to several hardware categories. Presently, the API supports only
  SPIR-V-based OpenCL and `oneAPI Level Zero`_ devices that are supported by
  the Intel® `DPC++`_ SYCL compiler runtime.

  A simple vector addition kernel can be expressed using the API as follows:

  .. code-block:: python

      import dpnp
      import numba_dpex as dpex


      @dpex.kernel
      def sum(a, b, c):
          i = dpex.get_global_id(0)
          c[i] = a[i] + b[i]


      a = dpnp.ones(1024, device="gpu")
      b = dpnp.ones(1024, device="gpu")
      c = dpnp.empty_like(a)

      sum[dpex.Range(1024)](a, b, c)
      print(c)

  In the above example, as the programmer allocated the dpnp arrays on a
  ``gpu`` device, numba-dpex will compile and then execute the kernel for that
  device. To change the execution target to a CPU, only the ``device`` keyword
  needs to be changed to ``cpu`` when allocating the dpnp arrays.

- :doc:`user_manual/auto-offload`

  The backend extension that adds automatic parallelization support has a user
  interface similar to Numba's existing loop-parallelizer. The feature enables
  a programmer to "offload" NumPy-style vector expressions, library calls, and
  ``prange`` loops to different hardware and execute them in parallel. A key
  difference from Numba's loop-parallelizer is the ability to parallelize on
  hardware other than multicore CPUs.

  The auto-offload feature requires the `dpnp
  <https://github.com/IntelPython/dpnp>`_ library, a data-parallel drop-in
  replacement for `NumPy*`_.

  A programmer only needs to swap NumPy* function calls, array expressions,
  and loops with the corresponding API and array type from dpnp, and use
  numba-dpex's decorator in place of the default Numba decorator to
  parallelize the expressions on different types of hardware.

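To make the kernel API's execution model concrete, the vector-addition example above can be emulated in plain Python with no numba-dpex or SYCL device installed. This is only an illustrative sketch: ``sum_kernel`` and ``emulate_launch`` are hypothetical names, and the sequential loop stands in for the parallel dispatch that numba-dpex performs on a device.

```python
# Plain-Python emulation of the kernel execution model: every index in a
# one-dimensional global range acts as a work item that computes exactly one
# output element. Illustration only; not part of the numba-dpex API.

def sum_kernel(a, b, c, i):
    # Kernel body: work item ``i`` writes one element of ``c``.
    c[i] = a[i] + b[i]


def emulate_launch(kernel, global_size, *args):
    # Stand-in for ``kernel[dpex.Range(global_size)](*args)``: dispatch one
    # work item per index in the range, sequentially instead of in parallel.
    for i in range(global_size):
        kernel(*args, i)


a = [1.0] * 1024
b = [1.0] * 1024
c = [0.0] * 1024
emulate_launch(sum_kernel, 1024, a, b, c)
print(c[0] == 2.0 and len(c) == 1024)  # → True
```

In the real API, the loop body runs once per work item on the selected device, with ``dpex.get_global_id(0)`` supplying the index that the emulation passes in explicitly.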
Contributing
============

Refer to the `contributing guide
<https://github.com/IntelPython/numba-dpex/blob/main/CONTRIBUTING>`_ for
information on the coding style and standards used in numba-dpex.

License
=======

Numba-dpex is licensed under the Apache License 2.0, which can be found in
`LICENSE <https://github.com/IntelPython/numba-dpex/blob/main/LICENSE>`_. All
usage and contributions to the project are subject to the terms and conditions
of this license.