Updates to overview
Diptorup Deb committed May 30, 2023
1 parent da5ed15 commit ec660c3
Showing 3 changed files with 100 additions and 35 deletions.
21 changes: 14 additions & 7 deletions docs/index.rst
@@ -2,16 +2,23 @@ Welcome to numba-dpex's documentation!
======================================

Numba data-parallel extension (`numba-dpex
<https://github.com/IntelPython/numba-dpex>`_) is an Intel |reg|-developed
<https://github.com/IntelPython/numba-dpex>`_) is a standalone
extension to the `Numba <https://numba.pydata.org/>`_ JIT compiler. The
extension adds kernel programming and automatic offload capabilities to the
Numba compiler. Numba-dpex is part of `Intel oneAPI Base Toolkit
<https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html>`_
extension adds two features to Numba: an architecture-agnostic kernel
programming API, and a backend for Numba's `jit` decorator that can parallelize
NumPy-like array expressions and function calls on different data-parallel
architectures. The parallelization feature for a NumPy-like API is provided by
adding type and compilation support for the
`dpnp <https://github.com/IntelPython/dpnp>`_ library, a data-parallel NumPy
drop-in replacement library.
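
A minimal sketch of this second feature, assuming the numba-dpex counterpart
of Numba's `jit` decorator is named ``dpjit`` (the decorator name is an
assumption made for illustration, not stated in this paragraph):

.. code-block:: python

   import dpnp

   import numba_dpex as dpex


   # Hypothetical sketch: a NumPy-like array expression over dpnp arrays,
   # compiled and parallelized by the numba-dpex backend.
   @dpex.dpjit
   def axpy(a, x, y):
       return a * x + y


   x = dpnp.ones(1024)
   y = dpnp.ones(1024)
   print(axpy(2.0, x, y))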

Numba-dpex is part of the `Intel oneAPI AI Analytics Toolkit
<https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-analytics-toolkit.html>`_
and distributed with the `Intel Distribution for Python*
<https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html>`_.
The goal of the extension is to make it easy for Python programmers to
write efficient and portable code for a mix of architectures across CPUs, GPUs,
FPGAs and other accelerators.
The extension is also available on Anaconda cloud and as a Docker image on
GitHub. Please refer to the `Getting Started <user_guides/getting_started>`_
page to learn more.

Numba-dpex provides an API to write data-parallel kernels directly in Python and
compiles the kernels to lower-level kernels that are executed using a `SYCL
6 changes: 5 additions & 1 deletion docs/sources/ext_links.txt
@@ -2,18 +2,22 @@
**********************************************************
THESE ARE EXTERNAL PROJECT LINKS USED IN THE DOCUMENTATION
**********************************************************

.. _NumPy*: https://numpy.org/
.. _Numba*: https://numba.pydata.org/
.. _numba-dpex: https://github.com/IntelPython/numba-dpex
.. _Python* Array API Standard: https://data-apis.org/array-api/
.. _OpenCl*: https://www.khronos.org/opencl/
.. _oneAPI Level Zero: https://spec.oneapi.io/level-zero/latest/index.html
.. _DPC++: https://www.apress.com/gp/book/9781484255735
.. _Data Parallel Extension for Numba*: https://intelpython.github.io/numba-dpex/latest/index.html
.. _SYCL*: https://www.khronos.org/sycl/
.. _Data Parallel Control: https://intelpython.github.io/dpctl/latest/index.html
.. _Data Parallel Extension for Numpy*: https://intelpython.github.io/dpnp/
.. _IEEE 754-2019 Standard for Floating-Point Arithmetic: https://standards.ieee.org/ieee/754/6210/
.. _Intel oneAPI Base Toolkit: https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html
.. _Intel Distribution for Python*: https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html
.. _Intel AI Analytics Toolkit: https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-analytics-toolkit.html
.. _Data Parallel Extensions for Python*: https://intelpython.github.io/DPEP/main/
.. _Intel VTune Profiler: https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html
.. _Intel Advisor: https://www.intel.com/content/www/us/en/developer/tools/oneapi/advisor.html
108 changes: 81 additions & 27 deletions docs/sources/overview.rst
@@ -1,44 +1,98 @@
.. _overview:
.. include:: ./ext_links.txt

Overview
========

Data-Parallel Extensions for Numba* (`numba-dpex`_) is a standalone extension to
the `Numba`_ JIT compiler. The extension adds two new features to Numba: an
architecture-agnostic kernel programming API, and a backend extension that can
parallelize NumPy-style array expressions and function calls on different types
of data-parallel architectures.

Numba data-parallel extension (`numba-dpex
<https://github.com/IntelPython/numba-dpex>`_) is an Intel |reg|-developed
extension to the `Numba <https://numba.pydata.org/>`_ JIT compiler.
Numba-dpex is part of `Intel AI Analytics Toolkit`_ and distributed with the
`Intel Distribution for Python*`_. The extension is also available on Anaconda
cloud and as a Docker image on GitHub. Please refer to the :doc:`getting_started`
page to learn more.

Numba-dpex extends Numba* by adding a kernel programming API based on `SYCL
<https://www.khronos.org/sycl/>`_ and compilation support for the
Data-parallel Extension for NumPy*
(`dpnp <https://github.com/IntelPython/dpnp>`_), a drop-in replacement for
NumPy* based on SYCL.
Main Features
-------------

- :doc:`user_manual/kernel_programming/index`

The kernel API is similar in design to what is provided by Numba's
``cuda.jit`` module. However, the API uses the `SYCL`_ language runtime and
as such is extensible to several hardware categories. Presently, the API
supports only SPIR-V-based OpenCL and `oneAPI Level Zero`_ devices that are
supported by the Intel® `DPC++`_ SYCL compiler runtime.

The
extension adds kernel programming and automatic offload capabilities to the
Numba compiler. Numba-dpex is part of `Intel oneAPI Base Toolkit
<https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html>`_
and distributed with the `Intel Distribution for Python*
<https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html>`_.
The goal of the extension is to make it easy for Python programmers to
write efficient and portable code for a mix of architectures across CPUs, GPUs,
FPGAs and other accelerators.
A simple vector addition kernel can be expressed using the API as follows:

.. code-block:: python

   import dpnp
   import numba_dpex as dpex


   @dpex.kernel
   def sum(a, b, c):
       i = dpex.get_global_id(0)
       c[i] = a[i] + b[i]


   a = dpnp.ones(1024, device="gpu")
   b = dpnp.ones(1024, device="gpu")
   c = dpnp.empty_like(a)

   sum[dpex.Range(1024)](a, b, c)
   print(c)

In the above example, as the programmer allocated the dpnp arrays on the
default ``gpu`` device, numba-dpex will compile and then execute the kernel
for that device. To change the execution target to a CPU, only the ``device``
keyword needs to be changed to ``cpu`` when allocating the dpnp arrays.
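
For instance, a minimal sketch of the same launch retargeted to the CPU,
reusing the kernel and imports from the example above (illustrative only):

.. code-block:: python

   # Allocating the arrays on the CPU device makes numba-dpex compile and
   # run the same kernel there; the kernel body is unchanged.
   a = dpnp.ones(1024, device="cpu")
   b = dpnp.ones(1024, device="cpu")
   c = dpnp.empty_like(a)

   sum[dpex.Range(1024)](a, b, c)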

- :doc:`user_manual/auto-offload`

The new backend extension that adds automatic parallelization support has a
user interface similar to Numba's existing loop-parallelizer. The feature
enables a programmer to "offload" NumPy-style vector expressions, library
calls, and ``prange`` loops to different hardware and execute them in
parallel. A key difference from Numba's loop-parallelizer is the ability to
parallelize on hardware other than multicore CPUs.

The feature requires the
`dpnp <https://github.com/IntelPython/dpnp>`_ library, a data-parallel
drop-in replacement for `NumPy*`_.


A programmer only needs to swap NumPy* function calls, array expressions,
and loops with the corresponding API and array type from dpnp, and to use
numba-dpex's decorator in place of the default Numba decorator, to
parallelize the expressions on different types of hardware, as sketched
below.
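
A minimal sketch, assuming the decorator is ``numba_dpex.dpjit`` and that
``prange`` loops over dpnp arrays are offloaded (both the decorator name and
the set of supported dpnp functions are assumptions made for illustration):

.. code-block:: python

   import dpnp
   from numba import prange

   import numba_dpex as dpex


   # Hypothetical sketch: dpjit is assumed to stand in for numba.jit and to
   # parallelize the prange loop on the device where the arrays live.
   @dpex.dpjit
   def vector_sum(a, b):
       c = dpnp.empty_like(a)
       for i in prange(a.shape[0]):
           c[i] = a[i] + b[i]
       return c


   a = dpnp.ones(1024, device="gpu")
   b = dpnp.ones(1024, device="gpu")
   print(vector_sum(a, b))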

Contributing
============

Refer to the `contributing guide
<https://github.com/IntelPython/numba-dpex/blob/main/CONTRIBUTING>`_ for
information on the coding style and standards used in numba-dpex.

License
=======

Numba-dpex is licensed under the Apache License 2.0, which can be found in
`LICENSE <https://github.com/IntelPython/numba-dpex/blob/main/LICENSE>`_. All
usage and contributions to the project are subject to the terms and conditions
of this license.

Numba-dpex provides an API to write data-parallel kernels directly in Python and
compiles the kernels to lower-level kernels that are executed using a `SYCL
<https://www.khronos.org/sycl/>`_ runtime library. Presently, only Intel's
`DPC++ <https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md>`_
SYCL runtime is supported via the `dpctl
<https://github.com/IntelPython/dpctl>`_ package, and only OpenCL and Level Zero
devices are supported. Support for other SYCL runtime libraries and hardware
may be added in the future.
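
As an aside, the devices visible to the dpctl-backed runtime can be listed
with dpctl's ``get_devices`` helper; a small illustrative sketch:

.. code-block:: python

   import dpctl

   # Print every SYCL device dpctl can discover (OpenCL, Level Zero, ...).
   for device in dpctl.get_devices():
       print(device)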

Along with the kernel programming API, an auto-offload feature is also provided.
The feature enables automatic generation of kernels from data-parallel NumPy
library calls and array expressions, Numba ``prange`` loops, and `other
"data-parallel by construction" expressions
<https://numba.pydata.org/numba-doc/latest/user/parallel.html>`_ that Numba is
able to parallelize. The following two examples demonstrate the two ways in
which kernels may be written using numba-dpex.
