Merge pull request #880 from IntelPython/fix/dpnp_type
Changes to the Numba type to represent dpnp ndarray types.
Various changes to fix GitHub CI workflows. 1b7cbc3
diptorupd committed Jan 22, 2023
1 parent 58eb122 commit 593c7a0
Showing 277 changed files with 49,739 additions and 0 deletions.
4 changes: 4 additions & 0 deletions 0.20.0dev1/.buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 3b8eb7438a46c0558952890f0bd152af
tags: 645f666f9bcd5a90fca523b33c5a78b7
292 changes: 292 additions & 0 deletions 0.20.0dev1/CoreFeatures.html
@@ -0,0 +1,292 @@


<!DOCTYPE html>
<html class="writer-html5" lang="en" >
<head>
<meta charset="utf-8" />
<meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />

<meta name="viewport" content="width=device-width, initial-scale=1.0" />

<title>Code-generation based on a device &mdash; numba-dpex 0.20.0dev1 documentation</title>



<link rel="stylesheet" href="_static/css/theme.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />










<!--[if lt IE 9]>
<script src="_static/js/html5shiv.min.js"></script>
<![endif]-->


<script type="text/javascript" id="documentation_options" data-url_root="./" src="_static/documentation_options.js"></script>
<script data-url_root="./" id="documentation_options" src="_static/documentation_options.js"></script>
<script src="_static/doctools.js"></script>
<script src="_static/sphinx_highlight.js"></script>

<script type="text/javascript" src="_static/js/theme.js"></script>


<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Getting Started" href="user_guides/getting_started.html" />
<link rel="prev" title="Welcome to numba-dpex’s documentation!" href="index.html" />
</head>

<body class="wy-body-for-nav">


<div class="wy-grid-for-nav">

<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-scroll">
<div class="wy-side-nav-search" >



<a href="index.html" class="icon icon-home"> numba-dpex



</a>







<div role="search">
<form id="rtd-search-form" class="wy-form" action="search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>


</div>


<div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">






<p class="caption" role="heading"><span class="caption-text">Core Features</span></p>
<ul class="current">
<li class="toctree-l1 current"><a class="current reference internal" href="#">Code-generation based on a device</a></li>
<li class="toctree-l1"><a class="reference internal" href="#automatic-offload-of-numpy-expressions">Automatic offload of NumPy expressions</a><ul>
<li class="toctree-l2"><a class="reference internal" href="#controllable-fallback">Controllable Fallback</a></li>
<li class="toctree-l2"><a class="reference internal" href="#offload-diagnostics">Offload Diagnostics</a></li>
</ul>
</li>
</ul>
<p class="caption" role="heading"><span class="caption-text">User Guides</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="user_guides/getting_started.html"> Getting Started</a></li>
<li class="toctree-l1"><a class="reference internal" href="user_guides/kernel_programming_guide/index.html"> Direct kernel programming</a></li>
<li class="toctree-l1"><a class="reference internal" href="user_guides/debugging/index.html"> Debugging with GDB</a></li>
<li class="toctree-l1"><a class="reference internal" href="user_guides/migrating_from_numba_cuda.html"> numba-dpex for numba.cuda Programmers</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Developer Guides</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="developer_guides/dpnp_integration.html">dpnp integration</a></li>
<li class="toctree-l1"><a class="reference internal" href="developer_guides/tools.html">Debugging the compilation pipeline</a></li>
</ul>



</div>

</div>
</nav>

<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">


<nav class="wy-nav-top" aria-label="top navigation">

<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="index.html">numba-dpex</a>

</nav>


<div class="wy-nav-content">

<div class="rst-content">



















<div role="navigation" aria-label="breadcrumbs navigation">

<ul class="wy-breadcrumbs">

<li><a href="index.html" class="icon icon-home"></a> &raquo;</li>

<li>Code-generation based on a device</li>


<li class="wy-breadcrumbs-aside">


<a href="_sources/CoreFeatures.rst.txt" rel="nofollow"> View page source</a>


</li>

</ul>


<hr/>
</div>
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">

<section id="code-generation-based-on-a-device">
<span id="core-features"></span><h1>Code-generation based on a device<a class="headerlink" href="#code-generation-based-on-a-device" title="Permalink to this heading"></a></h1>
<p>In numba-dpex, kernels are written in a device-agnostic fashion, which makes it
easy to write portable code. A kernel is compiled for the device on which it is
enqueued for execution. The device is specified using a
<code class="docutils literal notranslate"><span class="pre">dpctl.device_context</span></code> context manager. In the following example, two versions
of the <code class="docutils literal notranslate"><span class="pre">sum</span></code> kernel are compiled, one for a GPU and another for a CPU, depending on
the context in which the kernel is invoked. Numba-dpex supports
OpenCL CPU and GPU devices and Level Zero GPU devices. In the future, compilation
support may be extended to other types of SYCL devices that are supported by
the DPC++ runtime.</p>
<blockquote>
<div><div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">numba_dpex</span> <span class="k">as</span> <span class="nn">dpex</span>
<span class="kn">import</span> <span class="nn">dpctl</span>


<span class="nd">@dpex</span><span class="o">.</span><span class="n">kernel</span>
<span class="k">def</span> <span class="nf">sum</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">):</span>
    <span class="n">i</span> <span class="o">=</span> <span class="n">dpex</span><span class="o">.</span><span class="n">get_global_id</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
    <span class="n">c</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">a</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">b</span><span class="p">[</span><span class="n">i</span><span class="p">]</span>


<span class="n">a</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">random</span><span class="p">(</span><span class="mi">20</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span>
<span class="n">b</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">random</span><span class="p">(</span><span class="mi">20</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">ones_like</span><span class="p">(</span><span class="n">a</span><span class="p">)</span>

<span class="k">with</span> <span class="n">dpctl</span><span class="o">.</span><span class="n">device_context</span><span class="p">(</span><span class="s2">&quot;level_zero:gpu&quot;</span><span class="p">):</span>
    <span class="nb">sum</span><span class="p">[</span><span class="mi">20</span><span class="p">,</span> <span class="n">dpex</span><span class="o">.</span><span class="n">DEFAULT_LOCAL_SIZE</span><span class="p">](</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span>

<span class="k">with</span> <span class="n">dpctl</span><span class="o">.</span><span class="n">device_context</span><span class="p">(</span><span class="s2">&quot;opencl:cpu&quot;</span><span class="p">):</span>
    <span class="nb">sum</span><span class="p">[</span><span class="mi">20</span><span class="p">,</span> <span class="n">dpex</span><span class="o">.</span><span class="n">DEFAULT_LOCAL_SIZE</span><span class="p">](</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span>
</pre></div>
</div>
</div></blockquote>
</section>
<section id="automatic-offload-of-numpy-expressions">
<h1>Automatic offload of NumPy expressions<a class="headerlink" href="#automatic-offload-of-numpy-expressions" title="Permalink to this heading"></a></h1>
<p>A key distinction between numba-dpex and the other GPU backends in Numba is
the ability to automatically offload specific data-parallel sections of a
Numba <code class="docutils literal notranslate"><span class="pre">jit</span></code> function.</p>
<div class="admonition-todo admonition" id="id1">
<p class="admonition-title">Todo</p>
<p>Details and examples to be added.</p>
</div>
<section id="controllable-fallback">
<h2>Controllable Fallback<a class="headerlink" href="#controllable-fallback" title="Permalink to this heading"></a></h2>
<p>By default, if a section of code cannot be offloaded to the GPU, it is
automatically executed on the CPU and a warning is printed. This behavior
applies only to <code class="docutils literal notranslate"><span class="pre">jit</span></code> functions, where NumPy calls, array
expressions and <code class="docutils literal notranslate"><span class="pre">prange</span></code> loops are offloaded automatically. To disable the fallback and force
the code to run on the GPU, set the environment variable <code class="docutils literal notranslate"><span class="pre">NUMBA_DPEX_FALLBACK_ON_CPU</span></code> to
false (e.g. <code class="docutils literal notranslate"><span class="pre">export</span> <span class="pre">NUMBA_DPEX_FALLBACK_ON_CPU=0</span></code>). In this case the code
does not fall back to the CPU, and any offload failure is reported as an error.</p>
</section>
<section id="offload-diagnostics">
<h2>Offload Diagnostics<a class="headerlink" href="#offload-diagnostics" title="Permalink to this heading"></a></h2>
<p>Setting the debug environment variable <code class="docutils literal notranslate"><span class="pre">NUMBA_DPEX_OFFLOAD_DIAGNOSTICS</span></code> (e.g.
<code class="docutils literal notranslate"><span class="pre">export</span> <span class="pre">NUMBA_DPEX_OFFLOAD_DIAGNOSTICS=1</span></code>) emits parallel
and offload diagnostics based on the parallel transforms that were produced. The
level of detail depends on the integer value, between 1 and 4, assigned to the
environment variable (higher is more detailed). The “Auto-offloading” section
of the output reports the device (by device name) to which each parfor or kernel was
offloaded.</p>
</section>
</section>


</div>

</div>
<footer>
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
<a href="user_guides/getting_started.html" class="btn btn-neutral float-right" title="Getting Started" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
<a href="index.html" class="btn btn-neutral float-left" title="Welcome to numba-dpex’s documentation!" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
</div>

<hr/>

<div role="contentinfo">
<p>
&#169; Copyright 2021,2022 Intel.

</p>
</div>



Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a

<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>

provided by <a href="https://readthedocs.org">Read the Docs</a>.

</footer>
</div>
</div>

</section>

</div>


<script type="text/javascript">
jQuery(function () {
SphinxRtdTheme.Navigation.enable(true);
});
</script>






</body>
</html>
70 changes: 70 additions & 0 deletions 0.20.0dev1/_sources/CoreFeatures.rst.txt
@@ -0,0 +1,70 @@
.. _core_features:

Code-generation based on a device
=================================

In numba-dpex, kernels are written in a device-agnostic fashion, which makes it
easy to write portable code. A kernel is compiled for the device on which it is
enqueued for execution. The device is specified using a
``dpctl.device_context`` context manager. In the following example, two versions
of the ``sum`` kernel are compiled, one for a GPU and another for a CPU, depending on
the context in which the kernel is invoked. Numba-dpex supports
OpenCL CPU and GPU devices and Level Zero GPU devices. In the future, compilation
support may be extended to other types of SYCL devices that are supported by
the DPC++ runtime.

.. code-block:: python

    import numpy as np
    import numba_dpex as dpex
    import dpctl


    @dpex.kernel
    def sum(a, b, c):
        i = dpex.get_global_id(0)
        c[i] = a[i] + b[i]


    a = np.array(np.random.random(20), dtype=np.float32)
    b = np.array(np.random.random(20), dtype=np.float32)
    c = np.ones_like(a)

    with dpctl.device_context("level_zero:gpu"):
        sum[20, dpex.DEFAULT_LOCAL_SIZE](a, b, c)

    with dpctl.device_context("opencl:cpu"):
        sum[20, dpex.DEFAULT_LOCAL_SIZE](a, b, c)

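
The supported targets named in this section can be written down as a small lookup, which may help when choosing a string to pass to ``dpctl.device_context``. The sketch below is plain Python and does not call numba-dpex or dpctl. The helper ``is_supported_target`` and the table ``SUPPORTED_TARGETS`` are hypothetical names introduced for illustration, and the check assumes a filter string of the form ``backend:device_type`` as used in the example above.

```python
# Backend/device-type pairs that this section lists as supported.
SUPPORTED_TARGETS = {
    ("opencl", "cpu"),
    ("opencl", "gpu"),
    ("level_zero", "gpu"),
}


def is_supported_target(filter_string: str) -> bool:
    """Hypothetical helper: check a "backend:device_type" filter string."""
    parts = filter_string.strip().lower().split(":")
    if len(parts) < 2:
        # A bare backend or device type is ambiguous, so reject it here.
        return False
    backend, device_type = parts[0], parts[1]
    return (backend, device_type) in SUPPORTED_TARGETS


for target in ("level_zero:gpu", "opencl:cpu", "level_zero:cpu"):
    print(target, is_supported_target(target))
```

A check like this only mirrors the list in the documentation. Whether a device is actually present on a given machine still has to be determined by the runtime.
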
Automatic offload of NumPy expressions
======================================

A key distinction between numba-dpex and the other GPU backends in Numba is
the ability to automatically offload specific data-parallel sections of a
Numba ``jit`` function.

.. todo::

   Details and examples to be added.

Controllable Fallback
---------------------

By default, if a section of code cannot be offloaded to the GPU, it is
automatically executed on the CPU and a warning is printed. This behavior
applies only to ``jit`` functions, where NumPy calls, array
expressions and ``prange`` loops are offloaded automatically. To disable the fallback and force
the code to run on the GPU, set the environment variable ``NUMBA_DPEX_FALLBACK_ON_CPU`` to
false (e.g. ``export NUMBA_DPEX_FALLBACK_ON_CPU=0``). In this case the code
does not fall back to the CPU, and any offload failure is reported as an error.
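
The same setting can also be applied from Python instead of the shell. The sketch below uses only the standard library. The helper names ``disable_cpu_fallback`` and ``restore_default_fallback`` are hypothetical, and the sketch assumes that the variable must be set before ``numba_dpex`` is imported for it to take effect, which is not confirmed by this page.

```python
import os

FALLBACK_VAR = "NUMBA_DPEX_FALLBACK_ON_CPU"


def disable_cpu_fallback() -> None:
    """Hypothetical helper: request that code not fall back to the CPU."""
    os.environ[FALLBACK_VAR] = "0"


def restore_default_fallback() -> None:
    """Hypothetical helper: unset the variable so the default applies again."""
    os.environ.pop(FALLBACK_VAR, None)


# Assumption: do this before importing numba_dpex.
disable_cpu_fallback()
print(FALLBACK_VAR, "=", os.environ[FALLBACK_VAR])
```

Unsetting the variable, rather than writing a specific "true" value, avoids guessing which spellings the library accepts.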

Offload Diagnostics
-------------------

Setting the debug environment variable ``NUMBA_DPEX_OFFLOAD_DIAGNOSTICS`` (e.g.
``export NUMBA_DPEX_OFFLOAD_DIAGNOSTICS=1``) emits parallel
and offload diagnostics based on the parallel transforms that were produced. The
level of detail depends on the integer value, between 1 and 4, assigned to the
environment variable (higher is more detailed). The "Auto-offloading" section
of the output reports the device (by device name) to which each parfor or kernel was
offloaded.
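
As a minimal sketch of the 1 to 4 range described above, the diagnostics level can be validated before it is written to the environment. The helper ``set_offload_diagnostics`` is a hypothetical name and uses only the standard library. As with other environment settings, it is assumed here that the variable should be set before ``numba_dpex`` is imported.

```python
import os

DIAGNOSTICS_VAR = "NUMBA_DPEX_OFFLOAD_DIAGNOSTICS"


def set_offload_diagnostics(level: int) -> None:
    """Hypothetical helper: set the diagnostics level (1 = least, 4 = most detail)."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(level, bool) or not isinstance(level, int) or not 1 <= level <= 4:
        raise ValueError("diagnostics level must be an integer between 1 and 4")
    os.environ[DIAGNOSTICS_VAR] = str(level)


# Assumption: do this before importing numba_dpex.
set_offload_diagnostics(1)
print(DIAGNOSTICS_VAR, "=", os.environ[DIAGNOSTICS_VAR])
```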
7 changes: 7 additions & 0 deletions 0.20.0dev1/_sources/apidoc/modules.rst.txt
@@ -0,0 +1,7 @@
numba_dpex
==========

.. toctree::
   :maxdepth: 4

   numba_dpex
7 changes: 7 additions & 0 deletions 0.20.0dev1/_sources/apidoc/numba_dpex.codegen.rst.txt
@@ -0,0 +1,7 @@
numba\_dpex.codegen module
==========================

.. automodule:: numba_dpex.codegen
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions 0.20.0dev1/_sources/apidoc/numba_dpex.config.rst.txt
@@ -0,0 +1,7 @@
numba\_dpex.config module
=========================

.. automodule:: numba_dpex.config
   :members:
   :undoc-members:
   :show-inheritance: