Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor/kernel interfaces #804

Merged
merged 51 commits into from
Jan 18, 2023
Merged

Refactor/kernel interfaces #804

merged 51 commits into from
Jan 18, 2023

Conversation

diptorupd
Copy link
Contributor

@diptorupd diptorupd commented Oct 18, 2022

  • Have you provided a meaningful PR description?

    • The concept of a kernel was decoupled from the notion of dispatching a kernel to a device. The old implementation in numba_dpex\compiler.py mixes both things, making hard the separation of compute-follows-data based kernel launch from legacy dpctl.device_context based behavior.
    • Deprecates support for numpy arrays as kernel args.
    • Deprecates support for dpctl.device_context for kernels.
    • Deprecates support for the square bracket notation using __getitem__ to provide global and local ranges for a kernel launch. (to be reevaluated Deprecate __getitem__ support in numba_dpex.kernel #790)
    • Changes the behavior of specializing a kernel using only a signature. The new way to specialize will require a device type and a backend.
    • Add validations for ndrange kernel launch parameters.
    • Pass llvm bitcode to llvm-spirv using stdin rather than first serializing the bitcode to disk #862
    • Improvements to exception messages using custom exceptions.
    • The new API is now inside numba_dpex.core.kernel_interface.
    • Migrate DpexFunc to new API
    • Migrate parfor compilation to new API
    • Remove numba_dpex.compiler.py
    • Adds new unit test cases
    • Enable caching of kernels in new API
    • Doc strings
    • Update examples
    • Generate API docs
  • Have you added a test, reproducer or referred to an issue with a reproducer?

  • Have you tested your changes locally for CPU and GPU devices?

  • Have you made sure that new changes do not introduce compiler warnings?

  • If this PR is a work in progress, are you filing the PR as a draft?

Fixes #814, #816, #780, #810

@diptorupd diptorupd marked this pull request as draft October 18, 2022 03:59
@diptorupd diptorupd force-pushed the refactor/kernel_interfaces branch from 736bff3 to 6b179af Compare October 18, 2022 04:44
@diptorupd diptorupd added this to the 0.19.0 milestone Oct 18, 2022
@diptorupd diptorupd force-pushed the refactor/kernel_interfaces branch from 6b179af to d426eb2 Compare October 18, 2022 19:45
@diptorupd diptorupd force-pushed the refactor/kernel_interfaces branch 4 times, most recently from f489b60 to 921624e Compare October 23, 2022 18:39
@diptorupd diptorupd mentioned this pull request Oct 23, 2022
12 tasks
@diptorupd diptorupd force-pushed the refactor/kernel_interfaces branch from 2d98ded to b2ed65d Compare October 26, 2022 04:38
@diptorupd diptorupd force-pushed the refactor/kernel_interfaces branch 3 times, most recently from f0e4ed0 to 211d5d8 Compare November 9, 2022 02:03
@diptorupd
Copy link
Contributor Author

@mingjie-intel @chudur-budur All existing tests except caching work with the new API. I have updated the description to capture pending TODOs. Any early feedback will be very helpful.

numba_dpex/core/kernel_interface/spirv_kernel.py Outdated Show resolved Hide resolved
numba_dpex/core/kernel_interface/spirv_kernel.py Outdated Show resolved Hide resolved
numba_dpex/core/kernel_interface/spirv_kernel.py Outdated Show resolved Hide resolved
numba_dpex/core/kernel_interface/spirv_kernel.py Outdated Show resolved Hide resolved
extra_compile_flags=extra_compile_flags,
)

self._target_context = cres.target_context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diptorupd Why we are not doing the same as it has been done in _compile()?

@diptorupd diptorupd force-pushed the refactor/kernel_interfaces branch 3 times, most recently from f99df29 to b43201c Compare November 22, 2022 15:45
@diptorupd diptorupd force-pushed the refactor/kernel_interfaces branch 4 times, most recently from 27490c8 to 17676d2 Compare December 7, 2022 17:10
numba_dpex/compiler.py Outdated Show resolved Hide resolved
numba_dpex/core/_compile_helper.py Outdated Show resolved Hide resolved
numba_dpex/core/_compile_helper.py Outdated Show resolved Hide resolved
numba_dpex/core/_compile_helper.py Outdated Show resolved Hide resolved
np.copyto(obj._orig_val, obj._packed_val)

def __init__(
self, kernel_name, arg_list, argty_list, access_specifiers_list, queue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to use consistent variable naming, like pyfunc_name instead of kernel_name

compile_flags=None,
array_access_specifiers=None,
):
self.typingctx = dpex_target.typing_context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dpex_target is a global object coming from an external module, why do we always keep it in some local variable? Is there any specific reason for this?

)
func = cres.library.get_function(cres.fndesc.llvm_func_name)
cres.target_context.mark_ocl_device(func)
devfn = DpexFunction(cres)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to cache compiled func?

numba_dpex/core/kernel_interface/func.py Show resolved Hide resolved
numba_dpex/decorators.py Outdated Show resolved Hide resolved
@diptorupd diptorupd force-pushed the refactor/kernel_interfaces branch 2 times, most recently from 26c4334 to 4656193 Compare December 12, 2022 20:14
@github-actions
Copy link
Contributor

Documentation preview: show.

Diptorup Deb and others added 16 commits January 18, 2023 16:04
   - The compute follows data checking is now based on queue
     equality.
   - USMNdArray no longer requires usm_type and device
     during construction. It allows us to specialize an usm_ndarray
     only on ndims, layout and dtype.
   - No check for compute follows data for eager compilation.
   - Change caching to not require backend and device-type.
   - Fixes to test cases.
   - The DEFAULT_LOCAL_SIZE is deprecated and users warned to
     provided a valid local range for nd_range kernels.
   - Removed the global_range and local_range kw args from
     JitKernel.__call__().
   - Undeprecate the JitKernel.__getitem__ call.
   - Fix and improve how arguments to JitKernel.__call__() are
     parsed to extract the global_range and local_range.
@diptorupd diptorupd force-pushed the refactor/kernel_interfaces branch from 637a04d to c74ea24 Compare January 18, 2023 22:04
@github-actions
Copy link
Contributor

Documentation preview: show.

@diptorupd
Copy link
Contributor Author

Merging as TeamCity CI is all green.

@diptorupd diptorupd merged commit 187782d into main Jan 18, 2023
@diptorupd diptorupd deleted the refactor/kernel_interfaces branch January 18, 2023 23:41
@github-actions
Copy link
Contributor

Documentation preview removed.

github-actions bot added a commit that referenced this pull request Jan 18, 2023
fcharras added a commit to fcharras/numba-dpex that referenced this pull request Feb 3, 2023
I don't have a minimal reproducer yet but I can say that this fixes the issues I've had after IntelPython#804
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implementing a more robust caching mechanism
3 participants