Refactor/kernel interfaces #804
@mingjie-intel @chudur-budur All existing tests except caching work with the new API. I have updated the description to capture pending TODOs. Any early feedback will be very helpful.
extra_compile_flags=extra_compile_flags,
)

self._target_context = cres.target_context
@diptorupd Why are we not doing the same here as is done in _compile()?
np.copyto(obj._orig_val, obj._packed_val)

def __init__(
    self, kernel_name, arg_list, argty_list, access_specifiers_list, queue
I think we need to use consistent variable naming, for example pyfunc_name instead of kernel_name.
    compile_flags=None,
    array_access_specifiers=None,
):
    self.typingctx = dpex_target.typing_context
dpex_target is a global object coming from an external module. Why do we always keep it in a local variable? Is there a specific reason for this?
)
func = cres.library.get_function(cres.fndesc.llvm_func_name)
cres.target_context.mark_ocl_device(func)
devfn = DpexFunction(cres)
Do we need to cache the compiled function?
- The compute-follows-data check is now based on queue equality.
- USMNdArray no longer requires usm_type and device during construction. This allows us to specialize a usm_ndarray only on ndims, layout and dtype.
- No compute-follows-data check is done for eager compilation.
- Change caching to not require backend and device-type.
- Fixes to test cases.
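
As a rough illustration of the queue-equality rule in the first item above, the check can be sketched in plain Python. Everything here is hypothetical: `FakeQueue`, `FakeArray`, and `resolve_execution_queue` are stand-ins invented for this sketch and are not numba_dpex or dpctl APIs. Real arrays would carry a SYCL queue, and the actual check in this PR may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeQueue:
    """Hypothetical stand-in for a SYCL queue; equality compares all fields."""

    device: str
    queue_id: int


@dataclass
class FakeArray:
    """Hypothetical stand-in for a USM array that remembers its queue."""

    name: str
    queue: FakeQueue


def resolve_execution_queue(args):
    """Return the queue shared by all array arguments, or raise if they differ.

    Non-array arguments (scalars, for example) are ignored because they carry
    no queue of their own.
    """
    queues = [a.queue for a in args if isinstance(a, FakeArray)]
    if not queues:
        raise ValueError("no array argument to infer an execution queue from")
    first = queues[0]
    for q in queues[1:]:
        if q != first:
            raise ValueError("array arguments are associated with different queues")
    return first
```

Under this sketch, two arrays allocated on equal queues resolve to that queue, while mixing arrays from unequal queues is rejected before any launch is attempted.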
- The DEFAULT_LOCAL_SIZE is deprecated and users are warned to provide a valid local range for nd_range kernels.
- Removed the global_range and local_range kw args from JitKernel.__call__().
- Undeprecate the JitKernel.__getitem__ call.
- Fix and improve how arguments to JitKernel.__call__() are parsed to extract the global_range and local_range.
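
The launch syntax touched by this commit can be pictured with a small plain-Python sketch of how a `__getitem__` call might be split into a global and a local range. `KernelSketch` and `_normalize_range` are hypothetical names used for illustration only; this is not the actual `JitKernel` code, and the real parsing rules may differ.

```python
class KernelSketch:
    """Hypothetical launcher mimicking kernel[global_range, local_range](*args)."""

    def __init__(self, func):
        self.func = func
        self.global_range = None
        self.local_range = None

    @staticmethod
    def _normalize_range(value):
        # Accept a single int or an iterable of ints; always return a tuple.
        if isinstance(value, int):
            return (value,)
        return tuple(int(v) for v in value)

    def __getitem__(self, item):
        # Assumption of this sketch: a 2-tuple always means (global, local).
        # Anything else is treated as a global range with no local range.
        if isinstance(item, tuple) and len(item) == 2:
            global_part, local_part = item
        else:
            global_part, local_part = item, None

        self.global_range = self._normalize_range(global_part)
        self.local_range = (
            None if local_part is None else self._normalize_range(local_part)
        )
        if self.local_range is not None and len(self.local_range) != len(
            self.global_range
        ):
            raise ValueError("global and local ranges must have the same rank")
        return self

    def __call__(self, *args):
        # No range keyword arguments here: ranges come only from __getitem__.
        if self.global_range is None:
            raise ValueError("set a global range with kernel[...] before calling")
        # A real launcher would submit the kernel; the sketch only reports
        # what it would launch.
        return self.global_range, self.local_range, args
```

With this sketch, `k[1024](a, b)` launches with a global range only, and `k[(8, 8), (2, 2)](a, b)` supplies both ranges. One consequence of the 2-tuple convention assumed here is that a two-dimensional global range without a local range has to be passed as a list, for example `k[[8, 8]]`.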
Merging as TeamCity CI is all green.
Refactor/kernel interfaces 187782d
I don't have a minimal reproducer yet but I can say that this fixes the issues I've had after IntelPython#804
Have you provided a meaningful PR description?
numba_dpex\compiler.py mixes both things, making it hard to separate the compute-follows-data based kernel launch from the legacy dpctl.device_context based behavior.
dpctl.device_context for kernels.
__getitem__ to provide global and local ranges for a kernel launch. (To be reevaluated: Deprecate __getitem__ support in numba_dpex.kernel #790)
numba_dpex.core.kernel_interface.
DpexFunc to new API
numba_dpex.compiler.py
Have you added a test, reproducer or referred to an issue with a reproducer?
Have you tested your changes locally for CPU and GPU devices?
Have you made sure that new changes do not introduce compiler warnings?
If this PR is a work in progress, are you filing the PR as a draft?
Fixes #814, #816, #780, #810