Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support kDLCPU DLDeviceType for from_dlpack and __dlpack__ #1781

Merged
merged 22 commits into from
Aug 13, 2024

Conversation

ndgrigorian
Copy link
Collaborator

@ndgrigorian ndgrigorian commented Aug 2, 2024

This PR proposes supporting the kDLCPU DLDeviceType as a targeted device in from_dlpack and __dlpack__.

This is accomplished by __dlpack__ by copying the array data to the host, wrapping it in a NumPy array, and then returning a capsule for this NumPy array.

In from_dlpack, a NumPy array is also returned for devices with __dlpack_device__= (DLDeviceType.kDLCPU, 0). This means that from_dlpack(x_np) for some NumPy array x_np is effectively a no-op with extra validation steps returning a new Python object wrapping the same memory with zero-copy.

New utility functions numpy_to_dlpack_versioned_capsule, _numpy_array_interface_from_dl_tensor, and the class _numpy_array_interface_wrapper are introduced to enable the new functionality.

This PR also refactors from_dlpack_versioned_capsule into from_dlpack_capsule.

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • If this PR is a work in progress, are you opening the PR as a draft?

Leverages NumPy to create an array with Python interpreter/host-accessible memory, which is required by the 2023.12 array API specification
Refactors `from_dlpack_versioned_capsule` into `from_dlpack_capsule`

`_numpy_array_interface_from_dl_tensor` now takes a pointer to a DLTensor to guarantee no copy is made
When using cimport with Numpy, a warning for a deprecated C-API is thrown

NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION is now defined for targets in `build_dpctl_ext` to address this warning
Copy link

github-actions bot commented Aug 2, 2024

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

Copy link

github-actions bot commented Aug 2, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_239 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

@coveralls
Copy link
Collaborator

coveralls commented Aug 2, 2024

Coverage Status

coverage: 87.883% (-0.05%) from 87.93%
when pulling 9a17afc on feature/dlpack-kdlcpu-support
into ff0d4ea on master.

Instead cdef from `numpy/npy_no_deprecated_api.h`
Copy link

github-actions bot commented Aug 3, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_240 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

_is_host_cpu(dl_device) checks if user request export for
host CPU device. Recognized inputs are

(1, 0)
(_usmarray.DLDeviceType.kDLCPU, 0)
("kDLCPU", 0)

Add test to exercise __dlpack__ with non-default dl_device
keyword arguments
Introduce _is_host_cpu utility predicate used in __dlpack__
Copy link

github-actions bot commented Aug 5, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_244 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

@ndgrigorian ndgrigorian marked this pull request as ready for review August 5, 2024 21:55
Copy link

github-actions bot commented Aug 5, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_248 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

@ndgrigorian ndgrigorian force-pushed the feature/dlpack-kdlcpu-support branch from 5847b70 to 1544e9b Compare August 6, 2024 00:21
Copy link

github-actions bot commented Aug 6, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_249 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

Avoids possible out-of-bounds access by short-circuiting the if statement
@ndgrigorian ndgrigorian force-pushed the feature/dlpack-kdlcpu-support branch from 250f9a9 to 70c6772 Compare August 6, 2024 03:56
Copy link

github-actions bot commented Aug 6, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_261 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

Passing the used Python capsule to `_numpy_array_interface_wrapper` would cause premature deallocation of the newly created DLPack owner object, causing the dl_tensor to be prematurely deleted

Now pass the new owner object instead
Copy link

github-actions bot commented Aug 6, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_268 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

@oleksandr-pavlyk
Copy link
Collaborator

I think we must raise an exception on attempt to import np.ndarray instance to device=(kDLoneAPI, num), or support it properly. At the moment, we silently return value the np.ndarray back:

In [1]: import dpctl.tensor as dpt, numpy as np

In [2]: x = dpt.ones((2, 3))

In [3]: y = dpt.from_dlpack(x, device=(1,0))

In [4]: type(y)
Out[4]: numpy.ndarray

In [5]: z = dpt.from_dlpack(y)

In [6]: type(z)
Out[6]: numpy.ndarray

In [7]: dpt.from_dlpack(z, device=x.__dlpack_device__())
Out[7]:
array([[1., 1., 1.],
       [1., 1., 1.]], dtype=float32)

In [8]: type(_)
Out[8]: numpy.ndarray

In [9]: x.__dlpack_device__()
Out[9]: (<DLDeviceType.kDLOneAPI: 14>, 2)

I also think we ought to make DLDeviceType enum documented and accessible in a public namespace. I think dpctl.tensor is appropriate.

@oleksandr-pavlyk
Copy link
Collaborator

Actually, we must support importing dlpack capsules with kDLCPU to allow copying from an arbitrary importer (that supports exporting to "kDLCPU") via host.

@ndgrigorian
Copy link
Collaborator Author

I also think we ought to make DLDeviceType enum documented and accessible in a public namespace. I think dpctl.tensor is appropriate.

In working on this, I realized, a page with DLDeviceType enum documented would be the perfect place for other constants (dpt.pi, dpt.e, etc.)

Copy link

github-actions bot commented Aug 8, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_281 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

Simplifies check for device_type = kDLCPU and device_id = 0
Copy link

github-actions bot commented Aug 9, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_287 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

Better fits Cython style by using array-index-access to explicitly dereference pointer rather than relying on Cython to implicitly dereference it in code generation
Copy link

github-actions bot commented Aug 9, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_292 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

For arr that supports DLPack, version (1, 0),
or legacy, support

```
from_dlpack(arr, device=target_dev)
```

where target_dev is `(kDLCPU, 0)` for transfer to
host, or a value recognized by device keywords
in dpctl.tensor for other functions, or `(kDLOneAPI, dev_id)`.

To support transfer via host, `arr` must support
`__dlpack__(max_version=(1,0), dl_device=(1, 0))`.

For array objects with legacy `__dlpack__` support only,
supported inputs are those residing on kDLCPU device, or
those from kDLOneAPI device only.

---

This is a combination of 17 commits squashed into one:

Combine two validation checks into one, improving coverage

Only fall-back to __dlpack__() if requested device does not change

Simplify branching, only fall-back to no-arg call to __dlpack__ is
dl_device is None or same as reported for the input

Changed from_dlpack to copy via host is needed

This enables dpt.from_dlpack(numpy_array, device="opencl:cpu")

Add a test to exercise copy via host

Handle possibilities for TypeError and BufferError

These may be hard to test

Change exception raised by __dlpack__ if dl_device is unsupported

It used to raise NotImplementedError, not raises BufferError

Add case of dlpack test to expand coverage

Removed comment, add NotImplementedError to the except clause

To ensure same validation across branches, compute host_blob by roundtripping it through dlpack

Test from_dlpack on numpy input with strides not multiple of elementsize

Refined from_dlpack docstrings, reorged impl of from_dlpack

Used try/except/else/finally to avoid raising an exception when
another one is in flight (confusing UX).

device keyword is only allowed to be (kDLCPU, 0) or
(kDLOneAPI, num). Device keyword value is used to create output
array, rather than device_id deduced from it.

Adjusted test per change in implementation

Expand applicability of fall-back behavior

When `from_dlpack(arr, device=dev)` is called, for `arr` object
that supports legacy DLPack interface (max_version, dl_device, copy
are not supported), we now support arr being device on host, that
is (kDLCPU, 0), and (kDLOneAPI, different_device_id). Support for
this last case is being added in this commit, as per review comment.

Add symmetric support for containers with legacy DLPack support

For legacy containers, support device=(kDLCPU, 0) as well as
oneAPI device.

Add tests for importing generic legacy and generic modern containers

Fix typos in comments

Add test for legacy container holding numpy's array.
Enhance from_dlpack to support imported kDLCPU data to kDLOneAPI
@oleksandr-pavlyk oleksandr-pavlyk self-requested a review August 13, 2024 17:58
Copy link
Collaborator

@oleksandr-pavlyk oleksandr-pavlyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @ndgrigorian for working on this. It looks good to me, so I am approving.

I am going to ping @seberg to bring to his attention code for producing/consuming DLManagedTensorVersioned capsule in dpctl/tensor/_dlpack.pyx.

We merge the PR though and address his feedback, if any, later.

Copy link

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_303 ran successfully.
Passed: 895
Failed: 0
Skipped: 119

@ndgrigorian ndgrigorian merged commit 52edb6d into master Aug 13, 2024
42 of 49 checks passed
@ndgrigorian ndgrigorian deleted the feature/dlpack-kdlcpu-support branch August 13, 2024 20:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants