Cached device store #1076

Merged
merged 6 commits into from
Feb 20, 2023

oleksandr-pavlyk
Collaborator

@oleksandr-pavlyk oleksandr-pavlyk commented Feb 17, 2023

This PR creates a dedicated queue cache keyed by the (context, device) pair.

Using this cache in dpctl.tensor.Device, dpctl.tensor.from_dlpack and dpctl.memory makes them return the same queue for a given device, which streamlines the user experience with the compute-follows-data model.

(dev_dpctl) opavlyk@opavlyk-mobl:~/repos/dpctl$ ipython
Python 3.9.12 (main, Jun  1 2022, 11:38:51)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import dpctl.tensor as dpt

In [2]: dpt.Device.create_device("gpu") == dpt.Device.create_device("gpu")
Out[2]: True

In [3]: x = dpt.linspace(0, 3 + 2j, num=1000)

In [4]: y = dpt.from_dlpack(x)

In [5]: x.sycl_queue == y.sycl_queue
Out[5]: True

In [6]: dpt.Device.create_device("level_zero:gpu") == dpt.Device.create_device("opencl:gpu")
Out[6]: False

In [7]: dpt.Device.create_device("level_zero:gpu") == dpt.Device.create_device("gpu")
Out[7]: True
(dev_dpctl) opavlyk@opavlyk-mobl:~/repos/dpctl$ ipython
Python 3.9.12 (main, Jun  1 2022, 11:38:51)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import dpctl.tensor as dpt

In [2]: import dpctl.memory as dpm

In [3]: m1 = dpm.MemoryUSMShared(100)

In [4]: m2 = dpm.MemoryUSMShared(100)

In [5]: m1.sycl_queue == m2.sycl_queue
Out[5]: True

In [6]: x = dpt.linspace(0, 3 + 2j, num=1000)

In [7]: y = dpt.from_dlpack(x)

In [8]: x.sycl_queue == y.sycl_queue
Out[8]: True
  • Have you provided a meaningful PR description?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?

This function caches queues by (context, device) key.
The cache is stored in a contextvars.ContextVar variable, applying
the lessons learned from issue gh-11.

get_device_cached_queue(dev : dpctl.SyclDevice) -> dpctl.SyclQueue
get_device_cached_queue(
     (ctx: dpctl.SyclContext, dev : dpctl.SyclDevice)
) -> dpctl.SyclQueue

The function retrieves the queue from the cache, or adds a new queue
instance there if none was previously cached.
The queue for the default-selected device is looked up from the cache.
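
The caching behaviour described in the commit messages above can be sketched roughly as follows. This is a minimal illustration, not the dpctl implementation: `FakeQueue`, `_queue_cache` and `get_cached_queue` are hypothetical stand-ins, and plain strings take the place of real context and device objects. It only shows the idea of a dictionary keyed by (context, device) that lives in a `contextvars.ContextVar`.

```python
import contextvars


class FakeQueue:
    """Hypothetical stand-in for a queue object; holds its context and device."""

    def __init__(self, ctx, dev):
        self.ctx = ctx
        self.dev = dev


# The cache dictionary is stored in a ContextVar rather than a plain global.
_queue_cache = contextvars.ContextVar("queue_cache")


def get_cached_queue(ctx, dev):
    """Return the cached queue for (ctx, dev), creating it on first request."""
    try:
        cache = _queue_cache.get()
    except LookupError:
        # No cache bound in the current context yet: create and bind one.
        cache = {}
        _queue_cache.set(cache)
    key = (ctx, dev)
    if key not in cache:
        cache[key] = FakeQueue(ctx, dev)
    return cache[key]


q1 = get_cached_queue("ctx0", "gpu0")
q2 = get_cached_queue("ctx0", "gpu0")
q3 = get_cached_queue("ctx0", "cpu0")
print(q1 is q2)  # same key, same queue object
print(q1 is q3)  # different device, different queue object
```

Repeated requests for the same key hand back the same object, which is the property the sessions above check with `x.sycl_queue == y.sycl_queue`.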
@coveralls
Collaborator

coveralls commented Feb 17, 2023

Coverage Status

Coverage: 82.296% (+0.03%) from 82.269% when pulling 0ed6e67 on cached-device-store into 0a8abd8 on master.

@github-actions

Array API standard conformance tests for dpctl=0.14.1dev2=py310h41425db_21 ran successfully.
Passed: 33
Failed: 801
Skipped: 280

@github-actions

Array API standard conformance tests for dpctl=0.14.1dev2=py310h41425db_29 ran successfully.
Passed: 33
Failed: 801
Skipped: 280

Collaborator

@antonwolfy antonwolfy left a comment

I've tested together with dpnp and it works fine. Thank you! LGTM!

Collaborator

@vlad-perevezentsev vlad-perevezentsev left a comment

Great job! Thank you @oleksandr-pavlyk

@oleksandr-pavlyk oleksandr-pavlyk merged commit 0304cdc into master Feb 20, 2023
@oleksandr-pavlyk oleksandr-pavlyk deleted the cached-device-store branch February 20, 2023 19:00
@github-actions

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

@diptorupd
Contributor

diptorupd commented Feb 20, 2023

@oleksandr-pavlyk the API should also allow passing in a filter string; otherwise, using this function from inside Numba will be challenging, as we will need full typing support for SyclDevice in Numba/numba-dpex.

I do not expect numba-dpex to use this functionality. usm_ndarray ctors do permit using filter selector strings. Or am I missing the point?

@github-actions

Array API standard conformance tests for dpctl=0.14.1dev2=py310h41425db_31 ran successfully.
Passed: 33
Failed: 801
Skipped: 280

@diptorupd
Contributor

This is the use case I had in mind:

@numba_dpex.dpjit
def foo():
    a = dpnp.ones(1024, device="gpu")
    b = dpnp.ones(1024, device="gpu")
    return a + b

To support these constructors inside the dpjit function, we will need to cache the queue. Will it be possible to use the cache in dpctl for that purpose?

@oleksandr-pavlyk
Collaborator Author

oleksandr-pavlyk commented Feb 20, 2023

@diptorupd Yes, and I can certainly add support for a string argument, but without boxing/unboxing support for the dpctl.SyclQueue object it would not be possible to support arrays allocated on sub-devices.
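
One way the string-argument support discussed here could look is sketched below. It is only an assumption about the shape of such an API, not code from this PR: `RESOLVED_DEVICES`, `resolve_filter_string` and `queue_for` are hypothetical names, and the lookup table is made-up illustration data standing in for real device selection. The point is that a filter string is first resolved to a device, and the resolved device is then used as the cache key, so different strings that select the same device share one queue.

```python
# Hypothetical table standing in for real device selection:
# several filter strings may resolve to the same device.
RESOLVED_DEVICES = {
    "gpu": "level_zero:gpu:0",
    "level_zero:gpu": "level_zero:gpu:0",
    "opencl:gpu": "opencl:gpu:0",
}

# Cache keyed by the resolved device, not by the string the caller passed.
_queues_by_device = {}


def resolve_filter_string(filter_string):
    """Map a filter string to a device identifier (illustrative lookup only)."""
    try:
        return RESOLVED_DEVICES[filter_string]
    except KeyError:
        raise ValueError(f"no device matches {filter_string!r}") from None


def queue_for(filter_string):
    """Return the cached queue stand-in for the device the string selects."""
    dev = resolve_filter_string(filter_string)
    if dev not in _queues_by_device:
        # object() stands in for constructing a real queue for this device.
        _queues_by_device[dev] = object()
    return _queues_by_device[dev]


print(queue_for("gpu") is queue_for("level_zero:gpu"))  # both select one device
print(queue_for("gpu") is queue_for("opencl:gpu"))      # different devices
```

A string key of this kind cannot name a sub-device, which is consistent with the limitation noted in the comment above.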

@diptorupd
Contributor

without boxing/unboxing support for dpctl.SyclQueue object it would not be possible to support arrays allocated on sub-devices.

yes, that is on my next set of todos
