Created a temporary copy in case of overlap for unary function #1281

Merged
merged 6 commits into master from unary_out_overlap
Jul 17, 2023

@antonwolfy antonwolfy commented Jul 13, 2023

This PR proposes allocating a temporary buffer, rather than raising an exception, when memory overlap is detected between the input and output arrays in a call to a unary function.

The change is intended to make the code example below valid:

import dpctl.tensor as dpt

a = dpt.arange(10, dtype='f4')
_ = dpt.sqrt(a, out=a)

The temporary-buffer approach was chosen as a starting point for supporting this use case, since it is the easiest to implement.
A possible next step is to add separate kernels for in-place unary operations (when the input and output arrays point to the same memory), if the performance gain is significant.
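
The temporary-buffer fallback described above can be sketched in plain Python. This is only an illustration of the idea, not dpctl's implementation: the `View` class, `overlaps`, and `unary_apply` are hypothetical names, and a Python list stands in for USM memory.

```python
from dataclasses import dataclass
import math


@dataclass
class View:
    """Hypothetical 1-D strided view over a shared list (stand-in for USM memory)."""
    buf: list
    offset: int
    length: int
    stride: int = 1

    def indices(self):
        # Positions in `buf` addressed by this view, in logical order.
        return [self.offset + i * self.stride for i in range(self.length)]


def overlaps(a: View, b: View) -> bool:
    """True if both views share a buffer and address at least one common element."""
    return a.buf is b.buf and bool(set(a.indices()) & set(b.indices()))


def unary_apply(fn, src: View, dst: View) -> None:
    """Write fn(src[i]) into dst[i], staging through a temporary on overlap."""
    if src.length != dst.length:
        raise ValueError("src and dst must have the same length")
    if overlaps(src, dst):
        # Compute every result first, then copy, so no input is clobbered early.
        tmp = [fn(src.buf[i]) for i in src.indices()]
        for j, value in zip(dst.indices(), tmp):
            dst.buf[j] = value
    else:
        for i, j in zip(src.indices(), dst.indices()):
            dst.buf[j] = fn(src.buf[i])


# Analogue of dpt.sqrt(a, out=a): input and output are the same view.
buf = [float(i) for i in range(10)]
whole = View(buf, offset=0, length=10)
unary_apply(math.sqrt, whole, whole)

# Shifted views over one buffer: writing directly would corrupt later inputs.
shifted = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
unary_apply(lambda v: v * 2, View(shifted, 0, 5), View(shifted, 1, 5))
```

The shifted case shows why some protection is needed at all: without the temporary, writing the first output element would overwrite the second input element before it is read.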

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • If this PR is a work in progress, are you opening the PR as a draft?

@antonwolfy antonwolfy self-assigned this Jul 13, 2023

coveralls commented Jul 13, 2023

Coverage Status

Changes unknown when pulling 03a46e1 on unary_out_overlap into master.

@github-actions

Array API standard conformance tests for dpctl=0.14.5dev1=py310h7bf5fec_7 ran successfully.
Passed: 448
Failed: 552
Skipped: 119

@github-actions

Array API standard conformance tests for dpctl=0.14.5dev1=py310h7bf5fec_12 ran successfully.
Passed: 449
Failed: 551
Skipped: 119

The call operator of this struct verifies whether two USM ND-arrays
logically address the same memory elements. When a data-parallel
operation reads from and writes to arrays that logically address
the same memory elements, there is no race condition and no additional
copying is needed.

The predicate determines whether the argument arrays are the same
(same dimension, shape, data type, pointer, strides). It is used
to determine whether copying must be performed in case of overlap
to avoid a race condition.

If the out array is logically the same as the input array, there is no
race condition, so avoid performing the temporary copy.
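
A minimal sketch of the predicate described in these commit messages, assuming a simplified array descriptor. `ArrayDesc`, `same_logical_array`, and `needs_temporary` are hypothetical names chosen for illustration; the actual check in the PR is a C++ struct whose details are not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayDesc:
    """Hypothetical descriptor of a strided array (strides counted in elements)."""
    ptr: int
    shape: tuple
    strides: tuple
    dtype: str


def same_logical_array(a: ArrayDesc, b: ArrayDesc) -> bool:
    """True if both descriptors address the same elements in the same order."""
    return (
        a.ptr == b.ptr
        and a.dtype == b.dtype
        and a.shape == b.shape  # equal shapes imply equal dimension
        and a.strides == b.strides
    )


def needs_temporary(src: ArrayDesc, dst: ArrayDesc, overlap: bool) -> bool:
    """A temporary is needed only for overlapping arrays that are not identical."""
    return overlap and not same_logical_array(src, dst)


itemsize = 4  # bytes per 'f4' element
a = ArrayDesc(ptr=0x1000, shape=(10,), strides=(1,), dtype="f4")
same = ArrayDesc(ptr=0x1000, shape=(10,), strides=(1,), dtype="f4")
# A reversed view of `a`: starts at the last element and walks backwards.
rev = ArrayDesc(ptr=0x1000 + 9 * itemsize, shape=(10,), strides=(-1,), dtype="f4")
```

With identical descriptors, each work-item reads and writes only its own element, so the copy can be skipped. The reversed view overlaps the same memory but pairs element `i` with element `9 - i`, so it still goes through the temporary.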
@github-actions

Array API standard conformance tests for dpctl=0.14.5dev1=py310h7bf5fec_15 ran successfully.
Passed: 448
Failed: 552
Skipped: 119

@github-actions

Array API standard conformance tests for dpctl=0.14.5dev1=py310h7bf5fec_15 ran successfully.
Passed: 449
Failed: 551
Skipped: 119

@github-actions

Array API standard conformance tests for dpctl=0.14.5dev1=py310h7bf5fec_18 ran successfully.
Passed: 447
Failed: 553
Skipped: 119

@oleksandr-pavlyk oleksandr-pavlyk merged commit d3ce80e into master Jul 17, 2023
@oleksandr-pavlyk oleksandr-pavlyk deleted the unary_out_overlap branch July 17, 2023 19:16
@github-actions

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

@github-actions

Array API standard conformance tests for dpctl=0.14.5dev1=py310h7bf5fec_18 ran successfully.
Passed: 448
Failed: 552
Skipped: 119


4 participants