Optimized in-place operators for rows and matrices #1244
- In-place operations from C-contiguous rows into C-contiguous matrices show significant performance improvements
View rendered docs @ https://intelpython.github.io/dpctl/pulls/1244/index.html
Array API standard conformance tests for dpctl=0.14.3=py310h7bf5fec_4 ran successfully.
Thanks!
Array API standard conformance tests for dpctl=0.14.3=py310h7bf5fec_5 ran successfully.
Thank you @ndgrigorian
LGTM!
Thank you @ndgrigorian
- Using _type_utils._can_cast prevents failures on platforms where the maximal precision type may not be 64-bit
47b2921
Array API standard conformance tests for dpctl=0.14.3=py310h7bf5fec_6 ran successfully.
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞
This PR significantly improves the performance of in-place operations between C-contiguous rows and C-contiguous matrices by implementing a dedicated kernel variant, similar to the one used for standard binary operations.
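For illustration only, here is a pure-Python sketch of the indexing shortcut that a row-into-matrix specialization can exploit. It is not the dpctl kernel; the function name `inplace_add_row_to_matrix` and the flat-list representation are hypothetical and chosen just to show the idea. When both operands are C-contiguous, the column of flat element `i` is `i % n_cols`, so the row operand can be addressed without general stride arithmetic.

```python
def inplace_add_row_to_matrix(mat_flat, row, n_rows):
    """Add `row` to every row of a C-contiguous matrix stored as a flat list.

    Hypothetical helper for illustration; it mutates `mat_flat` in place.
    """
    n_cols = len(row)
    if len(mat_flat) != n_rows * n_cols:
        raise ValueError("matrix size does not match n_rows * len(row)")
    # In C-contiguous layout, flat index i lies in column i % n_cols,
    # so the matching row element is found with a single modulo.
    for i in range(n_rows * n_cols):
        mat_flat[i] += row[i % n_cols]
    return mat_flat


# A 2 x 3 matrix stored row by row, and a row of length 3.
mat = [1, 2, 3,
       4, 5, 6]
row = [10, 20, 30]
inplace_add_row_to_matrix(mat, row, n_rows=2)
print(mat)  # [11, 22, 33, 14, 25, 36]
```

The loop is sequential here, but each iteration is independent of the others, which is the property that lets a device kernel assign one flat index (or a small vector of them) to each work-item.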