[ROCm] Improve softmax performance. #1740
base: release/2.4
@doru1004 UTs that reference softmax: https://github.com/search?q=repo%3Apytorch%2Fpytorch+path%3A%2F%5Etest%5C%2F%2F+softmax&type=code After that, you can also try running the full PyTorch UT suite.
Jenkins build for 211a9130ea8724d3b356368e73fd64a9a3157c83 commit finished as FAILURE Detected error during Pytorch building:
@jerrymannil , |
LGTM.
Once it is confirmed that all UTs pass, we can merge it.
This patch improves the performance of softmax for 2D tensors by:
The impact on numerical accuracy is within 1e-5 for half precision and 1e-7 for full precision.
On MI300X, the performance improvement is between 22% and 50% over current runtimes.
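As a rough illustration of how an accuracy bound like the one above could be checked, here is a minimal pure-Python sketch. It is not the kernel from this patch: `softmax_rows`, `naive_softmax_rows`, and `max_abs_diff` are hypothetical helper names, the input is made up, and the 1e-7 tolerance is borrowed from the description only as an example threshold. The idea is simply to compute softmax over each row of a 2D input in two different ways and compare the largest element-wise difference against a tolerance.

```python
import math


def softmax_rows(matrix):
    """Numerically stable softmax over each row of a 2D list (hypothetical helper)."""
    result = []
    for row in matrix:
        row_max = max(row)  # subtract the row max so exp() cannot overflow
        exps = [math.exp(x - row_max) for x in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def naive_softmax_rows(matrix):
    """Reference softmax without max subtraction; only safe for small inputs."""
    result = []
    for row in matrix:
        exps = [math.exp(x) for x in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 2D lists."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Made-up 2D input; a real check would use the tensors from the unit tests.
sample = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]]
diff = max_abs_diff(softmax_rows(sample), naive_softmax_rows(sample))
within_tolerance = diff <= 1e-7  # example threshold taken from the description
```

In an actual PyTorch setting the same comparison would be done on tensors, with the old kernel's output as the reference and a separate tolerance per dtype.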