[SYCL][CUDA] Return invalid subgroup size warning #6183
Signed-off-by: JackAKirk <[email protected]>
Could you add a test case for this please?
I would like to point out an issue with this approach. Sometimes we might want to compile a single binary for multiple targets (with different sub-group sizes) and choose the proper kernel type at runtime. This PR breaks such a workflow because it will throw a compilation error whenever we have a non-32-wide kernel and an NVPTX target, even if we don't want to use them together. A toy example: https://gist.github.com/al42and/7e580e2202bcd28425c473cb04c8fb02. The compilation string is in the first line. Works fine with the current
EDIT: the problem could be avoided by #5562, BTW :)
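The multi-target workflow described in this comment can be sketched in plain C++ (no SYCL headers, so it is only a model of the pattern, not the code from the gist). The names `run_kernel` and `dispatch` are hypothetical; in real SYCL code each instantiation would carry a `reqd_sub_group_size` attribute and the supported sizes would be queried from the device.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a kernel specialised on sub-group size.
// Every instantiation below is compiled for every target in the binary,
// even though only one of them is ever launched on a given device.
template <int SubGroupSize>
std::string run_kernel() {
    return "kernel<" + std::to_string(SubGroupSize) + ">";
}

// Pick the widest preferred size that the device reports as supported
// and dispatch to the matching instantiation.
// Returns an empty string when none of the compiled sizes is supported.
inline std::string dispatch(const std::vector<int>& supported) {
    auto has = [&](int n) {
        return std::find(supported.begin(), supported.end(), n) != supported.end();
    };
    if (has(32)) return run_kernel<32>();
    if (has(16)) return run_kernel<16>();
    if (has(8)) return run_kernel<8>();
    return {};
}
```

Because the 16-wide and 8-wide instantiations are still compiled for the NVPTX target, a hard compile-time error on any non-32 size would reject this binary even though the runtime check never selects those kernels on a CUDA device.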
Thanks for pointing this out: I think you are right, and this use case means this PR is not a good approach.
I think it might be helpful to have a compile-time diagnostic; could it just be a warning instead of an error? Compiling for multiple architectures might be niche (at least now), so having a warning could be helpful for many users, even if a few have to silence it. It would also be nice if the warning was only triggered when NVPTX is the only backend, but I suspect checking that can be non-trivial with the compilation flow used.
Thanks. I think this is a good suggestion. I will look into it.
I feel a warning message is helpful.
Yes, I'm not sure of the best way to check this. I think that the warning could also be useful when NVPTX is not the only backend: not all applications that compile for multiple architectures will correctly account for the warning, and if they have accounted for the warning then they will know they can safely ignore it. I think that I will apply your first suggestion and just switch this error to a warning. I will also add a test at this point: I guess that adding a warning is probably not going to be contentious.
Do you mean "reqd_sub_group_size(32)"?
It was my typo. Thank you for your answer.
Yep, done.
Thanks!
Please update the PR description. It still says an error is generated.
I've updated the description.
This is a solution to #6103 for the CUDA case only. HIP AMD case still needs to be considered as discussed here: #6103 (comment).
CUDA only currently supports one subgroup (warp) size: 32 for all devices.
This PR introduces a solution to #6103 appropriate for backends which only support a single subgroup size: if the optional kernel attribute reqd_sub_group_size() is used with the supported subgroup size then it will compile and behave as the programmer intends. If reqd_sub_group_size() is used with another incompatible subgroup size a warning is returned when compiling, such as:
Signed-off-by: JackAKirk [email protected]
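The rule in the description (the single supported size compiles silently, any other size produces a warning rather than an error) can be modelled with a small plain C++ sketch. This is not the compiler's implementation: `Severity`, `Diagnostic` and `check_reqd_sub_group_size` are hypothetical names, and the message string is placeholder wording, not the text the compiler actually emits.

```cpp
#include <string>

// Hypothetical diagnostic levels for this sketch; there is deliberately
// no Error level, mirroring the switch from an error to a warning.
enum class Severity { None, Warning };

struct Diagnostic {
    Severity severity;
    std::string message;  // placeholder wording, not the real compiler output
};

// The description states CUDA supports a single subgroup (warp) size of 32.
constexpr int kCudaSubGroupSize = 32;

// Model of the check: a matching size passes silently, anything else
// yields a warning so that multi-target builds still compile.
inline Diagnostic check_reqd_sub_group_size(int requested) {
    if (requested == kCudaSubGroupSize) {
        return {Severity::None, ""};
    }
    return {Severity::Warning,
            "requested sub-group size " + std::to_string(requested) +
                " differs from the supported size " +
                std::to_string(kCudaSubGroupSize)};
}
```

Returning a warning instead of failing keeps the single-binary, multi-target workflow from the conversation compiling, while still flagging kernels that cannot run as written on a CUDA device.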