[SYCL][CUDA] Return invalid subgroup size warning #6183

JackAKirk · 2022-05-23T08:53:43Z

This is a solution to #6103 for the CUDA case only. The HIP AMD case still needs to be considered, as discussed here: #6103 (comment).

CUDA currently supports only one subgroup (warp) size, 32, on all devices.
This PR introduces a solution to #6103 appropriate for backends that support only a single subgroup size. If the optional kernel attribute reqd_sub_group_size() is used with the supported subgroup size, the kernel compiles and behaves as the programmer intends. If reqd_sub_group_size() is used with any other subgroup size, the compiler emits a warning and ignores the attribute, for example:

reqd-sub-group-size-cuda.cpp:12:73: warning: attribute argument 8 is invalid and will be ignored; CUDA requires sub_group size 32 [-Wcuda-compat]
    h.single_task<class invalid_kernel>([=] [[sycl::reqd_sub_group_size(8)]] {});
                                                                        ^

Signed-off-by: JackAKirk <[email protected]>
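
The rule described above can be modeled in a few lines of plain C++. This is only a minimal sketch and not the actual Clang Sema change: the `Target` enum, `kCudaSubGroupSize`, and `check_reqd_sub_group_size` are hypothetical names introduced for illustration, and only the message text is taken from the warning quoted above.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the compilation target; not a Clang type.
enum class Target { NVPTX, Other };

// The only sub-group (warp) size CUDA supports, per the PR description.
constexpr unsigned kCudaSubGroupSize = 32;

// Returns the warning text if the attribute argument would be ignored,
// or std::nullopt if the attribute is accepted as written.
std::optional<std::string> check_reqd_sub_group_size(Target target,
                                                     unsigned requested) {
  if (target != Target::NVPTX || requested == kCudaSubGroupSize)
    return std::nullopt;
  return "attribute argument " + std::to_string(requested) +
         " is invalid and will be ignored; CUDA requires sub_group size " +
         std::to_string(kCudaSubGroupSize);
}
```

In this model, a request for 32 on the NVPTX target passes silently, a request for 8 yields the warning text, and any request on another target is left alone.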

premanandrao · 2022-05-23T12:39:13Z

Could you add a test case for this please?

al42and · 2022-05-23T13:28:38Z

I would like to point out an issue with this approach. Sometimes we want to compile a single binary for multiple targets (with different sub-group sizes) and choose the proper kernel variant at runtime. This PR breaks that workflow: it raises a compilation error whenever a non-32-wide kernel is compiled for an NVPTX target, even if the two are never used together.

A toy example: https://gist.github.com/al42and/7e580e2202bcd28425c473cb04c8fb02. The compilation command is on the first line. It works fine with the current sycl branch but does not compile with this PR.

EDIT: the problem could be avoided by #5562, BTW :)
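
The runtime selection step of such a multi-target workflow might look roughly like the sketch below. It is not the code from the gist: `kCompiledWidths` and `pick_sub_group_size` are hypothetical names, the "prefer the widest supported width" policy is an assumption, and the device query is replaced by a plain vector so the logic can be read on its own.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <vector>

// Sub-group widths for which this (hypothetical) binary contains a kernel
// variant, e.g. kernels templated on the required sub-group size.
constexpr std::size_t kCompiledWidths[] = {8, 16, 32};

// Picks the widest compiled variant that the device supports.
// `device_sizes` stands for what a runtime query would report
// (in SYCL 2020: device.get_info<sycl::info::device::sub_group_sizes>()).
std::optional<std::size_t>
pick_sub_group_size(const std::vector<std::size_t> &device_sizes) {
  std::optional<std::size_t> best;
  for (std::size_t width : kCompiledWidths) {
    bool supported = std::find(device_sizes.begin(), device_sizes.end(),
                               width) != device_sizes.end();
    if (supported && (!best || width > *best))
      best = width;
  }
  return best;
}
```

With this kind of dispatch, the 8- and 16-wide variants are present in the binary but never launched on a device that reports only 32, which is why a hard compile-time error for them would be too strict.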

JackAKirk · 2022-05-23T13:36:29Z

Thanks for pointing this out: I think you are right and this use case makes this PR not a good approach.

al42and · 2022-05-23T13:44:26Z

> this use case makes this PR not a good approach.

I think a compile-time diagnostic could still be helpful: just make it a warning instead of an error? Compiling for multiple architectures might be niche (at least for now), so a warning could help many users, even if a few have to silence it.

It would also be nice if the warning was only triggered when NVPTX is the only backend, but I suspect checking that can be non-trivial with the compilation flow used.

JackAKirk · 2022-05-23T13:45:55Z

Thanks. I think this is a good suggestion. I will look into it.

zjin-lcf · 2022-05-23T14:15:54Z

I feel a warning message is helpful.
A HIP program that expects a wavefront size of 32 fails when executed on an MI100 GPU, where the wavefront size is 64.
Will the attribute "reqd_work_group_size(32)" make the HIP program succeed? I am not clear about the answer.

JackAKirk · 2022-05-23T14:17:15Z

> It would also be nice if the warning was only triggered when NVPTX is the only backend, but I suspect checking that can be non-trivial with the compilation flow used.

Yes, I'm not sure of the best way to check this. I think the warning could also be useful when NVPTX is not the only backend: not every application that compiles for multiple architectures will have accounted for this restriction, and those that have will know they can safely ignore the warning.

I will apply your first suggestion and switch this error to a warning. I will also add a test at this point: I guess that adding a warning is not going to be contentious.

JackAKirk · 2022-05-23T14:20:19Z

> Will the attribute "reqd_work_group_size(32)" make the HIP program succeed? I am not clear about the answer.

Do you mean "reqd_sub_group_size(32)"?
Not at the moment, at least: the HIP AMD backend will require a proper implementation of reqd_sub_group_size(val), as discussed here: #6103 (comment)

zjin-lcf · 2022-05-23T16:39:24Z

It was my typo. Thank you for your answer.

JackAKirk · 2022-05-24T16:35:17Z

> Could you add a test case for this please?

Yep done.

clang/include/clang/Basic/DiagnosticSemaKinds.td

clang/test/SemaSYCL/invalid_sg_cuda.cpp

elizabethandrews

Thanks!

elizabethandrews · 2022-05-25T19:18:13Z

Please update PR description. It still says error is generated.

JackAKirk · 2022-05-26T09:05:44Z

> Please update PR description. It still says error is generated.

I've updated the description.

CUDA backend subgroup size error.

1545994

Signed-off-by: JackAKirk <[email protected]>

JackAKirk requested a review from a team as a code owner May 23, 2022 08:53

JackAKirk mentioned this pull request May 23, 2022

[SYCL][CUDA][HIP] Throw a runtime error with invalid sub-group size to kernel #6103

Open

JackAKirk closed this May 23, 2022

JackAKirk reopened this May 23, 2022

JackAKirk marked this pull request as draft May 23, 2022 13:46

JackAKirk closed this May 23, 2022

Switched to warning, updated message.

5eda41a

Signed-off-by: JackAKirk <[email protected]>

Warning loc corrected. Added test.

cb2bd4b

Signed-off-by: JackAKirk <[email protected]>

JackAKirk reopened this May 24, 2022

Format

137b37d

Signed-off-by: JackAKirk <[email protected]>

JackAKirk marked this pull request as ready for review May 24, 2022 16:05

JackAKirk closed this May 24, 2022

JackAKirk reopened this May 24, 2022

Format

86e26b7

Signed-off-by: JackAKirk <[email protected]>

JackAKirk closed this May 24, 2022

JackAKirk reopened this May 24, 2022

elizabethandrews reviewed May 24, 2022

clang/include/clang/Basic/DiagnosticSemaKinds.td (outdated, resolved)

elizabethandrews reviewed May 24, 2022

clang/test/SemaSYCL/invalid_sg_cuda.cpp (outdated, resolved)

Applied review suggestions.

767b6f7

Signed-off-by: JackAKirk <[email protected]>

elizabethandrews approved these changes May 25, 2022

smanna12 approved these changes May 25, 2022

JackAKirk changed the title ~~[SYCL][CUDA] Return invalid subgroup size error~~ [SYCL][CUDA] Return invalid subgroup size warning May 26, 2022

pvchupin merged commit 6dab69f into intel:sycl Jun 4, 2022

al42and mentioned this pull request Nov 21, 2023

[SYCL] [AMDGPU] Ignore incorrect sub-group size #11687

Merged
