Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Consider disabling --expt-relaxed-constexpr #7795

Open
jrhemstad opened this issue Mar 31, 2021 · 8 comments
Open

[FEA] Consider disabling --expt-relaxed-constexpr #7795

jrhemstad opened this issue Mar 31, 2021 · 8 comments
Assignees
Labels
0 - Backlog In queue waiting for assignment feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.

Comments

@jrhemstad
Copy link
Contributor

jrhemstad commented Mar 31, 2021

Is your feature request related to a problem? Please describe.

--expt-relaxed-constexpr is a convenient way to reuse existing constexpr host code, e.g., things like std::max.

However, it can lead to some pretty surprising behavior. Consider:

constexpr int bar(int j){
    if(j<0){
        throw;
    }
    return 42;
}
__global__ void kernel(int * i){
    *i = bar(-1);
}

https://godbolt.org/z/frb8c6cd7

One might expect this to fail to compile as throw is not valid in device code. However, not only does it happily compile, but it just stores the value 42.

This example looks pretty harmless:

int foo(int i){
    return i * 2;
}
constexpr int bar(int j){
    if(j<0){
        return foo(j);
    }
    return 42;
}
__global__ void kernel( int * i){
    *i = bar(-1);
}

But this too results in an ill-formed program without a diagnostic.

https://godbolt.org/z/aTzGaMrGd

Describe the solution you'd like

We should think pretty hard about if we want to risk such egregious undefined behavior in libcudf.

As such, we may want to consider moving towards disabling --expt-relaxed-constexpr. At the very least, we should be preferring CUDA_HOST_DEVICE_CALLABLE whenever possible (for functions that need be called from both host and device).

Additional Context

The only place it is 100% safe to use a constexpr function in device code with --expt-relaxed-constexpr is when used in a context that requires constant evaluation. Then it will fail to compile if the constexpr function contains things that would result in an ill-formed program: https://godbolt.org/z/47qfnPnc9

@jrhemstad jrhemstad added feature request New feature or request Needs Triage Need team to review and classify labels Mar 31, 2021
@jrhemstad jrhemstad added libcudf Affects libcudf (C++/CUDA) code. code quality and removed Needs Triage Need team to review and classify labels Mar 31, 2021
@harrism
Copy link
Member

harrism commented Apr 1, 2021

@karthikeyann please be aware of this with respect to #7713

@harrism
Copy link
Member

harrism commented Apr 1, 2021

At the very least, we should be preferring CUDA_HOST_DEVICE_CALLABLE whenever possible.

No, we should prefer CUDA_DEVICE_CALLABLE or no annotation whenever possible. We should use CUDA_HOST_DEVICE_CALLABLE only for functions that are required to be called from both host and device.

@jrhemstad
Copy link
Contributor Author

only for functions that are required to be called from both host and device.

This was implied. The whole point of --expt-relaxed-constexpr is to call existing host constexpr functions from device, which implies it can be/is called from both host and device.

@harrism
Copy link
Member

harrism commented Apr 5, 2021

I want to be very explicit because it is not obvious to everyone (a fact that is evident in our code).

@github-actions
Copy link

github-actions bot commented May 5, 2021

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@harrism
Copy link
Member

harrism commented May 6, 2021

Still valid

@PointKernel
Copy link
Member

This seems to be unblocked with the recent CCCL update.

The changes in NVIDIA cuCollections PR #595 show that cuco can now build and run properly without relaxed constexprs.

mbeutel added a commit to mbeutel/gsl-lite that referenced this issue Oct 31, 2024
Previously, calling `narrow<>()` from CUDA device code should have resulted
in a compile error but but did not due to a weird bug in NVCC pertaining to
the experimental "--expt-relaxed-constexpr" flag:

rapidsai/cudf#7795

This commit makes two changes to mitigate this problem:

(1) `narrow<>()` no longer responds to `gsl_CONFIG_CONTRACT_VIOLATION_THROWS`
    because it does not do contract checking. Therefore, it plainly fails to
    compile for `gsl_CONFIG( NARROW_THROWS_ON_TRUNCATION )` if exceptions are
    unavailable (e.g. in device code).

(2) For `!gsl_CONFIG( NARROW_THROWS_ON_TRUNCATION )`, `narrow<>()` now makes
    sure that the program is terminated by issuing a trap instruction if
    `std::terminate()` is not available.
mbeutel added a commit to gsl-lite/gsl-lite that referenced this issue Oct 31, 2024
* Do not call `std::terminate()` in CUDA device code

Previously, calling `narrow<>()` from CUDA device code should have resulted
in a compile error but but did not due to a weird bug in NVCC pertaining to
the experimental "--expt-relaxed-constexpr" flag:

rapidsai/cudf#7795

This commit makes two changes to mitigate this problem:

(1) `narrow<>()` no longer responds to `gsl_CONFIG_CONTRACT_VIOLATION_THROWS`
    because it does not do contract checking. Therefore, it plainly fails to
    compile for `gsl_CONFIG( NARROW_THROWS_ON_TRUNCATION )` if exceptions are
    unavailable (e.g. in device code).

(2) For `!gsl_CONFIG( NARROW_THROWS_ON_TRUNCATION )`, `narrow<>()` now makes
    sure that the program is terminated by issuing a trap instruction if
    `std::terminate()` is not available.

Some drive-by changes:

* Update Xcode toolset versions [ci skip]

* Update CI configuration, add newer compilers

* Remove GCC 10 from macOS test suite

Results in weird errors in libunwind ("_Unwind_GetTextRelBase - _Unwind_GetTextRelBase() not implemented") which I have no interest in tracking down.
@vyasr
Copy link
Contributor

vyasr commented Dec 6, 2024

NVIDIA/cuCollections#595 and rapidsai/cuspatial#1494 inspired me to look into this again. It's slow going (but low effort) to investigate, so I'll report back when I get compilation far enough to make a determination on whether removing this setting will be possible in cudf. I've run into one very odd case that seemed like a compiler bug, but other than that it seems largely like a lot of tedious but not difficult work.

rapids-bot bot pushed a commit that referenced this issue Dec 13, 2024
rapids-bot bot pushed a commit that referenced this issue Jan 6, 2025
Contributes to #7795

This PR updates `text` to build without depending on the relaxed constexpr build option.

Authors:
  - Yunsong Wang (https://github.com/PointKernel)

Approvers:
  - Basit Ayantunde (https://github.com/lamarrr)
  - Bradley Dice (https://github.com/bdice)
  - David Wendt (https://github.com/davidwendt)

URL: #17647
rapids-bot bot pushed a commit that referenced this issue Jan 6, 2025
Contributes to #7795

This PR updates `binaryop` to build without depending on the relaxed constexpr build option.

Authors:
  - Yunsong Wang (https://github.com/PointKernel)

Approvers:
  - David Wendt (https://github.com/davidwendt)
  - Vukasin Milovanovic (https://github.com/vuule)

URL: #17598
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0 - Backlog In queue waiting for assignment feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

No branches or pull requests

4 participants