Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cuda safety fixes #198

Merged
merged 3 commits into from
Aug 17, 2021
Merged

Conversation

lgritz
Copy link
Contributor

@lgritz lgritz commented Aug 13, 2021

Many places (basically everywhere), we had the following idiom:

template<class T>
class Foo {
    IMATH_HOSTDEVICE float bar();    // declaration
};
...
template<class T> float Foo<T>::bar() { return 0.0; } // implementation

But this is wrong! When actually compiled in Cuda mode (maybe depending
on the compiler?), you can get errors about how you can't overload
a __host__ __device__ declaration with a __host__-only implementation.
Which kinda makes sense. You have to match the two. So I have a WHOLE LOT
of places where I had to add IMATH_HOSTDEVICE.

Also, in ImathMath.h, we used this idiom:

IMATH_HOSTDEVICE IMATH_DEPRECATED("reason") float foo() { ... }

Now, this seems to work with cudacc, but when you use
clang++ --language=cuda to compile Cuda to PTX (it can do that!), clang
really doesn't like it when the __host__ __device__ comes before the
[[ deprecated("msg") ]], it has an error about how the deprecated
attribute can't go there. So we have to transpose these so that the
IMATH_DEPRECATED is alwys the first thing in the declaration (which is
the way we almost always write it anyway).

Signed-off-by: Larry Gritz [email protected]

Many places (basically everywhere), we had the following idiom:

    template<class T>
    class Foo {
        IMATH_HOSTDEVICE float bar();    // declaration
    };
    ...
    template<class T> float Foo<T>::bar() { return 0.0; } // implementation

But this is wrong! When actually compiled in Cuda mode (maybe depending
on the compiler?), you can get errors about how you can't overload
a `__host__ __device__` declaration with a `__host__`-only implementation.
Which kinda makes sense. You have to match the two. So I have a WHOLE LOT
of places where I had to add IMATH_HOSTDEVICE.

Also, in ImathMath.h, we used this idiom:

    IMATH_HOSTDEVICE IMATH_DEPRECATED("reason") float foo() { ... }

Now, this seems to work with `cudacc`, but when you use
`clang++ --language=cuda` to compile Cuda to PTX (it can do that!), clang
really doesn't like it when the `__host__ __device__` comes before the
`[[ deprecated("msg") ]]`, it has an error about how the deprecated
attribute can't go there. So we have to transpose these so that the
IMATH_DEPRECATED is alwys the first thing in the declaration (which is
the way we almost always write it anyway).

Signed-off-by: Larry Gritz <[email protected]>
@lgritz
Copy link
Contributor Author

lgritz commented Aug 13, 2021

This PR is against RB-3.1 because I noticed the problem when trying to use 3.1. When accepted, it should also be cherry-picked into master and RB-3.0. (2.x is unnecessary because it predates the HOSTDEVICE annotations.) I would very much appreciate a 3.1.3 release that incorporates these fixes (when convenient), since the Imath headers are quite broken under Cuda without it.

Copy link
Contributor

@meshula meshula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope automation helped with this! Looks thorough and consistent

@lgritz
Copy link
Contributor Author

lgritz commented Aug 13, 2021

Some search & replace aid, but mostly manual, and double checking that my OSL Cuda builds against it stopped spitting out error messages. I think I got all the right spots.

Signed-off-by: Larry Gritz <[email protected]>
@cary-ilm
Copy link
Member

Other places that appear to be missing IMATH_HOSTDEVICE:

ImathShear.h line 681 (operator*)
ImathSphere.h line 97 (circumscribe)
ImathVec.h line 1422 (Vec3::Vec3)
ImathVec.h line 2068 (Vec4::lengthTiny)

I grepped for "inline" and looked for missing IMATH_HOSTDEVICE.

Signed-off-by: Larry Gritz <[email protected]>
@lgritz
Copy link
Contributor Author

lgritz commented Aug 17, 2021

Thanks for the extra pair of eyes, Cary!

Copy link
Member

@cary-ilm cary-ilm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cary-ilm cary-ilm merged commit 811f6ea into AcademySoftwareFoundation:RB-3.1 Aug 17, 2021
@lgritz lgritz deleted the lg-cuda branch August 25, 2021 16:40
lgritz added a commit to lgritz/Imath that referenced this pull request Aug 25, 2021
* Cuda safety fixes

Many places (basically everywhere), we had the following idiom:

    template<class T>
    class Foo {
        IMATH_HOSTDEVICE float bar();    // declaration
    };
    ...
    template<class T> float Foo<T>::bar() { return 0.0; } // implementation

But this is wrong! When actually compiled in Cuda mode (maybe depending
on the compiler?), you can get errors about how you can't overload
a `__host__ __device__` declaration with a `__host__`-only implementation.
Which kinda makes sense. You have to match the two. So I have a WHOLE LOT
of places where I had to add IMATH_HOSTDEVICE.

Also, in ImathMath.h, we used this idiom:

    IMATH_HOSTDEVICE IMATH_DEPRECATED("reason") float foo() { ... }

Now, this seems to work with `cudacc`, but when you use
`clang++ --language=cuda` to compile Cuda to PTX (it can do that!), clang
really doesn't like it when the `__host__ __device__` comes before the
`[[ deprecated("msg") ]]`, it has an error about how the deprecated
attribute can't go there. So we have to transpose these so that the
IMATH_DEPRECATED is alwys the first thing in the declaration (which is
the way we almost always write it anyway).

Signed-off-by: Larry Gritz <[email protected]>

* Add missing HOSTDEVICE

Signed-off-by: Larry Gritz <[email protected]>

* Add missing HOSTDEVICE

Signed-off-by: Larry Gritz <[email protected]>
@cary-ilm cary-ilm mentioned this pull request Aug 26, 2021
lgritz added a commit that referenced this pull request Aug 26, 2021
* Cuda safety fixes from #198

Many places (basically everywhere), we had the following idiom:

    template<class T>
    class Foo {
        IMATH_HOSTDEVICE float bar();    // declaration
    };
    ...
    template<class T> float Foo<T>::bar() { return 0.0; } // implementation

But this is wrong! When actually compiled in Cuda mode (maybe depending
on the compiler?), you can get errors about how you can't overload
a `__host__ __device__` declaration with a `__host__`-only implementation.
Which kinda makes sense. You have to match the two. So I have a WHOLE LOT
of places where I had to add IMATH_HOSTDEVICE.

Also, in ImathMath.h, we used this idiom:

    IMATH_HOSTDEVICE IMATH_DEPRECATED("reason") float foo() { ... }

Now, this seems to work with `cudacc`, but when you use
`clang++ --language=cuda` to compile Cuda to PTX (it can do that!), clang
really doesn't like it when the `__host__ __device__` comes before the
`[[ deprecated("msg") ]]`, it has an error about how the deprecated
attribute can't go there. So we have to transpose these so that the
IMATH_DEPRECATED is alwys the first thing in the declaration (which is
the way we almost always write it anyway).

* Some minor cuda corrections to address warnings

Signed-off-by: Larry Gritz <[email protected]>
@cary-ilm cary-ilm added the v3.1.3 label Sep 2, 2021
cary-ilm pushed a commit to cary-ilm/Imath that referenced this pull request Jan 17, 2022
* Cuda safety fixes from AcademySoftwareFoundation#198

Many places (basically everywhere), we had the following idiom:

    template<class T>
    class Foo {
        IMATH_HOSTDEVICE float bar();    // declaration
    };
    ...
    template<class T> float Foo<T>::bar() { return 0.0; } // implementation

But this is wrong! When actually compiled in Cuda mode (maybe depending
on the compiler?), you can get errors about how you can't overload
a `__host__ __device__` declaration with a `__host__`-only implementation.
Which kinda makes sense. You have to match the two. So I have a WHOLE LOT
of places where I had to add IMATH_HOSTDEVICE.

Also, in ImathMath.h, we used this idiom:

    IMATH_HOSTDEVICE IMATH_DEPRECATED("reason") float foo() { ... }

Now, this seems to work with `cudacc`, but when you use
`clang++ --language=cuda` to compile Cuda to PTX (it can do that!), clang
really doesn't like it when the `__host__ __device__` comes before the
`[[ deprecated("msg") ]]`, it has an error about how the deprecated
attribute can't go there. So we have to transpose these so that the
IMATH_DEPRECATED is alwys the first thing in the declaration (which is
the way we almost always write it anyway).

* Some minor cuda corrections to address warnings

Signed-off-by: Larry Gritz <[email protected]>
Signed-off-by: Cary Phillips <[email protected]>
cary-ilm pushed a commit that referenced this pull request Jan 20, 2022
* Cuda safety fixes from #198

Many places (basically everywhere), we had the following idiom:

    template<class T>
    class Foo {
        IMATH_HOSTDEVICE float bar();    // declaration
    };
    ...
    template<class T> float Foo<T>::bar() { return 0.0; } // implementation

But this is wrong! When actually compiled in Cuda mode (maybe depending
on the compiler?), you can get errors about how you can't overload
a `__host__ __device__` declaration with a `__host__`-only implementation.
Which kinda makes sense. You have to match the two. So I have a WHOLE LOT
of places where I had to add IMATH_HOSTDEVICE.

Also, in ImathMath.h, we used this idiom:

    IMATH_HOSTDEVICE IMATH_DEPRECATED("reason") float foo() { ... }

Now, this seems to work with `cudacc`, but when you use
`clang++ --language=cuda` to compile Cuda to PTX (it can do that!), clang
really doesn't like it when the `__host__ __device__` comes before the
`[[ deprecated("msg") ]]`, it has an error about how the deprecated
attribute can't go there. So we have to transpose these so that the
IMATH_DEPRECATED is alwys the first thing in the declaration (which is
the way we almost always write it anyway).

* Some minor cuda corrections to address warnings

Signed-off-by: Larry Gritz <[email protected]>
Signed-off-by: Cary Phillips <[email protected]>
cary-ilm added a commit to cary-ilm/Imath that referenced this pull request May 19, 2023
PR AcademySoftwareFoundation#198 added IMATH_HOSTDEVICE to the RB-3.1 branch, and PR AcademySoftwareFoundation#202
cherry-picked the change into main, but that cherry-pick somehow lost
the IMATH_HOSTDEVICE on Matrix33<T>::invert(bool).

Signed-off-by: Cary Phillips <[email protected]>
cary-ilm added a commit that referenced this pull request May 19, 2023
PR #198 added IMATH_HOSTDEVICE to the RB-3.1 branch, and PR #202
cherry-picked the change into main, but that cherry-pick somehow lost
the IMATH_HOSTDEVICE on Matrix33<T>::invert(bool).

Signed-off-by: Cary Phillips <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants