Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clang trips -Wcast-function-type-strict during compilation of 2.2.x an 2.3.x w/ Linux 6.11.3 #16672

Closed
intelfx opened this issue Oct 21, 2024 · 2 comments · Fixed by #16673
Closed
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@intelfx
Copy link

intelfx commented Oct 21, 2024

System information

Type Version/Name
Distribution Name Arch Linux
Distribution Version N/A
Kernel Version 6.11
Architecture x86_64
OpenZFS Version 2.2.6, 2.3.0-rc2

Describe the problem you're observing

When building ZFS as an in-tree module with LLVM=1, it trips Clang warning -Wcast-function-type-strict (which becomes an error if ./configure --enable-debug is used).

Describe how to reproduce the problem

$ cd ~/linux
$ git checkout v6.11.3
$ make LLVM=1 prepare

$ cd ~/zfs
$ git checkout zfs-2.3-release
$ ./autogen.sh
$ ./configure KERNEL_LLVM=1 --prefix=/usr --with-config=all --with-linux=$HOME/linux --enable-linux-experimental --enable-linux-builtin=yes --enable-debug
$ ./copy-builtin ~/linux

$ cd ~/linux
$ make LLVM=1 oldconfig
<...enable CONFIG_ZFS...>
$ make LLVM=1 fs/zfs/

Include any warning/errors/backtraces from the system logs

$ make LLVM=1 fs/zfs/
<...>
fs/zfs/zfs/arc.c:9798:9: error: cast from 'void (*)(void *) __attribute__((noreturn))' to 'thread_func_t' (aka 'void (*)(void *)') converts to incompatible function type [-Werror,-Wcast-function-type-strict]
 9798 |         (void) thread_create(NULL, 0, l2arc_feed_thread, NULL, 0, &p0,
      |                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 9799 |             TS_RUN, defclsyspri);
      |             ~~~~~~~~~~~~~~~~~~~~
./include/zfs/os/linux/spl/sys/thread.h:53:32: note: expanded from macro 'thread_create'
   53 |         __thread_create(stk, stksize, (thread_func_t)func, #func,       \
      |                                       ^~~~~~~~~~~~~~~~~~~
fs/zfs/zfs/arc.c:9840:11: error: cast from 'void (*)(void *) __attribute__((noreturn))' to 'thread_func_t' (aka 'void (*)(void *)') converts to incompatible function type [-Werror,-Wcast-function-type-strict]
 9840 |                         (void) thread_create(NULL, 0, l2arc_dev_rebuild_thread,
      |                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 9841 |                             dev, 0, &p0, TS_RUN, minclsyspri);
      |                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/zfs/os/linux/spl/sys/thread.h:53:32: note: expanded from macro 'thread_create'
   53 |         __thread_create(stk, stksize, (thread_func_t)func, #func,       \
      |                                       ^~~~~~~~~~~~~~~~~~~
2 errors generated.

As far as I can see, this is bogus, because the only thing that does not match in those signatures is the noreturn attribute in the rhs, which is semantically OK (it's alright if the caller expects the function to return and it doesn't, unlike the inverse).

But, LLVM's D134831 suggests that this -Wcast-function-type-strict was added in part to catch the signature mismatches that would trip CFI, and I'm not an expert in CFI.

@intelfx intelfx added the Type: Defect Incorrect behavior (e.g. crash, hang) label Oct 21, 2024
@robn
Copy link
Member

robn commented Oct 21, 2024

Thanks for the report. If you're able to, could you see if #16673 sorts this out for you?

@intelfx
Copy link
Author

intelfx commented Oct 21, 2024

@robn Yup — this is exactly what I did locally, and it indeed fixes the build.

behlendorf pushed a commit that referenced this issue Oct 21, 2024
All of our thread entry functions have this signature:

    void (*)(void*) __attribute__((noreturn))

The low-level `__thread_create()` function accepts a `thread_func_t` as
the entry point, which is defined more simply as:

    void (*)(void *)

And then the `thread_create()` and `thread_create_named()` macros cast
the passed-in function point down to `thread_func_t`, that is, casting
away the `noreturn` attribute.

Clang considers casting between these two types to be invalid because
both the caller and the callee may have elided parts of the stack frame
save and restore, knowing that they won't be needed.

Recent Linux appears to be setting `-Wcast-function-type-strict`, which
causes this invalid cast to emit a warning, which with `-Werror` is
converted to an error, breaking the build.

This commit fixes this in the simplest possible way: adding `noreturn`
to the `thread_func_t` attribute. Since all our thread entry functions
already have this attribute, it's arguably a just a consistency fix
anyway.

I considered removing the casts in the macros, which silences the
warnings, but it turns out that Clang has a bug that won't emit this
error for implicit conversions, only explicit casts. So leaving them
there seems like a reasonable belt-and-suspenders approach. Also,
frankly, this whole mechanism seems a little undercooked inside LLVM, so
I'm content go with my intuition about the smallest, least invaisve
change.

**NOTE**: `__thread_create` is exported by `spl.ko` and has a
`thread_func_t` arg, so this is an ABI break. Whether that matters in
practice, I have no idea.

Further reading:
- llvm/llvm-project@1aad641
- llvm/llvm-project#7325
- llvm/llvm-project#41465

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes #16672 
Closes #16673
behlendorf pushed a commit to behlendorf/zfs that referenced this issue Oct 21, 2024
All of our thread entry functions have this signature:

    void (*)(void*) __attribute__((noreturn))

The low-level `__thread_create()` function accepts a `thread_func_t` as
the entry point, which is defined more simply as:

    void (*)(void *)

And then the `thread_create()` and `thread_create_named()` macros cast
the passed-in function point down to `thread_func_t`, that is, casting
away the `noreturn` attribute.

Clang considers casting between these two types to be invalid because
both the caller and the callee may have elided parts of the stack frame
save and restore, knowing that they won't be needed.

Recent Linux appears to be setting `-Wcast-function-type-strict`, which
causes this invalid cast to emit a warning, which with `-Werror` is
converted to an error, breaking the build.

This commit fixes this in the simplest possible way: adding `noreturn`
to the `thread_func_t` attribute. Since all our thread entry functions
already have this attribute, it's arguably a just a consistency fix
anyway.

I considered removing the casts in the macros, which silences the
warnings, but it turns out that Clang has a bug that won't emit this
error for implicit conversions, only explicit casts. So leaving them
there seems like a reasonable belt-and-suspenders approach. Also,
frankly, this whole mechanism seems a little undercooked inside LLVM, so
I'm content go with my intuition about the smallest, least invaisve
change.

**NOTE**: `__thread_create` is exported by `spl.ko` and has a
`thread_func_t` arg, so this is an ABI break. Whether that matters in
practice, I have no idea.

Further reading:
- llvm/llvm-project@1aad641
- llvm/llvm-project#7325
- llvm/llvm-project#41465

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#16672 
Closes openzfs#16673
behlendorf pushed a commit to behlendorf/zfs that referenced this issue Oct 21, 2024
All of our thread entry functions have this signature:

    void (*)(void*) __attribute__((noreturn))

The low-level `__thread_create()` function accepts a `thread_func_t` as
the entry point, which is defined more simply as:

    void (*)(void *)

And then the `thread_create()` and `thread_create_named()` macros cast
the passed-in function point down to `thread_func_t`, that is, casting
away the `noreturn` attribute.

Clang considers casting between these two types to be invalid because
both the caller and the callee may have elided parts of the stack frame
save and restore, knowing that they won't be needed.

Recent Linux appears to be setting `-Wcast-function-type-strict`, which
causes this invalid cast to emit a warning, which with `-Werror` is
converted to an error, breaking the build.

This commit fixes this in the simplest possible way: adding `noreturn`
to the `thread_func_t` attribute. Since all our thread entry functions
already have this attribute, it's arguably a just a consistency fix
anyway.

I considered removing the casts in the macros, which silences the
warnings, but it turns out that Clang has a bug that won't emit this
error for implicit conversions, only explicit casts. So leaving them
there seems like a reasonable belt-and-suspenders approach. Also,
frankly, this whole mechanism seems a little undercooked inside LLVM, so
I'm content go with my intuition about the smallest, least invaisve
change.

**NOTE**: `__thread_create` is exported by `spl.ko` and has a
`thread_func_t` arg, so this is an ABI break. Whether that matters in
practice, I have no idea.

Further reading:
- llvm/llvm-project@1aad641
- llvm/llvm-project#7325
- llvm/llvm-project#41465

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#16672 
Closes openzfs#16673
usaleem-ix pushed a commit to truenas/zfs that referenced this issue Oct 28, 2024
All of our thread entry functions have this signature:

    void (*)(void*) __attribute__((noreturn))

The low-level `__thread_create()` function accepts a `thread_func_t` as
the entry point, which is defined more simply as:

    void (*)(void *)

And then the `thread_create()` and `thread_create_named()` macros cast
the passed-in function point down to `thread_func_t`, that is, casting
away the `noreturn` attribute.

Clang considers casting between these two types to be invalid because
both the caller and the callee may have elided parts of the stack frame
save and restore, knowing that they won't be needed.

Recent Linux appears to be setting `-Wcast-function-type-strict`, which
causes this invalid cast to emit a warning, which with `-Werror` is
converted to an error, breaking the build.

This commit fixes this in the simplest possible way: adding `noreturn`
to the `thread_func_t` attribute. Since all our thread entry functions
already have this attribute, it's arguably a just a consistency fix
anyway.

I considered removing the casts in the macros, which silences the
warnings, but it turns out that Clang has a bug that won't emit this
error for implicit conversions, only explicit casts. So leaving them
there seems like a reasonable belt-and-suspenders approach. Also,
frankly, this whole mechanism seems a little undercooked inside LLVM, so
I'm content go with my intuition about the smallest, least invaisve
change.

**NOTE**: `__thread_create` is exported by `spl.ko` and has a
`thread_func_t` arg, so this is an ABI break. Whether that matters in
practice, I have no idea.

Further reading:
- llvm/llvm-project@1aad641
- llvm/llvm-project#7325
- llvm/llvm-project#41465

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#16672 
Closes openzfs#16673
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants