Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix unaligned load and stores: (#4528) #4531

Merged
merged 1 commit into from
May 31, 2023

Conversation

seelabs
Copy link
Collaborator

@seelabs seelabs commented May 22, 2023

Unaligned load and stores are supported by both intel and arm CPUs, however, this is UB in C++. Replacing this with a memcpy fixes the undefined behavior and the compiled assembly code is equivalent to the original (so there is no penalty to using memcpy).

Fix #4528

  • [ x] Bug fix (non-breaking change which fixes an issue)

@seelabs
Copy link
Collaborator Author

seelabs commented May 22, 2023

Note: I pushed a separate commit for the xxhash fix. I think the SlabAllocator fix should be uncontroversial, but the xxhash fix should be considered separately.

@@ -260,7 +260,13 @@ FORCE_INLINE U64
XXH_readLE64_align(const void* ptr, XXH_endianess endian, XXH_alignment align)
{
if (align == XXH_unaligned)
return endian == XXH_littleEndian ? A64(ptr) : XXH_swap64(A64(ptr));
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The XXHASH code has special config for this called XXH_USE_UNALIGNED_ACCESS. You can see it in https://github.com/XRPLF/rippled/blob/develop/src/ripple/beast/hash/impl/xxhash.cpp#L130 and maybe you can set this for compiling without making the changes to the external code?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like that would have a larger performance penalty than the change I made (although I admit I haven't looked into it that deeply). If people would rather leave it unmodified we can look at the perf penalty of XXH_USE_UNALIGNED_ACCESS; I think I'm weakly in favor of modifying xxhash with memcpy, but I'm fine backing out the change if there are objections.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I'm fine with modifying xxhash with memcpy(). We've had this copy of the code in our code base for nine years or so. And it looks like our copy is already not quite pristine. Putting the memcpy() where we know we need it makes the patch easy to understand. Messing with the macro introduces more variables and makes the change riskier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okey
I think that doing this change changse the external files from the XXHASH code and making future upgrades more tricky. But this is frequently done in this code base anyways and not updating often like with the very old secp256k1 lib so maybe its okey for this also!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether using xxhash makes sense. For a lot of what's being hashed, the data is already effectively random (SHA-256 hashes and the like) so using a great hash to hash it again seems a little pointless, especially if we're having to introduce a dependency (that, as noted, isn't really kept up to date).

Copy link
Collaborator Author

@seelabs seelabs May 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Nik that hashing something that's already a SHA-256 hash doesn't add value. We could look to see if we could remove XXHASH. I'll make a issue.

Edit: Issue is here: #4547

#include <boost/align.hpp>
#include <boost/container/static_vector.hpp>
#include <boost/predef.h>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the predef ok to remove from here because the code uses it for BOOST_OS_LINUX on next ligne. Is it maybe better to remove the linux special code?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These includes are still here, I just moved them earlier in the file. (We usually put boost includes should come before system includes). This isn't needed for this patch, but I thought I'd clean that up while I was edit this file.

As for removing the linux specific code: It looks like Nik is ambivalent about the hint, and it's possible he may reconsider that, but I don't think we should touch that in this patch.

@@ -76,7 +78,9 @@ class SlabAllocator

while (data + item <= p_ + size_)
{
*reinterpret_cast<std::uint8_t**>(data) = l_;
// Use memcpy to avoid unaligned UB
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe use the std:: prefix for memcpy

Copy link
Collaborator Author

@seelabs seelabs May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a553a13 [fold] Replace memcpy with std::memcpy

Edit: I also confirmed the compiler optimized std::memcpy the same way it did memcpy (ref: https://godbolt.org/z/4E1YT4Gs6)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine.

Copy link
Collaborator

@scottschurr scottschurr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Thanks for doing this. The added memcpy()s all look right to me. And I especially appreciate the comments indicating why the memcpy()s are there. Additionally, code coverage is excellent on all of the changed lines.

Have you had a chance to run UBSan on the modified code? It would be great if these changes fix the reports. But even if it doesn't fix the UBSan reports I think these are all good changes.

@seelabs
Copy link
Collaborator Author

seelabs commented May 22, 2023

@scottschurr Yes, I have run the sanitizer, this does not fix all the issues. The remaining issues are either in test code or part of nubd. I would like to get this down to zero, but this is a good first step.

/home/swd/projs/ripple/ours/src/test/app/ValidatorSite_test.cpp:177:16: runtime error: load of value 224, which is not a valid value for type 'bool'
/home/swd/projs/ripple/ours/src/test/beast/LexicalCast_test.cpp:41:24: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'
/home/swd/projs/ripple/ours/src/test/beast/LexicalCast_test.cpp:41:24: runtime error: signed integer overflow: 9223372036854775807 + 1 cannot be represented in type 'long int'
/home/swd/.conan/data/nudb/2.0.8/_/_/package/5ab84d6acfe1f23c4fae0ab88f26e3a396351ac9/include/nudb/detail/xxhash.hpp:168:53: runtime error: load of misaligned address 0x561843240f44 for type 'const uint64_t', which requires 8 byte alignment
/home/swd/.conan/data/nudb/2.0.8/_/_/package/5ab84d6acfe1f23c4fae0ab88f26e3a396351ac9/include/nudb/detail/xxhash.hpp:168:53: runtime error: load of misaligned address 0x561843240f4c for type 'const uint64_t', which requires 8 byte alignment
/home/swd/.conan/data/nudb/2.0.8/_/_/package/5ab84d6acfe1f23c4fae0ab88f26e3a396351ac9/include/nudb/detail/xxhash.hpp:168:53: runtime error: load of misaligned address 0x561843240f54 for type 'const uint64_t', which requires 8 byte alignment
/home/swd/.conan/data/nudb/2.0.8/_/_/package/5ab84d6acfe1f23c4fae0ab88f26e3a396351ac9/include/nudb/detail/xxhash.hpp:168:53: runtime error: load of misaligned address 0x561843240f5c for type 'const uint64_t', which requires 8 byte alignment
/home/swd/.conan/data/nudb/2.0.8/_/_/package/5ab84d6acfe1f23c4fae0ab88f26e3a396351ac9/include/nudb/detail/xxhash.hpp:168:53: runtime error: load of misaligned address 0x55aa1f0969c4 for type 'const uint64_t', which requires 8 byte alignment
/home/swd/.conan/data/nudb/2.0.8/_/_/package/5ab84d6acfe1f23c4fae0ab88f26e3a396351ac9/include/nudb/detail/xxhash.hpp:168:53: runtime error: load of misaligned address 0x55aa1f0969cc for type 'const uint64_t', which requires 8 byte alignment
/home/swd/.conan/data/nudb/2.0.8/_/_/package/5ab84d6acfe1f23c4fae0ab88f26e3a396351ac9/include/nudb/detail/xxhash.hpp:168:53: runtime error: load of misaligned address 0x55aa1f0969d4 for type 'const uint64_t', which requires 8 byte alignment
/home/swd/.conan/data/nudb/2.0.8/_/_/package/5ab84d6acfe1f23c4fae0ab88f26e3a396351ac9/include/nudb/detail/xxhash.hpp:168:53: runtime error: load of misaligned address 0x55aa1f0969dc for type 'const uint64_t', which requires 8 byte alignment
/home/swd/projs/ripple/ours/src/test/app/ValidatorSite_test.cpp:593:60: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'

@scottschurr
Copy link
Collaborator

@seelabs, sorry, I wasn't clear. I didn't expect these changes to fix all the issues. I just wanted to confirm that the memcpy()s fixed up the specific problems that we wanted to address in this pull request. Sounds like, yes they do. That's great news! Yes, an excellent first step. Thanks for doing it!

@@ -260,7 +260,13 @@ FORCE_INLINE U64
XXH_readLE64_align(const void* ptr, XXH_endianess endian, XXH_alignment align)
{
if (align == XXH_unaligned)
return endian == XXH_littleEndian ? A64(ptr) : XXH_swap64(A64(ptr));
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether using xxhash makes sense. For a lot of what's being hashed, the data is already effectively random (SHA-256 hashes and the like) so using a great hash to hash it again seems a little pointless, especially if we're having to introduce a dependency (that, as noted, isn't really kept up to date).

@@ -76,7 +78,9 @@ class SlabAllocator

while (data + item <= p_ + size_)
{
*reinterpret_cast<std::uint8_t**>(data) = l_;
// Use memcpy to avoid unaligned UB
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine.

@intelliot
Copy link
Collaborator

I'll default to holding this until 1.12 (expected ~Sept 2023) unless anyone comments that they would like to see it included in 1.11 (expected June 2023).

@intelliot intelliot added this to the 1.12 milestone May 28, 2023
Unaligned load and stores are supported by both intel and arm CPUs,
however, this is UB in C++. Replacing this with a `memcpy` fixes the
undefined behavior and the compiled assembly code is equivalent to the
original (so there is no penalty to using memcpy).
@seelabs seelabs added the Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required. label May 30, 2023
@seelabs
Copy link
Collaborator Author

seelabs commented May 30, 2023

Squahsed and forced pushed.

@seelabs
Copy link
Collaborator Author

seelabs commented May 30, 2023

@intelliot I'm fine holding this until 1.12

@intelliot
Copy link
Collaborator

This was discussed today and it is OK to include in 1.11. I intend to merge it shortly.

@intelliot intelliot removed this from the 1.12 milestone May 31, 2023
@intelliot intelliot merged commit f709311 into XRPLF:develop May 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

[ubsan] load of misaligned address in ripple/ours/src/ripple/basics/SlabAllocator.h
6 participants