Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix unaligned load and stores: (#4528) #4531

Merged
merged 1 commit into from
May 31, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 18 additions & 7 deletions src/ripple/basics/SlabAllocator.h
Original file line number Diff line number Diff line change
Expand Up @@ -21,16 +21,18 @@
#define RIPPLE_BASICS_SLABALLOCATOR_H_INCLUDED

#include <ripple/beast/type_name.h>

#include <boost/align.hpp>
#include <boost/container/static_vector.hpp>
#include <boost/predef.h>

#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <mutex>

#include <boost/align.hpp>
#include <boost/container/static_vector.hpp>
#include <boost/predef.h>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the predef ok to remove from here because the code uses it for BOOST_OS_LINUX on next ligne. Is it maybe better to remove the linux special code?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These includes are still here, I just moved them earlier in the file. (We usually put boost includes should come before system includes). This isn't needed for this patch, but I thought I'd clean that up while I was edit this file.

As for removing the linux specific code: It looks like Nik is ambivalent about the hint, and it's possible he may reconsider that, but I don't think we should touch that in this patch.

#if BOOST_OS_LINUX
#include <sys/mman.h>
#endif
Expand Down Expand Up @@ -76,7 +78,9 @@ class SlabAllocator

while (data + item <= p_ + size_)
{
*reinterpret_cast<std::uint8_t**>(data) = l_;
// Use memcpy to avoid unaligned UB
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe use the std:: prefix for memcpy

Copy link
Collaborator Author

@seelabs seelabs May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a553a13 [fold] Replace memcpy with std::memcpy

Edit: I also confirmed the compiler optimized std::memcpy the same way it did memcpy (ref: https://godbolt.org/z/4E1YT4Gs6)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine.

// (will optimize to equivalent code)
std::memcpy(data, &l_, sizeof(std::uint8_t*));
l_ = data;
data += item;
}
Expand Down Expand Up @@ -115,7 +119,11 @@ class SlabAllocator
ret = l_;

if (ret)
l_ = *reinterpret_cast<std::uint8_t**>(ret);
{
// Use memcpy to avoid unaligned UB
// (will optimize to equivalent code)
std::memcpy(&l_, ret, sizeof(std::uint8_t*));
}
}

return ret;
Expand All @@ -136,7 +144,10 @@ class SlabAllocator
assert(own(ptr));

std::lock_guard l(m_);
*reinterpret_cast<std::uint8_t**>(ptr) = l_;

// Use memcpy to avoid unaligned UB
// (will optimize to equivalent code)
std::memcpy(ptr, &l_, sizeof(std::uint8_t*));
l_ = ptr;
}
};
Expand Down
6 changes: 5 additions & 1 deletion src/ripple/basics/impl/partitioned_unordered_map.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,11 @@ namespace ripple {
static std::size_t
extract(uint256 const& key)
{
return *reinterpret_cast<std::size_t const*>(key.data());
std::size_t result;
// Use memcpy to avoid unaligned UB
// (will optimize to equivalent code)
std::memcpy(&result, key.data(), sizeof(std::size_t));
return result;
}

static std::size_t
Expand Down
12 changes: 10 additions & 2 deletions src/ripple/beast/hash/impl/xxhash.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,8 @@ You can contact the author at :

#include <ripple/beast/hash/impl/xxhash.h>

#include <cstring>

//**************************************
// Tuning parameters
//**************************************
Expand Down Expand Up @@ -87,7 +89,7 @@ You can contact the author at :
//**************************************
// Includes & Memory related functions
//**************************************
//#include "xxhash.h"
// #include "xxhash.h"
// Modify the local functions below should you wish to use some other memory
// routines for malloc(), free()
#include <stdlib.h>
Expand Down Expand Up @@ -260,7 +262,13 @@ FORCE_INLINE U64
XXH_readLE64_align(const void* ptr, XXH_endianess endian, XXH_alignment align)
{
if (align == XXH_unaligned)
return endian == XXH_littleEndian ? A64(ptr) : XXH_swap64(A64(ptr));
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The XXHASH code has special config for this called XXH_USE_UNALIGNED_ACCESS. You can see it in https://github.com/XRPLF/rippled/blob/develop/src/ripple/beast/hash/impl/xxhash.cpp#L130 and maybe you can set this for compiling without making the changes to the external code?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like that would have a larger performance penalty than the change I made (although I admit I haven't looked into it that deeply). If people would rather leave it unmodified we can look at the perf penalty of XXH_USE_UNALIGNED_ACCESS; I think I'm weakly in favor of modifying xxhash with memcpy, but I'm fine backing out the change if there are objections.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I'm fine with modifying xxhash with memcpy(). We've had this copy of the code in our code base for nine years or so. And it looks like our copy is already not quite pristine. Putting the memcpy() where we know we need it makes the patch easy to understand. Messing with the macro introduces more variables and makes the change riskier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okey
I think that doing this change changse the external files from the XXHASH code and making future upgrades more tricky. But this is frequently done in this code base anyways and not updating often like with the very old secp256k1 lib so maybe its okey for this also!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether using xxhash makes sense. For a lot of what's being hashed, the data is already effectively random (SHA-256 hashes and the like) so using a great hash to hash it again seems a little pointless, especially if we're having to introduce a dependency (that, as noted, isn't really kept up to date).

Copy link
Collaborator Author

@seelabs seelabs May 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Nik that hashing something that's already a SHA-256 hash doesn't add value. We could look to see if we could remove XXHASH. I'll make a issue.

Edit: Issue is here: #4547

// Use memcpy to avoid unaligned UB
U64 tmp_aligned;
std::memcpy(&tmp_aligned, ptr, sizeof(U64));
return endian == XXH_littleEndian ? tmp_aligned
: XXH_swap64(tmp_aligned);
}
else
return endian == XXH_littleEndian ? *(U64*)ptr : XXH_swap64(*(U64*)ptr);
}
Expand Down