[libc++][test] Refactor increasing_allocator #115671

winner245 · 2024-11-10T22:25:07Z

The increasing_allocator<T> class, originally introduced to test shrink_to_fit for std::vector, vector<bool>, and std::basic_string (#95161), has duplicated definitions across several test files. Given the potential utility of this class for capacity-related tests in various sequence containers, this PR proposes refactoring the definition of increasing_allocator<T> into a single, reusable location.

llvmbot · 2024-11-10T22:25:40Z

@llvm/pr-subscribers-libcxx

Author: Peng Liu (winner245)

Changes

The increasing_allocator<T> class, originally introduced to test shrink_to_fit for vector and string (#95161), does not satisfy Cpp17Allocator requirements because its allocate(n) member function may allocate more memory than requested. However, the standard ([allocator.requirements]/36) mandates that a.allocate(n) must allocate memory for an array of exactly n objects of type T.

> a.allocate(n)
35 Result: XX::pointer
36 Effects: Memory is allocated for an array of n T and such an object is created but array elements are not constructed.
[Example 1: When reusing storage denoted by some pointer value p, launder(reinterpret_cast<T*>(new (p) byte[n * sizeof(T)])) can be used to implicitly create a suitable array object and obtain a pointer to it. — end example]

This PR addresses the issue by modifying increasing_allocator<T>::allocate(n) such that it strictly allocates for exatcly n objects. Note that this change does not affect the existing tests for shrink_to_fit, as those tests only utilize the allocate_at_least member function, which is already standard conforming.

Full diff: https://github.com/llvm/llvm-project/pull/115671.diff

3 Files Affected:

(modified) libcxx/test/std/containers/sequences/vector.bool/shrink_to_fit.pass.cpp (+1-1)
(modified) libcxx/test/std/containers/sequences/vector/vector.capacity/shrink_to_fit.pass.cpp (+1-1)
(modified) libcxx/test/std/strings/basic.string/string.capacity/shrink_to_fit.pass.cpp (+2-2)

diff --git a/libcxx/test/std/containers/sequences/vector.bool/shrink_to_fit.pass.cpp b/libcxx/test/std/containers/sequences/vector.bool/shrink_to_fit.pass.cpp
index f8bcee31964bbb..136b151efa29ef 100644
--- a/libcxx/test/std/containers/sequences/vector.bool/shrink_to_fit.pass.cpp
+++ b/libcxx/test/std/containers/sequences/vector.bool/shrink_to_fit.pass.cpp
@@ -55,7 +55,7 @@ struct increasing_allocator {
     min_elements += 1000;
     return std::allocator<T>{}.allocate_at_least(n);
   }
-  constexpr T* allocate(std::size_t n) { return allocate_at_least(n).ptr; }
+  constexpr T* allocate(std::size_t n) { return std::allocator<T>{}.allocate(n); }
   constexpr void deallocate(T* p, std::size_t n) noexcept { std::allocator<T>{}.deallocate(p, n); }
 };
 
diff --git a/libcxx/test/std/containers/sequences/vector/vector.capacity/shrink_to_fit.pass.cpp b/libcxx/test/std/containers/sequences/vector/vector.capacity/shrink_to_fit.pass.cpp
index e39afb2d48f0a0..97d67dac2baa8c 100644
--- a/libcxx/test/std/containers/sequences/vector/vector.capacity/shrink_to_fit.pass.cpp
+++ b/libcxx/test/std/containers/sequences/vector/vector.capacity/shrink_to_fit.pass.cpp
@@ -87,7 +87,7 @@ struct increasing_allocator {
     min_elements += 1000;
     return std::allocator<T>{}.allocate_at_least(n);
   }
-  constexpr T* allocate(std::size_t n) { return allocate_at_least(n).ptr; }
+  constexpr T* allocate(std::size_t n) { return std::allocator<T>{}.allocate(n); }
   constexpr void deallocate(T* p, std::size_t n) noexcept { std::allocator<T>{}.deallocate(p, n); }
 };
 
diff --git a/libcxx/test/std/strings/basic.string/string.capacity/shrink_to_fit.pass.cpp b/libcxx/test/std/strings/basic.string/string.capacity/shrink_to_fit.pass.cpp
index 6f5e43d1341f53..68360329308bab 100644
--- a/libcxx/test/std/strings/basic.string/string.capacity/shrink_to_fit.pass.cpp
+++ b/libcxx/test/std/strings/basic.string/string.capacity/shrink_to_fit.pass.cpp
@@ -79,8 +79,8 @@ struct increasing_allocator {
     min_bytes += 1000;
     return {static_cast<T*>(::operator new(allocation_amount)), allocation_amount / sizeof(T)};
   }
-  T* allocate(std::size_t n) { return allocate_at_least(n).ptr; }
-  void deallocate(T* p, std::size_t) noexcept { ::operator delete(static_cast<void*>(p)); }
+  T* allocate(std::size_t n) { return std::allocator<T>{}.allocate(n); }
+  void deallocate(T* p, std::size_t) noexcept { std::allocator<T>{}.deallocate(p, n); }
 };
 
 template <typename T, typename U>

libcxx/test/std/containers/sequences/vector.bool/shrink_to_fit.pass.cpp

philnik777 · 2024-11-11T14:45:03Z

I feel like the reasoning is backwards here. AFAICT the only problem is that we're mismatching the n we pass to allocator::deallocate with the n we're passing to allocator::allocate. Nothing requires that there is space for exactly n elements in an allocation - there has to be space for at least that. In fact, most allocators return more than the requested amount in at least some cases, which is the whole reason allocate_at_least was introduced.

ldionne · 2024-11-11T14:53:23Z

@winner245 Actually, how did you come across this issue? Did our test suite fail with another Standard library somehow?

winner245 · 2024-11-11T15:33:34Z

Because I just realized that the same issue #95161 applies to __split_buffer::shrink_to_fit, which can increase the capacity of the underlying buffer when the allocator's allocate_at_least allocates more memory than requested.

I was trying to find a different way to solve the shrink_to_fit issue because I think the root cause to all these shink_to_fit issues is that we used allocate_at_least instead of allocate. If we just revert back to allocate, I believe all the shinrk_to_fit issue would automatically be gone. This is because the standard ([allocator.requirements]/36) mandates that a.allocate(n) must allocate memory for exactly n objects of type T.

winner245 · 2024-11-11T15:35:52Z

I personally do not like the current solution, because allocate_at_least may end up allocating more memory than requested. Then we would have to immediately discard the newly allocated memory. If we revert back to allocate, we wouldn't have the problem.

philnik777 · 2024-11-11T15:42:26Z

I personally do not like the current solution, because allocate_at_least may end up allocating more memory than requested. Then we would have to immediately discard the newly allocated memory. If we revert back to allocate, we wouldn't have the problem.

The intention of allocate_at_least is that the allocator tells us how much it overallocated due to its allocation algorithm, not that it actually allocates more. If you implement the interface properly allocate will only hide the overallocation, not prevent it.

frederick-vs-ja · 2024-11-11T16:23:44Z

@winner245 Actually, how did you come across this issue? Did our test suite fail with another Standard library somehow?

Actually, the tests failed with MSVC STL (or MSVC, the compiler). MSVC detects size mismatch of deallocation in constant evaluation. I guess we can't reliably perform over allocation in constant evaluation.

(It seems that VCRuntime can detect size mismatch at run time.)

ldionne · 2024-11-11T21:25:43Z

The intention of allocate_at_least is that the allocator tells us how much it overallocated due to its allocation algorithm, not that it actually allocates more. If you implement the interface properly allocate will only hide the overallocation, not prevent it.

I agree, and I don't see how this makes the allocator more conforming than it used to be. However, I do think the code makes more sense after the patch than it did before the patch, because implementing allocate() using std::allocator::allocate makes more sense than doing so with allocate_at_least.

Actually, the tests failed with MSVC STL (or MSVC, the compiler). MSVC detects size mismatch of deallocation in constant evaluation. I guess we can't reliably perform over allocation in constant evaluation.

constexpr std::allocation_result<T*> allocate_at_least(std::size_t n);
constexpr T* allocate(std::size_t n) {
  auto res = allocate_at_least(n);
  return res.ptr;
}
constexpr void deallocate(T* p, std::size_t n) noexcept {
  // n may be wrong: it must match res.count which may not be the case if we used allocate_at_least above
  std::allocator<T>{}.deallocate(p, n);
}

Is that what you're saying? I would understand why the current code is a problem in that case, however http://eel.is/c++draft/allocator.members#10.1 says:

If p is memory that was obtained by a call to allocate_at_least, let ret be the value returned and req be the value passed as the first argument to that call. p is equal to ret.ptr and n is a value such that req ≤ n ≤ ret.count.

In other words, I think it's supposed to be just fine to call std::allocator::deallocate with a n that is less than the count returned by a previous allocate_at_least.

philnik777 · 2024-11-11T21:47:50Z

The intention of allocate_at_least is that the allocator tells us how much it overallocated due to its allocation algorithm, not that it actually allocates more. If you implement the interface properly allocate will only hide the overallocation, not prevent it.

I agree, and I don't see how this makes the allocator more conforming than it used to be. However, I do think the code makes more sense after the patch than it did before the patch, because implementing allocate() using std::allocator::allocate makes more sense than doing so with allocate_at_least.

I'm not disagreeing here. The code is an improvement, but I don't think the commit message makes sense.

Actually, the tests failed with MSVC STL (or MSVC, the compiler). MSVC detects size mismatch of deallocation in constant evaluation. I guess we can't reliably perform over allocation in constant evaluation.
constexpr std::allocation_result<T*> allocate_at_least(std::size_t n);
constexpr T* allocate(std::size_t n) {
  auto res = allocate_at_least(n);
  return res.ptr;
}
constexpr void deallocate(T* p, std::size_t n) noexcept {
  // n may be wrong: it must match res.count which may not be the case if we used allocate_at_least above
  std::allocator<T>{}.deallocate(p, n);
}
Is that what you're saying? I would understand why the current code is a problem in that case, however http://eel.is/c++draft/allocator.members#10.1 says:

If p is memory that was obtained by a call to allocate_at_least, let ret be the value returned and req be the value passed as the first argument to that call. p is equal to ret.ptr and n is a value such that req ≤ n ≤ ret.count.

In other words, I think it's supposed to be just fine to call std::allocator::deallocate with a n that is less than the count returned by a previous allocate_at_least.

The calling code is fine. The mismatch between the std::allocator::allocate() and std::allocator::deallocate() is the problem. Note how increasing_allocator::allocate_at_least calls std::allocator::allocate(n + 1000) in some cases, but increasing_allocator::deallocate() always just calls std::allocator::deallocate(n).

Actually, I just noticed that the current code is still wrong, since there is still potentially the same mismatch when calling increasing_allocator::allocate_at_least and increasing_allocator::deallocate.

winner245 · 2024-11-13T15:25:30Z

Thank you all for the very informative discussions here. I agree that the current solution for shrink_to_fit is the best compromise we have at this point. In response to @philnik777's comment that "allocate will only hide the overallocation, not prevent it," I think this PR now serves as a refactoring:

Currently, the increasing_allocator<T> class appears in three different files. We should factor it out into a single location.
Semantically, it makes more sense to let allocate() call std::allocator::allocate, not allocate_at_least (and let allocate_at_least() call std::allocator::allocate_at_least).

libcxx/test/support/increasing_allocator.h

ldionne

It looks like we have agreement on the path forward. I left a few additional comments.

libcxx/test/support/increasing_allocator.h

ldionne · 2024-12-02T19:25:25Z

libcxx/test/support/increasing_allocator.h

+    min_elements += 1000;
+    return std::allocator<T>{}.allocate_at_least(n);
+  }
+  constexpr T* allocate(std::size_t n) { return std::allocator<T>{}.allocate(n); }


I think we still need to make sure we always allocate an increasing amount of memory in allocate(...), otherwise this test allocator would not help catch the case where shrink_to_fit is implemented incorrectly but it uses alloc.allocate instead of alloc.allocate_at_least.

This makes sense to me. I just double checked the shrink_to_fit implementation of vector. Prior to C++23, it used allocate, and since C++23 it uses allocate_at_least . So by requiring both allocate and allocate_at_least to allocate the increasing amount of memory, we can make increasing_allocator more useful for testing both implementations of shrink_to_fit. I also made a little extra changes so that it can be used for both C++23 and pre-C++23.

github-actions · 2024-12-03T04:21:43Z

✅ With the latest revision this PR passed the C/C++ code formatter.

ldionne · 2024-12-03T18:58:48Z

libcxx/test/support/increasing_allocator.h

+#if TEST_STD_VER >= 23
+    return allocate_at_least(n).ptr;
+#else


Why don't we unconditionally call std::allocator<T>().allocate(n)?

Also, we still run into the issue that we had before, which is that we're calling std::allocator<T>().allocate(n) but calling std::allocator<T>().deallocate(ptr, m) with a smaller value than we allocated. That violates allocator.members.

We can either do what @frederick-vs-ja suggested here (which TBH I don't fully understand why that makes the allocator ever-increasing), or we could try something like this: https://godbolt.org/z/P8PK6hvEj

template <typename T> struct increasing_allocator { using value_type = T; std::size_t min_elements = 1000; increasing_allocator() = default; private: tiny_map<void*, std::size_t> actual_sizes_; public: template <typename U> constexpr increasing_allocator(const increasing_allocator<U>& other) noexcept : min_elements(other.min_elements), actual_sizes_(other.actual_sizes_) {} constexpr std::allocation_result<T*> allocate_at_least(std::size_t n) { if (n < min_elements) n = min_elements; min_elements += 1000; std::allocation_result<T*> result = std::allocator<T>().allocate_at_least(n); actual_sizes_.add(result.ptr, n); return result; } constexpr T* allocate(std::size_t n) { if (n < min_elements) n = min_elements; min_elements += 1000; T* result = std::allocator<T>().allocate(n); actual_sizes_.add(result, n); return result; } constexpr void deallocate(T* p, std::size_t n) noexcept { auto actual_size = actual_sizes_.find(p); std::allocator<T>().deallocate(p, actual_size->second); actual_sizes_.remove(p); } friend constexpr bool operator==(increasing_allocator const& a, increasing_allocator const& b) noexcept { return a.actual_sizes_ == b.actual_sizes_; } };

This is a lot more complicated but I think it works. If you agree with that direction, we could land this PR as-is and then tackle this additional change as a follow-up, since it can be done independently from this refactoring.

Thank you for your detailed feedback!

Why don't we unconditionally call std::allocator<T>().allocate(n)?

I agree that it's better to unconditionally call std::allocator<T>().allocate(n) within increasing_allocator::allocate(n). I also think that we should allow only the allocate_at_least(n) function of increasing_allocator to allocate more memory than the requested amount n, i.e., it actually allocates max(n, min_elements). However, I think it makes more sense for increasing_allocator::allocate(n) to allocate exactly or effectively the amount n, otherwise we will see a mismatch between the reported and the actual allocation counts, as shown in this example. This mismatch between the reported allocation and actual allocation could cause further misreporting in vector::capacity() and make shrink_to_fit malfunction, as shown in this demo.

The misreporting arises because the following code in the else branch always reports __n, which is correct only if __alloc.allocate(__n) actually allocates exactly __n. However, if our implementation allows increasing_allocator::allocate(n) to allocate more than n, we would trigger the misreporting problem.

template <class _Ap = _Alloc> [[nodiscard]] _LIBCPP_HIDE_FROM_ABI static constexpr allocation_result<pointer, size_type> allocate_at_least(_Ap& __alloc, size_type __n) { if constexpr (requires { __alloc.allocate_at_least(__n); }) { return __alloc.allocate_at_least(__n); } else { // This always reports n back. If allocate(n) allocates more than n, we have a report/actual mismatch return {__alloc.allocate(__n), __n}; } }

Therefore, allocate(n) should allocate exactly n and allocate_at_least(n) should allocate at least n, as their names suggest.

Also, we still run into the issue that we had before, which is that we're calling std::allocator().allocate(n) but calling std::allocator().deallocate(ptr, m) with a smaller value than we allocated. That violates allocator.members.

We can either do what @frederick-vs-ja suggested here (which TBH I don't fully understand why that makes the allocator ever-increasing), or we could try something like this: godbolt.org/z/P8PK6hvEj

I agree that the allocation/deallocation-size mismatch between allocate(n)/allocate_at_least(n) and deallocate(p, m) always exists because it completely depends on the usages of the functions. Users may provide mismatched m/n values, regardless of our implementation. I think @frederick-vs-ja's idea provides a partial solution or it alleviates this issue by quantizing the allocation/deallocation sizes into discrete power-of-2 values. This allows for mismatches between n and m, as long as they are quantized to the same power of 2. However, completely irrelevant m, n values yielding different equitizations may still cause mismatches.

To fully resolve the allocation/deallocation-size mismatch, we might need to record the actual allocation size during each allocation, similar to how you store the allocation size for each allocated memory p. This way, regardless of what m was provided during deallocate(p, m), we can always determine the actual allocation size from p. Your code provides a neat implementation of this approach.

This is a lot more complicated but I think it works. If you agree with that direction, we could land this PR as-is and then tackle this additional change as a follow-up, since it can be done independently from this refactoring.

I agree it's getting a bit complicated but it seems worth proceeding. I agree with your direction. I'll keep the current refactoring in this PR as I really want the increasing_allocator utility in a single location, so that other work (including my own ongoing work) can use this class without duplicating the code. Follow-up work to further improve this class can be done independently after this PR.

Maybe another way to avoid the size mismatch problem is to use operator new and delete, which eliminates the need to manually specify the deallocation size:

template <typename T> struct increasing_allocator { using value_type = T; std::size_t min_elements = 1000; increasing_allocator() = default; #if TEST_STD_VER >= 23 std::allocation_result<T*> allocate_at_least(std::size_t n) { if (n < min_elements) n = min_elements; min_elements += 1000; return {static_cast<T*>(::operator new(n * sizeof(T))), n}; } #endif // TEST_STD_VER >= 23 T* allocate(std::size_t n) { return static_cast<T*>(::operator new(n * sizeof(T))); } void deallocate(T* p, std::size_t) TEST_NOEXCEPT { ::operator delete(p); } };

ldionne

LGTM. Let's tackle the allocation/deallocation size mismatch separately.

winner245 requested a review from a team as a code owner November 10, 2024 22:25

llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Nov 10, 2024

winner245 force-pushed the winner245/increasing_allocator branch from 4044fbe to 3402842 Compare November 10, 2024 22:40

ldionne reviewed Nov 11, 2024

View reviewed changes

libcxx/test/std/containers/sequences/vector.bool/shrink_to_fit.pass.cpp Outdated Show resolved Hide resolved

winner245 changed the title ~~[libc++][test] Fix increasing_allocator to meet Cpp17Allocator requirements~~ [libc++][test] Refactor increasing_allocator Nov 13, 2024

winner245 force-pushed the winner245/increasing_allocator branch from 3402842 to 9d584f7 Compare November 13, 2024 16:18

frederick-vs-ja reviewed Nov 15, 2024

View reviewed changes

libcxx/test/support/increasing_allocator.h Show resolved Hide resolved

ldionne reviewed Nov 28, 2024

View reviewed changes

libcxx/test/support/increasing_allocator.h Outdated Show resolved Hide resolved

libcxx/test/support/increasing_allocator.h Show resolved Hide resolved

winner245 force-pushed the winner245/increasing_allocator branch 2 times, most recently from 837b6d1 to 4696d88 Compare November 29, 2024 14:11

ldionne reviewed Dec 2, 2024

View reviewed changes

winner245 force-pushed the winner245/increasing_allocator branch from 4696d88 to 068904b Compare December 3, 2024 04:18

winner245 force-pushed the winner245/increasing_allocator branch 3 times, most recently from d725c4f to bdd094f Compare December 3, 2024 12:05

ldionne reviewed Dec 3, 2024

View reviewed changes

winner245 force-pushed the winner245/increasing_allocator branch 2 times, most recently from d6a3c9c to c27b6f6 Compare December 4, 2024 05:14

Fix increasing_allocator to conform to standard

40d7cd4

winner245 added 2 commits December 4, 2024 06:50

Factor out increasing_allocator

4ef957a

Add comment for increasing_allocator

b4aefc5

winner245 force-pushed the winner245/increasing_allocator branch from c27b6f6 to b4aefc5 Compare December 4, 2024 11:52

ldionne approved these changes Dec 5, 2024

View reviewed changes

ldionne merged commit a821937 into llvm:main Dec 5, 2024
59 of 61 checks passed

winner245 deleted the winner245/increasing_allocator branch December 6, 2024 18:50

StephanTLavavej mentioned this pull request Jan 14, 2025

Update llvm-project microsoft/STL#5235

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[libc++][test] Refactor increasing_allocator #115671

[libc++][test] Refactor increasing_allocator #115671

winner245 commented Nov 10, 2024 •

edited

Loading

llvmbot commented Nov 10, 2024

philnik777 commented Nov 11, 2024

ldionne commented Nov 11, 2024

winner245 commented Nov 11, 2024 •

edited

Loading

winner245 commented Nov 11, 2024

philnik777 commented Nov 11, 2024

frederick-vs-ja commented Nov 11, 2024 •

edited

Loading

ldionne commented Nov 11, 2024

philnik777 commented Nov 11, 2024

winner245 commented Nov 13, 2024

ldionne left a comment

ldionne Dec 2, 2024

winner245 Dec 3, 2024 •

edited

Loading

github-actions bot commented Dec 3, 2024 •

edited

Loading

ldionne Dec 3, 2024

winner245 Dec 4, 2024 •

edited

Loading

winner245 Dec 4, 2024

ldionne left a comment

[libc++][test] Refactor increasing_allocator #115671

[libc++][test] Refactor increasing_allocator #115671

Conversation

winner245 commented Nov 10, 2024 • edited Loading

llvmbot commented Nov 10, 2024

philnik777 commented Nov 11, 2024

ldionne commented Nov 11, 2024

winner245 commented Nov 11, 2024 • edited Loading

winner245 commented Nov 11, 2024

philnik777 commented Nov 11, 2024

frederick-vs-ja commented Nov 11, 2024 • edited Loading

ldionne commented Nov 11, 2024

philnik777 commented Nov 11, 2024

winner245 commented Nov 13, 2024

ldionne left a comment

Choose a reason for hiding this comment

ldionne Dec 2, 2024

Choose a reason for hiding this comment

winner245 Dec 3, 2024 • edited Loading

Choose a reason for hiding this comment

github-actions bot commented Dec 3, 2024 • edited Loading

ldionne Dec 3, 2024

Choose a reason for hiding this comment

winner245 Dec 4, 2024 • edited Loading

Choose a reason for hiding this comment

winner245 Dec 4, 2024

Choose a reason for hiding this comment

ldionne left a comment

Choose a reason for hiding this comment

winner245 commented Nov 10, 2024 •

edited

Loading

winner245 commented Nov 11, 2024 •

edited

Loading

frederick-vs-ja commented Nov 11, 2024 •

edited

Loading

winner245 Dec 3, 2024 •

edited

Loading

github-actions bot commented Dec 3, 2024 •

edited

Loading

winner245 Dec 4, 2024 •

edited

Loading