Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<format>: Misbehavior for huge strings #4479

Closed
StephanTLavavej opened this issue Mar 14, 2024 · 4 comments · Fixed by #4631
Closed

<format>: Misbehavior for huge strings #4479

StephanTLavavej opened this issue Mar 14, 2024 · 4 comments · Fixed by #4631
Labels
bug Something isn't working fixed Something works now, yay! format C++20/23 format

Comments

@StephanTLavavej
Copy link
Member

Noticed while reviewing #4438, but affects main. I don't see any Standardese allowing us to assume that sizes and widths can fit in an int.

D:\GitHub\STL\out\x64>type meow.cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <format>
#include <print>
#include <string>
#include <string_view>
#include <tuple>
using namespace std;

static_assert(sizeof(void*) == 8, "This test requires 64-bit.");

void test_string(const size_t n) {
    const auto start = chrono::steady_clock::now();

    const string str(n, 'X');

    {
        const string result = format("{}", str);
        assert(result.size() == n);
        assert(result == str);
    }

    {
        const string result = format("{:o^{}}", str, n + 4);
        assert(result.size() == n + 4);
        assert(result.starts_with("oo"));
        assert(string_view{result}.substr(2, n) == str);
        assert(result.ends_with("oo"));
    }

    const auto finish  = chrono::steady_clock::now();
    const auto elapsed = chrono::duration_cast<chrono::seconds>(finish - start);

    println("test_string({}) passed, {} elapsed.", n, elapsed);
}

#ifdef TEST_TUPLE
void test_tuple(const size_t n) {
    const auto start = chrono::steady_clock::now();

    tuple<string> t;
    get<0>(t).resize(n, 'X');

    {
        const string result = format("{}", t);
        assert(result.size() == n + 4);
        assert(result.starts_with("(\""));
        assert(string_view{result}.substr(2, n) == get<0>(t));
        assert(result.ends_with("\")"));
    }

    {
        const string result = format("{:o^{}}", t, n + 8);
        assert(result.size() == n + 8);
        assert(result.starts_with("oo(\""));
        assert(string_view{result}.substr(4, n) == get<0>(t));
        assert(result.ends_with("\")oo"));
    }

    const auto finish  = chrono::steady_clock::now();
    const auto elapsed = chrono::duration_cast<chrono::seconds>(finish - start);

    println("test_tuple({}) passed, {} elapsed.", n, elapsed);
}
#endif // TEST_TUPLE

int main() {
    test_string(5);
    test_string(50'000'000);
    test_string(500'000'000);
    test_string(2'000'000'000);
    test_string(3'000'000'000);

#ifdef TEST_TUPLE
    test_tuple(5);
    test_tuple(50'000'000);
    test_tuple(500'000'000);
    test_tuple(2'000'000'000);
    test_tuple(3'000'000'000);
#endif // TEST_TUPLE
}
D:\GitHub\STL\out\x64>cl /EHsc /nologo /W4 /std:c++latest /MT /O2 meow.cpp && meow
meow.cpp
test_string(5) passed, 0s elapsed.
test_string(50000000) passed, 1s elapsed.
test_string(500000000) passed, 14s elapsed.
test_string(2000000000) passed, 56s elapsed.
[***SILENT CRASH***]
@cpplearner
Copy link
Contributor

cpplearner commented Mar 15, 2024

The Standard doesn't say so, but MSVC STL, fmtlib (source code), and libc++ (source code) all throw a format_error if the width (or precision) is larger than INT_MAX.

STL/stl/inc/format

Lines 1645 to 1654 in d6efe94

template <class _Handler, class _FormatArg>
_NODISCARD constexpr int _Get_dynamic_specs(const _FormatArg _Arg) {
_STL_INTERNAL_STATIC_ASSERT(_Is_any_of_v<_Handler, _Width_checker, _Precision_checker>);
const unsigned long long _Val = _STD visit_format_arg(_Handler{}, _Arg);
if (_Val > static_cast<unsigned long long>((numeric_limits<int>::max)())) {
_Throw_format_error("Number is too big.");
}
return static_cast<int>(_Val);
}

@frederick-vs-ja
Copy link
Contributor

Would fixing this need to break ABI as we've already stored _Width and _Precision as ints for C++20 formatters?

int _Width = 0;
int _Precision = -1;

(C++23 formatters can probably be fixed because ABI-breakage is still allowed.)

@StephanTLavavej
Copy link
Member Author

We talked about this at the weekly maintainer meeting and we think we should throw an exception for these pathological cases, followed by reconsidering the int restriction in vNext.

@cpplearner
Copy link
Contributor

cpplearner commented Mar 23, 2024

We already throw an exception if the width is too large, and we don't need to reject oversized strings: there will simply be no padding if the string is longer than the requested width.

I can't check right now, but I think the string formatter already works correctly using _Measure_string_prefix (which also handles the precision). The new tuple formatter should also use it to compute the display width of the output.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working fixed Something works now, yay! format C++20/23 format
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants