Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util: Make Parse{Int,UInt}{32,64} use locale independent std::from_chars(…) (C++17) instead of locale dependent strto{l,ll,ul,ull} #20457

Merged
merged 1 commit into from
Sep 30, 2021

Conversation

practicalswift
Copy link
Contributor

Make Parse{Int,UInt}{32,64} use locale independent std::from_chars(…) (C++17) instead of locale dependent strto{l,ll,ul,ull}.

About std::from_chars: "Unlike other parsing functions in C++ and C libraries, std::from_chars is locale-independent, non-allocating, and non-throwing."

@laanwj
Copy link
Member

laanwj commented Nov 23, 2020

Concept ACK, see also discussion in #20452 (comment)

@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 23, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #20452 (util: Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17) by practicalswift)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@practicalswift practicalswift force-pushed the locale-independent-parseint branch 3 times, most recently from 319da34 to 5df69f1 Compare November 23, 2020 15:06
maflcko pushed a commit that referenced this pull request Nov 24, 2020
… leading +/-/0:s

05c1095 test: Add testing of ParseInt/ParseUInt edge cases with leading +/-/0:s (practicalswift)

Pull request description:

  Add testing of `ParseInt`/`ParseUInt` edge cases with leading `+`/`-`/`0`:s.

  Context: While working on #20457 and #20452 I noticed some edge cases which our unit tests are currently not covering.

ACKs for top commit:
  MarcoFalke:
    review ACK 05c1095
  laanwj:
    Code review ACK 05c1095
  jonatack:
    ACK 05c1095
  promag:
    Code review ACK 05c1095.

Tree-SHA512: bdfb94d8fa0293512dbba89907cb6dd0f8b1418d878267dd6d49c8c397a0e5b9714441345565d41a6a909a1cda052ef7cccece822f355ff604fcf85f2dc8136f
@practicalswift practicalswift force-pushed the locale-independent-parseint branch from 5df69f1 to 6ad4ce3 Compare November 24, 2020 15:08
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Nov 24, 2020
…es with leading +/-/0:s

05c1095 test: Add testing of ParseInt/ParseUInt edge cases with leading +/-/0:s (practicalswift)

Pull request description:

  Add testing of `ParseInt`/`ParseUInt` edge cases with leading `+`/`-`/`0`:s.

  Context: While working on bitcoin#20457 and bitcoin#20452 I noticed some edge cases which our unit tests are currently not covering.

ACKs for top commit:
  MarcoFalke:
    review ACK 05c1095
  laanwj:
    Code review ACK 05c1095
  jonatack:
    ACK 05c1095
  promag:
    Code review ACK 05c1095.

Tree-SHA512: bdfb94d8fa0293512dbba89907cb6dd0f8b1418d878267dd6d49c8c397a0e5b9714441345565d41a6a909a1cda052ef7cccece822f355ff604fcf85f2dc8136f
@practicalswift practicalswift force-pushed the locale-independent-parseint branch from 6ad4ce3 to 5521590 Compare December 1, 2020 13:02
@practicalswift
Copy link
Contributor Author

Like #20452 this one is blocked on #20460. That's the reason for the CI failures :)

@practicalswift
Copy link
Contributor Author

Could remove "Waiting for author"? I don't know if we have any "Blocked on another PR/issue" (or "Waiting for toolchain upgrade") tag, but this PR is blocked on #20460 :)

@jonatack
Copy link
Member

jonatack commented Mar 2, 2021

Concept ACK. Looks very good; hope to see this unblocked with std::fs.

PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Sep 21, 2021
…es with leading +/-/0:s

05c1095 test: Add testing of ParseInt/ParseUInt edge cases with leading +/-/0:s (practicalswift)

Pull request description:

  Add testing of `ParseInt`/`ParseUInt` edge cases with leading `+`/`-`/`0`:s.

  Context: While working on bitcoin#20457 and bitcoin#20452 I noticed some edge cases which our unit tests are currently not covering.

ACKs for top commit:
  MarcoFalke:
    review ACK 05c1095
  laanwj:
    Code review ACK 05c1095
  jonatack:
    ACK 05c1095
  promag:
    Code review ACK 05c1095.

Tree-SHA512: bdfb94d8fa0293512dbba89907cb6dd0f8b1418d878267dd6d49c8c397a0e5b9714441345565d41a6a909a1cda052ef7cccece822f355ff604fcf85f2dc8136f
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Sep 24, 2021
…es with leading +/-/0:s

05c1095 test: Add testing of ParseInt/ParseUInt edge cases with leading +/-/0:s (practicalswift)

Pull request description:

  Add testing of `ParseInt`/`ParseUInt` edge cases with leading `+`/`-`/`0`:s.

  Context: While working on bitcoin#20457 and bitcoin#20452 I noticed some edge cases which our unit tests are currently not covering.

ACKs for top commit:
  MarcoFalke:
    review ACK 05c1095
  laanwj:
    Code review ACK 05c1095
  jonatack:
    ACK 05c1095
  promag:
    Code review ACK 05c1095.

Tree-SHA512: bdfb94d8fa0293512dbba89907cb6dd0f8b1418d878267dd6d49c8c397a0e5b9714441345565d41a6a909a1cda052ef7cccece822f355ff604fcf85f2dc8136f
fanquake added a commit that referenced this pull request Sep 28, 2021
…irements

182de7b ci: update minimum compiler requirements for std::filesystem (fanquake)
04f5baf doc: update minimum compiler requirements for std::filesystem (fanquake)

Pull request description:

  This increases the minimum required compiler versions to Clang 7 and GCC 8.1. This has been split out of #20744 (migration to `std::filesystem`), as it's also a requirement for some other changes, such as #20452 or #20457 which want to make use of `std::from_chars`. As well as #20435, which is also `std::filesystem` related. Given that the `std::filesystem` changes are moving ahead, splitting out this change to let other PRs take advantage of the new requirements seems worthwhile.

  Clang 7 has been available in Debian since [Stretch (oldoldstable)](https://packages.debian.org/stretch/clang-7) and in Ubuntu since [Bionic (18.04)](https://packages.ubuntu.com/bionic-updates/clang-7). GCC 8 has been available in Debian since [Buster (oldstable)](https://packages.debian.org/buster/gcc) and in Ubuntu since [Bionic (18.04)](https://packages.ubuntu.com/bionic/gcc-8). CentOS 8 also packages GCC 8.

  The CI changes here give us one build with GCC 8, and another using Clang 7 on top of libc++.

  Note that the minimum required libc++ in dependencies.md is unchanged as, at least for `<filesystem>`, and the `*_chars` use cases, libc++ 7 [should be sufficient](https://en.cppreference.com/w/cpp/compiler_support/17).

  I've tested that building `<filesystem>` code using Clang 7 & libc++ works. i.e `clang++-7 -std=c++17 fs.cpp -stdlib=libc++ -lc++fs`. Also that building `<filesystem>` code with Clang 7 and libstdc++ 8 works. i.e `clang++-7 -std=c++17  fs.cpp -lstdc++fs`.

ACKs for top commit:
  MarcoFalke:
    review ACK 182de7b

Tree-SHA512: 5bc151c4be58005711eed6bd8a091f3417f75a0218c11c08cffff9d749edadd965726bb7856a8e693e96e69ed0596989cda1aac4b29fb6d30705b1687a5b3363
@practicalswift
Copy link
Contributor Author

practicalswift commented Sep 29, 2021

Now that #20460 has been merged I think this PR should be ready for final review :)

See also related PR #20452.

@laanwj
Copy link
Member

laanwj commented Sep 30, 2021

Nice!
Code review ACK 4747db8

@laanwj laanwj merged commit 2d8e0c0 into bitcoin:master Sep 30, 2021
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Sep 30, 2021
if (first_nonmatching != str.data() + str.size() || error_condition != std::errc{}) {
return std::nullopt;
}
return {result};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what the { are supposed to do? I think they can be removed, as they are confusing as-is. If you want them to check (for whatever reason) that result fits in type T, you will have to type return T{result};

Copy link
Contributor Author

@practicalswift practicalswift Oct 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this specific case return result; should be equivalent to return {result}; since the return type is std::optional<T> and result is of type T. I can see your point about braces looking a bit odd here: feel free to change :)

Some context behind the "braced return" style more generally: it can help avoid nasty implicit narrowing conversions like in the following example.

$ cat without-list-initialization.cpp
#include <cstdint>

struct F {
  F(uint8_t) {}
};

F f() { return 1024; }
$ clang++ -c without-list-initialization.cpp
# No errors
$ echo $?
0
$ cat with-list-initialization.cpp
#include <cstdint>

struct F {
  F(uint8_t) {}
};

F f() { return {1024}; }
$ clang++ -c with-list-initialization.cpp
with-list-initialization.cpp:7:18: error: constant expression evaluates to 1024 which cannot be narrowed to type 'uint8_t' (aka 'unsigned char') [-Wc++11-narrowing]
$ echo $?
1

See item (8) and "narrowing conversions" in https://en.cppreference.com/w/cpp/language/list_initialization :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

crACK

* parsed value is not in the range representable by the type T.
*/
template <typename T>
std::optional<T> ToIntegral(const std::string& str)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if it makes sense to explain in simple terms the format that is required to be parsed successfully. In regex terms it would be -?[0-9]+?

@maflcko
Copy link
Member

maflcko commented Oct 1, 2021

Created a follow-up in #23156

maflcko pushed a commit to bitcoin-core/gui that referenced this pull request Oct 4, 2021
…nd ParseDouble

fa9d72a Remove unused ParseDouble and ParsePrechecks (MarcoFalke)
fa3cd28 refactor: Remove unused ParsePrechecks from ParseIntegral (MarcoFalke)

Pull request description:

  All of the `ParsePrechecks` are already done by `ToIntegral`, so remove them from `ParseIntegral`.

  Also:
  * Remove redundant `{}`. See bitcoin/bitcoin#20457 (comment)
  * Add missing failing c-string test case
  * Add missing failing test cases for non-int32_t integral types

ACKs for top commit:
  laanwj:
    Code review ACK fa9d72a, good find on ParseDouble not being used at all, and testing for behavior of embedded NULL characters is always a good thing.
  practicalswift:
    cr ACK fa9d72a

Tree-SHA512: 3d654dcaebbf312dd57e54241f9aa6d35b1d1d213c37e4c6b8b9a69bcbe8267a397474a8b86b57740fbdd8e3d03b4cdb6a189a9eb8e05cd38035dab195410aa7
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Oct 4, 2021
…Double

fa9d72a Remove unused ParseDouble and ParsePrechecks (MarcoFalke)
fa3cd28 refactor: Remove unused ParsePrechecks from ParseIntegral (MarcoFalke)

Pull request description:

  All of the `ParsePrechecks` are already done by `ToIntegral`, so remove them from `ParseIntegral`.

  Also:
  * Remove redundant `{}`. See bitcoin#20457 (comment)
  * Add missing failing c-string test case
  * Add missing failing test cases for non-int32_t integral types

ACKs for top commit:
  laanwj:
    Code review ACK fa9d72a, good find on ParseDouble not being used at all, and testing for behavior of embedded NULL characters is always a good thing.
  practicalswift:
    cr ACK fa9d72a

Tree-SHA512: 3d654dcaebbf312dd57e54241f9aa6d35b1d1d213c37e4c6b8b9a69bcbe8267a397474a8b86b57740fbdd8e3d03b4cdb6a189a9eb8e05cd38035dab195410aa7
kwvg pushed a commit to kwvg/dash that referenced this pull request Oct 12, 2021
…es with leading +/-/0:s

05c1095 test: Add testing of ParseInt/ParseUInt edge cases with leading +/-/0:s (practicalswift)

Pull request description:

  Add testing of `ParseInt`/`ParseUInt` edge cases with leading `+`/`-`/`0`:s.

  Context: While working on bitcoin#20457 and bitcoin#20452 I noticed some edge cases which our unit tests are currently not covering.

ACKs for top commit:
  MarcoFalke:
    review ACK 05c1095
  laanwj:
    Code review ACK 05c1095
  jonatack:
    ACK 05c1095
  promag:
    Code review ACK 05c1095.

Tree-SHA512: bdfb94d8fa0293512dbba89907cb6dd0f8b1418d878267dd6d49c8c397a0e5b9714441345565d41a6a909a1cda052ef7cccece822f355ff604fcf85f2dc8136f
@rebroad
Copy link
Contributor

rebroad commented Oct 13, 2021

  CXX      qt/qt_bitcoin_qt-main.o
In file included from qt/main.cpp:5:
In file included from ./qt/bitcoin.h:12:
In file included from ./interfaces/node.h:10:
In file included from ./net.h:9:
In file included from ./addrman.h:9:
In file included from ./netaddress.h:19:
./util/strencodings.h:16:10: fatal error: 'charconv' file not found
#include <charconv>
         ^~~~~~~~~~
1 error generated.
Makefile:12673: recipe for target 'qt/qt_bitcoin_qt-main.o' failed
make: *** [qt/qt_bitcoin_qt-main.o] Error 1

@sipa
Copy link
Member

sipa commented Oct 13, 2021

@rebroad Bizarre, sounds like you're not using a C++17 compliant compiler.

@rebroad
Copy link
Contributor

rebroad commented Oct 13, 2021

@rebroad Bizarre, sounds like you're not using a C++17 compliant compiler.

Still getting the same error when compiling in a clean directory... could it be an issue caused by ccache?

@rebroad
Copy link
Contributor

rebroad commented Oct 13, 2021

4747db876154ddd828c03d9eda10ecf8b25d8dc8 is the first bad commit
commit 4747db876154ddd828c03d9eda10ecf8b25d8dc8
Author: practicalswift <[email protected]>
Date:   Sat Sep 18 04:30:30 2021 +0000

    util: Introduce ToIntegral<T>(const std::string&) for locale independent parsing using std::from_chars(…) (C++17)

    util: Avoid locale dependent functions strtol/strtoll/strtoul/strtoull in ParseInt32/ParseInt64/ParseUInt32/ParseUInt64

    fuzz: Assert equivalence between new and old Parse{Int,Uint}{8,32,64} functions

    test: Add unit tests for ToIntegral<T>(const std::string&)

:040000 040000 ec09e7aa0a2bec6f5021c2bf2994a0fc67e65c1f aa5170295604fa240939c86c02f26a6e436be964 M      src
:040000 040000 aa260b3d4bb2c554e68a7b0baa241a2cfc97818a c5d3df2e7337c8cc6a5a2141844b5c1b81a6ba51 M      test

kwvg pushed a commit to kwvg/dash that referenced this pull request Oct 21, 2022
…nt std::from_chars(…) (C++17) instead of locale dependent strto{l,ll,ul,ull}
kwvg added a commit to kwvg/dash that referenced this pull request Oct 21, 2022
…nt std::from_chars(…) (C++17) instead of locale dependent strto{l,ll,ul,ull}
kwvg added a commit to kwvg/dash that referenced this pull request Oct 27, 2022
…nt std::from_chars(…) (C++17) instead of locale dependent strto{l,ll,ul,ull}
kwvg added a commit to kwvg/dash that referenced this pull request Oct 27, 2022
…nt std::from_chars(…) (C++17) instead of locale dependent strto{l,ll,ul,ull}
kwvg added a commit to kwvg/dash that referenced this pull request Oct 30, 2022
…nt std::from_chars(…) (C++17) instead of locale dependent strto{l,ll,ul,ull}
kwvg added a commit to kwvg/dash that referenced this pull request Oct 30, 2022
…nt std::from_chars(…) (C++17) instead of locale dependent strto{l,ll,ul,ull}
@bitcoin bitcoin locked and limited conversation to collaborators Oct 30, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants