<regex>: R"([\d-e])" should be rejected #4995

Open
Alcaro opened this issue Sep 30, 2024 · 2 comments
Labels
bug Something isn't working


Alcaro commented Sep 30, 2024

Describe the bug

The regex [\d-e] (a character class containing a range from \d to e) is accepted and treated as \d plus the literal characters - and e. This is contrary to the ECMA-262 spec: \d isn't a single character, so it can't be used as a range endpoint.
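
To make the rule concrete, here is a rough sketch of the check a parser would need: a range inside a character class is only valid when both endpoints are single characters. This is a simplified illustration, not the STL's parser. The names `ClassAtom`, `split_atoms`, and `has_invalid_range` are hypothetical, and the sketch ignores negation, numeric escapes, and anything beyond backslash plus one character.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper types for illustration; not part of any library.
enum class Kind { Literal, Dash, ClassEscape, OtherEscape };

struct ClassAtom {
    Kind kind;
    char ch;  // the character itself, or the letter after the backslash
};

// Splits the text between '[' and ']' into atoms.
// Simplification: an escape is always a backslash plus one character.
std::vector<ClassAtom> split_atoms(const std::string& body) {
    std::vector<ClassAtom> atoms;
    for (std::size_t i = 0; i < body.size(); ++i) {
        if (body[i] == '\\' && i + 1 < body.size()) {
            const char c = body[++i];
            const bool is_class = std::string("dDwWsS").find(c) != std::string::npos;
            atoms.push_back({is_class ? Kind::ClassEscape : Kind::OtherEscape, c});
        } else {
            atoms.push_back({body[i] == '-' ? Kind::Dash : Kind::Literal, body[i]});
        }
    }
    return atoms;
}

// Returns true if some range "A-B" has a class escape as an endpoint,
// or has two unescaped endpoints in descending order.
bool has_invalid_range(const std::string& body) {
    const std::vector<ClassAtom> atoms = split_atoms(body);
    std::size_t i = 0;
    while (i < atoms.size()) {
        // A range needs an atom, an unescaped '-', and another atom.
        const bool is_range = i + 2 < atoms.size() && atoms[i + 1].kind == Kind::Dash;
        if (!is_range) {
            ++i;
            continue;
        }
        const ClassAtom& lo = atoms[i];
        const ClassAtom& hi = atoms[i + 2];
        if (lo.kind == Kind::ClassEscape || hi.kind == Kind::ClassEscape) {
            return true;  // e.g. \d-e: \d is a set, not a single character
        }
        const bool both_unescaped =
            lo.kind != Kind::OtherEscape && hi.kind != Kind::OtherEscape;
        if (both_unescaped &&
            static_cast<unsigned char>(lo.ch) > static_cast<unsigned char>(hi.ch)) {
            return true;  // e.g. b-a
        }
        i += 3;
    }
    return false;
}
```

Under these assumptions, `has_invalid_range("\\d-e")` and `has_invalid_range("b-a")` both report a problem, while `has_invalid_range("\\d\\-e")` does not, because the escaped dash is a literal rather than a range operator.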

Command-line test case

#include <cstdio>
#include <exception>
#include <regex>

int main()
{
    try {
        std::regex r("[\\d-e]");
        puts("it's legal");
    } catch (std::exception& e) {
        puts(e.what());
    }
    try {
        std::regex r("[b-a]");
        puts("it's legal");
    } catch (std::exception& e) {
        puts(e.what());
    }
}

https://godbolt.org/z/oMvEr5YTs

Expected behavior

Both patterns should be rejected (currently, only the latter is).
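
For comparing implementations, the test case can be reduced to a small probe. This is only a sketch; `pattern_compiles` is a hypothetical helper name, and it reports nothing more than whether the `std::regex` constructor threw.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true if std::regex accepts the pattern
// under the default (ECMAScript) grammar, false if construction throws.
bool pattern_compiles(const std::string& pattern) {
    try {
        std::regex r(pattern);
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}
```

Calling `pattern_compiles("[b-a]")` should return false on any conforming implementation, since the range is reversed. The result of `pattern_compiles("[\\d-e]")` is the one that differs between standard libraries, which is the subject of this issue.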

STL version

Ask Godbolt

Additional context

Feel free to close this one as wontfix, if you feel it's ossified into a vendor extension. As long as it's a conscious choice, I'm fine with whichever outcome.

CaseyCarter added the "bug" and "decision needed" labels Sep 30, 2024
StephanTLavavej removed the "decision needed" label Oct 2, 2024
StephanTLavavej changed the title from "<regex>: Is [\d-e] legal?" to "<regex>: R"([\d-e])" should be rejected" Oct 2, 2024

StephanTLavavej (Member) commented

We talked about this at the weekly maintainer meeting and we believe that this is clearly a bug, as we should be following what ECMAScript specifies here. (Technically the C++ Standard cites ECMAScript 3, but modern versions are written in a clearer way - we can refer to them as long as we don't accidentally pick up new features.)

As C++ Standard Library implementations have wildly varying behavior, @barcharcraz suggests checking what Chromium and Firefox do.


Alcaro commented Oct 2, 2024

Careful about that - browsers have plenty of regex extensions that aren't part of ESv3 either.

And even if the C++ spec is updated to cite a newer ES version, the ES spec explicitly calls out those extensions as backwards-compatibility features that non-browsers should omit.
