-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ignore quantifiers when splitting comma-separated regexes #7241
Conversation
Do not split on commas if they are between braces, since that indicates a quantifier. Also added a protection for slow implementations since existing workarounds may result in long strings of chained regular expressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good already, thank you ! I'll put the "need a decision" label so we don't merge it before we decide what we do in 3.0.
regexps: deque[deque[str] | None] = deque([None]) | ||
open_braces = False | ||
for char in value: | ||
if char == "{": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on the result of the discussion we should warn the user of deprecation here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't it be on line 346? That's where we split regular expressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, my bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know a year ago we were floating the idea of deprecations, but we're not ready to commit to a new plan IMO.
Pull Request Test Coverage Report for Build 2762066289
π - Coveralls |
Overkill, slows down unit tests Co-authored-by: Pierre Sassoulas <[email protected]>
π€ According to the primer, this change has no effect on the checked open source code. π€π This comment was generated for commit b48c339 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I left cosmetic feedback only. Thanks for adding a regression test that fails on main.
I think we should merge. We should not let there be silent parsing failures when trying to use regular expressions. (What a gotcha!)
@@ -0,0 +1,5 @@ | |||
When parsing comma-separated lists of regular expressions in the config, ignore | |||
commas that are inside braces since those indicate quantiers, not dilineation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commas that are inside braces since those indicate quantiers, not dilineation | |
commas that are inside braces since those indicate quantifiers, not delineation |
@@ -5,6 +5,8 @@ | |||
from __future__ import annotations | |||
|
|||
import os | |||
import re | |||
import timeit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import timeit |
assert _template_run(in_string) == [re.compile(regex) for regex in expected] | ||
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbenezriravin If you enable maintainers to push to this fork I can resolve the merge conflicts for you and add the cosmetic changes. |
@@ -34,6 +35,7 @@ | |||
"HAS_ISORT_5", | |||
"IsortDriver", | |||
"_check_csv", | |||
"_check_regexp_csv", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is private, this should not be in __all__
.
"_check_regexp_csv", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR currently uses it on the module, so it has to stay, unless we refactor. It was a reasonable choice for the PR author to adhere to the existing muddled pattern. I'm inclined to merge as is.
Will merge in #8898. Thanks! |
Type of Changes
Description
When parsing comma-separated lists of regular expressions in the config, ignore
commas that are inside braces since those indicate quantifiers, not delineation
between expressions.
There may be conflicts with #7228, lmk if you'd prefer if this gets merged first or second.
Closes #7229