[API Proposal]: Access Regex group without allocations #73223
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
Issue Details
Background and motivation
In class
API Proposal
namespace System.Text.RegularExpressions;
public class Match : Group
{
    public ReadOnlySpan<char> GetGroupValueSpan(int groupnum);
}
API Usage
string text = "One car red car blue car";
string pat = @"(\w+)\s+(car)";
Regex r = new Regex(pat, RegexOptions.IgnoreCase);
Match m = r.Match(text);
ReadOnlySpan<char> word = m.GetGroupValueSpan(1); // matches "One"
Alternative Designs
Risks
No response
|
What would be the reason for that? The whole point of this new method is to avoid allocations, but then you also add a version that has an extra allocation? Especially since anyone who does need a
This one does sound useful to me. |
Agree! The idea/question mark came from the fact that we provide both |
Thanks for the proposal @ronaldvdv. Not sure if you are aware, but we added some amortized-allocation free APIs that loop through matches in .NET 7, in particular, we added |
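For readers unfamiliar with the .NET 7 additions, here is a minimal sketch of one amortized allocation-free API that does exist there, Regex.EnumerateMatches. Whether this is the exact API the comment above goes on to name is an assumption, since the comment is cut off. The sample text and pattern are borrowed from the issue's usage example. Note that the ValueMatch values it yields expose only Index and Length, with no access to capture groups, which is the gap this issue is about.

```csharp
using System;
using System.Text.RegularExpressions;

string text = "One car red car blue car";
Regex r = new Regex(@"(\w+)\s+(car)", RegexOptions.IgnoreCase);

int matchCount = 0;
string firstMatch = "";

// EnumerateMatches (.NET 7+) yields ValueMatch structs instead of Match objects.
foreach (ValueMatch vm in r.EnumerateMatches(text))
{
    // Only the position of the whole match is available; slice the input to read it.
    ReadOnlySpan<char> whole = text.AsSpan(vm.Index, vm.Length);
    if (matchCount == 0)
    {
        firstMatch = whole.ToString(); // ToString allocates; used here only for display
    }
    matchCount++;
}

Console.WriteLine($"{matchCount} matches, first: {firstMatch}");
```

The slice into the input is the only way to read the matched text here, and there is no equivalent for an individual group.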
Interesting!! Yes, indeed if we have an amortized-allocation free option, it's nice to extend that to include access to individual captures within the match.
|
Assuming you are using the same regex object to call for the different inputs, then yes. When
That is the tricky part and why it wasn't added yet for 7.0. |
Thanks for providing more background! I still struggle a little bit to understand how the approach described in #65011 would help for the original question in this issue. I hope you don't mind me asking a follow-up here. If we would tell It seems to me we would still need an additional method that skips the creation of |
Not at all, questions are always welcomed 😄
Nothing would change with the internal Match object, that would still have the same fields that track capture data, but remember that this Match object gets reused, so we can't rely on it to get the capture groups data. That means that the way to get this info would have to be through the
It would improve since we wouldn't have to return a
Exactly. With this approach you only allocate |
Background and motivation
In class Capture (namespace System.Text.RegularExpressions) we now have the nice addition of the ValueSpan property which allows me to access the captured text efficiently. However, to be able to access that property I would still need to access Match.Groups[num], which would allocate the full GroupCollection and all Group instances, which (for my specific use case) defeats the purpose a bit.
API Proposal
API Usage
Alternative Designs
string
string
Risks
No response
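To make the motivation concrete, here is a small sketch of how a group's text can be read as a span today, using the same sample text and pattern as the usage example. It relies only on the existing Match.Groups indexer and Capture.ValueSpan property. The proposed GetGroupValueSpan method is not used because it does not exist, and the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

string text = "One car red car blue car";
Regex r = new Regex(@"(\w+)\s+(car)", RegexOptions.IgnoreCase);
Match m = r.Match(text);

// Current approach: accessing Groups creates the GroupCollection and Group objects
// (the allocations described in the issue); ValueSpan then slices the input
// without allocating a string.
ReadOnlySpan<char> word = m.Groups[1].ValueSpan;

// Converted to a string here only so it can be printed and checked.
string wordText = word.ToString();
Console.WriteLine(wordText);
```

The proposal would let the caller skip the Groups access and get the same span straight from the Match.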