Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Roslyn analyzer/fixer: Simplify invocations that receive a start/offset and a count/length #35981

Open
carlossanlop opened this issue May 7, 2020 · 11 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Memory code-analyzer Marks an issue that suggests a Roslyn analyzer code-fixer Marks an issue that suggests a Roslyn code fixer
Milestone

Comments

@carlossanlop
Copy link
Member

carlossanlop commented May 7, 2020

Summary

Method invocations that receive integer arguments for a start and a length and apply them on an array or a segment, can get the arguments simplified in two specific cases:

Case 1: Exclude both start and length arguments

Consider this code:

byte[] buffer = { /*...*/ };
ReadOnlyMemory<byte> memory = buffer.AsMemory(0, buffer.length);

Both start and length arguments can be removed when start = 0 and length = buffer.Length:

ReadOnlyMemory<byte> memory = buffer.AsMemory();

Case 2: Exclude the length argument

Consider this code:

byte[] buffer = { /*...*/ };
int offsetValue = /* A number between 0 and buffer.Length */;
ReadOnlyMemory<byte> memory = buffer.AsMemory(offsetValue, buffer.length - offsetValue);

The length argument can be removed when start = offsetValue and length = buffer.Length - offsetValue

ReadOnlyMemory<byte> memory = buffer.AsMemory(offsetValue);

Method list

The list we know so far is the following (can be expanded):

System.ArraySegment<T>.Slice(int index, int count)
System.MemoryExtensions.AsMemory(string text, int start, int length)
System.MemoryExtensions.AsMemory<T>(ArraySegment<T> segment, int start, int length)
System.MemoryExtensions.AsMemory<T>(T[] array, int start, int length)
System.MemoryExtensions.AsSpan(string text, int start, int length)
System.MemoryExtensions.AsSpan<T>(ArraySegment<T> segment, int start, int length)
System.MemoryExtensions.AsSpan<T>(T[] array, int start, int length)
System.Memory<T>.ctor(T[] array, int start, int length)
System.Memory<T>.Slice(int start, int length)
System.ReadOnlySpan<T>.ctor(T[] array, int start, int length)
System.ReadOnlySpan<T>.Slice(int start, int length)
System.ReadOnlyMemory<T>.ctor(T[] array, int start, int length)
System.ReadOnlyMemory<T>.Slice(int start, int length)
System.Span<T>.ctor(T[] array, int start, int length)
System.Span<T>.Slice(int start, int length)

Examples to detect

For case 1:

The value in the start argument can be an integer, a variable, or even a method call, but it must also be part of the length argument, as the value being substracted from the length of the array or segment. Examples:

byte[] buffer = { /*...*/ };
int offset = 5;

Memory<byte> a = buffer.AsMemory(1, buffer.Length - 1);
Span<byte> b = buffer.AsSpan(offset, buffer.Length - offset);
Span<byte> c = buffer.AsSpan().Slice(offset, buffer.Length - offset);
ReadOnlyMemory<byte> d = new ReadOnlySpan<byte>(buffer, GetOffset(), buffer.Length - GetOffset());

The invocations get converted to:

Memory<byte> a = buffer.AsMemory(1);
Span<byte> b = buffer.AsSpan(offset);
Span<byte> c = buffer.AsSpan().Slice(offset);
ReadOnlyMemory<byte> d = new ReadOnlySpan<byte>(buffer, GetOffset());

Note: We can discuss if we would like to exclude method calls in the first version of the analyzer, and include it in a future expansion.

For case 2:

A) The value in the start argument is 0, and the value in the length argument is the length of the array or segment. Examples:

byte[] buffer = { /*...*/ };
Memory<byte> a = buffer.AsMemory(0, buffer.Length);
Span<byte> b = buffer.AsSpan(0, buffer.Length);

The invocations get converted to:

Memory<byte> a = buffer.AsMemory();
Span<byte> b = buffer.AsSpan();

Note: The Slice method does not get modified. See the "Examples to not detect" section below for an explanation.


Examples to not detect

A) The offset and/or length arguments are saved in variables in previous lines. The initial version of this analyzer should not detect this case, but it can be considered for a future expansion.

For case 1:

byte[] buffer = { /*...*/ };
int length = buffer.Length - 1;
buffer.AsMemory(1, length);
byte[] buffer = { /*...*/ };
int offset = 1;
int length = buffer.Length - offset;
buffer.AsMemory(offset, length);

For case 2:

byte[] buffer = { /*...*/ };
int length = buffer.Length ;
buffer.AsMemory(0, length);
byte[] buffer = { /*...*/ };
int offset = 0;
int length = buffer.Length;
buffer.AsMemory(offset, length);

B) For case 2, the Slice methods must be handled in a special way, since they do not have an overload without arguments. We can exclude this case in the initial version of the analyzer, and we can consider it for future expansions.
There are 4 possible routes to take for Slice:

byte[] buffer = { /*...*/ };
Span<byte> a = buffer.AsSpan().Slice(0, buffer.Length);
  • The invocation can be simplified:
Span<byte> a = buffer.AsSpan().Slice(0);
  • Or can get the Slice invocation removed altogether:
Span<byte> a = buffer.AsSpan();
  • Or can warn about an unnecessary Slice (if such rule does not exist yet).
  • Or can do nothing.
@carlossanlop carlossanlop added code-analyzer Marks an issue that suggests a Roslyn analyzer code-fixer Marks an issue that suggests a Roslyn code fixer labels May 7, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label May 7, 2020
@Dotnet-GitSync-Bot
Copy link
Collaborator

I couldn't figure out the best area label to add to this issue. Please help me learn by adding exactly one area label.

@stephentoub
Copy link
Member

This would apply to AsSpan as well.

@bartonjs bartonjs added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label May 7, 2020
@danmoseley
Copy link
Member

@jeffhandley (or @bartonjs ) when proposing an analyzer, is it sufficient to just label "code-analyzer" and the analyzer crew will see it? Or should there be an "untriaged" column on https://github.com/dotnet/runtime/projects/46 ?

@bartonjs
Copy link
Member

bartonjs commented May 8, 2020

@danmosemsft We're currently using api-suggestion for that, making it similar to API proposals.

@carlossanlop
Copy link
Member Author

carlossanlop commented May 11, 2020

@bartonjs can we consider this a small improvement over the existing rule, or should we consider it a separate api-suggestion that needs to get approved before I start coding it? I have this other issue opened in the roslyn-analyzers repo to improve over the already merged code: dotnet/roslyn-analyzers#3612

@bartonjs
Copy link
Member

If you're tweaking an existing one, that's a tweak. For making a new one (new rule ID, new class, new messages), it's a new one 😄.

It seems conceptually easy for a new one; but you should try to describe it like I did in #33763: What is it going to fix? (Description and examples) What is it not going to fix? (Description and samples).

E.g.

  • It should apply to both AsMemory and AsSpan--is there anything else the pattern generalizes for? (using the ctors directly, for example. Are there other, similar methods?)
  • I don't expect that an initial version would detect when the length was saved to a local variable, so that would be a good "expect to not match". Or, you could put it in a section of "possible future expansion".
  • Similarly, if it was target.AsSpan(x, y) and the previous line was y = target.Length - x
  • For not matching: target.AsSpan(x, y) where y is a method call. Or a literal. Or an unrelated variable. Or whatever. Just something to show "this sort of call doesn't get modified". Which becomes the basis of a negative test.

I think the biggest thing holding back green-lighting a new analyzer is the first bullet... we've already identified that it's not just AsMemory, but also AsSpan, and probably some ReadOnlyMemory/Memory/ReadOnlySpan/Span ctors. Maybe ArraySegment has some similar ctors.
Maybe it should also simplify AsSpan(0) to AsSpan() (repeat across the matrix). And, therefore, target.AsSpan(0, target.Length) to target.AsSpan().

@jeffhandley
Copy link
Member

@danmosemsft We aren't yet regularly triaging analyzer suggestions. I intend for us to do a round of triage on them as we're winding down our effort on the ones marked as .NET 5.0. I don't expect to add any more into 5.0 though, so any new suggestions are most likely to be tagged as Future.

@carlossanlop
Copy link
Member Author

carlossanlop commented May 11, 2020

I think the biggest thing holding back green-lighting a new analyzer is the first bullet

Thanks for the explanation, @bartonjs. Since I already opened dotnet/roslyn-analyzers#3612 to track the specific tweaks we want to add to the already merged Roslyn analyzer for Stream.ReadAsync/WriteAsync, I say we use the current issue to specifically propose the new Roslyn analyzer for the generic case that will cover all the Memory, ReadOnlyMemory, Span, ReadOnlySpan and potentially ArraySegment cases.

Unless there are any objections, I will edit the title and the main description later today, and will explain what is going to be fixed and what is not, per your suggestion.

Edit: Ignore my first sentence. There's no point in working on the tweaks for the merged analyzer. This issue should track all cases.

@danmoseley
Copy link
Member

danmoseley commented May 11, 2020

@jeffhandley sounds good
[edit -- I remember why I asked -- so that we can mark up for grabs that that we would like but don't plan to do.]

@carlossanlop carlossanlop changed the title Roslyn analyzer/fixer: Simplify AsMemory call when start = offset and length = buffer.length - offset Roslyn analyzer/fixer: Simplify invocations that receive a start/offset and a count/length May 12, 2020
@carlossanlop
Copy link
Member Author

Description updated.

@bartonjs
Copy link
Member

The value in the start argument can be an integer, a variable, or even a method call, but it must also be part of the length argument, as the value being substracted from the length of the array or segment.

Method calls, or anything involving a member-access expression that isn't a field-access expression, are slightly dangerous to replace, because they could have side effects. Reducing two calls to one can thus change the behavior of a program. That might mean that the fixer batch ID needs to be different for member-access replacement, so someone can bulk apply all of the safe fixes then manually walk through the others.

public int GetLength()
{
    _callCount++;
    return _length;
}

@adamsitnik adamsitnik added this to the Future milestone Jul 6, 2020
@adamsitnik adamsitnik removed the untriaged New issue has not been triaged by the area owner label Jul 6, 2020
@stephentoub stephentoub modified the milestones: Future, 8.0.0 Nov 17, 2022
@tannergooding tannergooding modified the milestones: 8.0.0, Future Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Memory code-analyzer Marks an issue that suggests a Roslyn analyzer code-fixer Marks an issue that suggests a Roslyn code fixer
Projects
None yet
Development

No branches or pull requests

8 participants