Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: Generic Enumerable.Range method #97689

Open
ladeak opened this issue Jan 30, 2024 · 36 comments · May be fixed by #106925
Open

[API Proposal]: Generic Enumerable.Range method #97689

ladeak opened this issue Jan 30, 2024 · 36 comments · May be fixed by #106925
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Linq in-pr There is an active PR which will close this issue when it is merged
Milestone

Comments

@ladeak
Copy link
Contributor

ladeak commented Jan 30, 2024

Background and motivation - UPDATED 2024/01/31

So that Enumerable.Range can be used with anything that implements IBinaryInteger<T>.

API Proposal

public static class Enumerable
{
    public static IEnumerable<T> Range<T>(T start, int count) where T : IBinaryInteger<T>
}

API Usage

public record struct MyInterval<T> where T : IBinaryInteger<T>
{
    public T Start { get; }
    public T End { get; }

    // ...
    
    public IEnumerable<T> GetAll() => Enumerable.Range(Start, End);

Alternative Designs

An alternative could be to handle all INumber<T>, in this case having a 3rd increment parameter could be reasonable

public static class Enumerable
{
    public static IEnumerable<T> Range<T>(T start, T end, T increment) where T : INumber<T>
}

Risks

The original proposal with 2 parameters overload for the Range method could be source-breaking, as the following case returns an IEnumerable<int> and not IEnumerablem<byte> today:

byte start = 0, count = 5;
IEnumerable<int> e = Enumerable.Range(start, count);
@ladeak ladeak added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Jan 30, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jan 30, 2024
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jan 30, 2024
@huoyaoyuan huoyaoyuan added area-System.Linq and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jan 30, 2024
@ghost
Copy link

ghost commented Jan 30, 2024

Tagging subscribers to this area: @dotnet/area-system-linq
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

So that Enumerable.Range can be used with anything that implements IBinaryInteger<T>.

API Proposal

public static class Enumerable
{
    public static IEnumerable<T> Range<T>(T start, T end) where T : IBinaryInteger<T>
}

API Usage

public record struct MyInterval<T> where T : IBinaryInteger<T>
{
    public T Start { get; }
    public T End { get; }

    // ...
    
    public IEnumerable<T> GetAll() => Enumerable.Range(Start, End);

Alternative Designs

No response

Risks

No response

Author: ladeak
Assignees: -
Labels:

api-suggestion, area-System.Linq, untriaged

Milestone: -

@eiriktsarpalis eiriktsarpalis added this to the Future milestone Jan 30, 2024
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Jan 30, 2024
@eiriktsarpalis
Copy link
Member

cc @tannergooding

@tannergooding
Copy link
Member

@eiriktsarpalis this seems reasonable overall to me.

There's potentially a desire to support more than just IBinaryInteger and to support any INumber. That would allow you to do effectively T current = start; while (current < end) { yield return current++; }. However, there are some edge cases to consider for float (current++ may yield back current for inf, nan, or large finite values) but the relevant APIs to handle such cases do exist on the INumber type.

It may even be desirable to allow users to specify a custom step (that is, give a range from start to end but in increments of 2 or 4 instead of 1; but that might also be an API with a different name.

@stephentoub
Copy link
Member

This could also be a source-breaking change, right? e.g. today if I have:

IEnumerable<int> e = Enumerable.Range((byte)0, (byte)5);

and then we add the proposed API, that will fail to compile because it won't be able to cast the resulting IEnumerable<byte> to IEnumerable<int>.

@tannergooding
Copy link
Member

Yes, but that is also a simple fix and the compiler is (by default) already raising a diagnostic saying the cast is redundant.

@stephentoub
Copy link
Member

and the compiler is (by default) already raising a diagnostic saying the cast is redundant.

Not if it were instead:

byte start = 0, count = 5;
IEnumerable<int> e = Enumerable.Range(start, count);

Not saying it's a deal-breaker, simply pointing out that this has more source-breaking consequences than come with typical static API additions.

@Clockwork-Muse
Copy link
Contributor

.... that case is likely too simple, anyhow, since the Range is almost certainly being immediately transformed into something else. Whether that ends up breaking is more complicated - eg, using ToDictionary(...) and passing the results to some function is likely breaking.
Ironically the more complicated cases that add the range value to a pre-defined domain type are likely less breaking, because auto upcasting would occur at that point....

@ladeak
Copy link
Contributor Author

ladeak commented Jan 31, 2024

Let me update the proposal with the suggestions:

Including all numbers (I think for floating point numbers, it would make sense to provide the increment) as @tannergooding suggest:

public static class Enumerable
{
    public static IEnumerable<T> Range<T>(T start, T count, T increment) where T : INumber<T>
}

I believe, this API would not be a breaking change anymore. I will also capture the concern of breaking change by @stephentoub in the proposal.

@eiriktsarpalis
Copy link
Member

Seems reasonable, we can try reviewing it.

@eiriktsarpalis eiriktsarpalis added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Jan 31, 2024
@eiriktsarpalis eiriktsarpalis self-assigned this Jan 31, 2024
@eiriktsarpalis eiriktsarpalis added api-suggestion Early API idea and discussion, it is NOT ready for implementation and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Jan 31, 2024
@eiriktsarpalis
Copy link
Member

On second thought, I find it kind of odd that we have a generic count parameter. What does it even mean to write Enumerable.Range(start: 0, count: 3.1, increment: 0.1)? I think this should be changed so that either:

  1. The count parameter is of type int or
  2. Change the second parameter to denote the end of the range.

My personal preference would be for option (2) since it's consistent with similar APIs available in Matlab, Python, or F#:

public static class Enumerable
{
    public static IEnumerable<T> Range<T>(T start, T end, T step) where T : INumber<T>
    {
        // TODO determine if end is exclusive or inclusive upper bound
        for (T current = start; current <= end; current += step)
        {
            yield return current;
            current += step;
        }
    }
}

I suspect this also implies that we should adopt a different name for the method, since it changes the semantics of the second parameter when compared to the existing Range method.

@ladeak
Copy link
Contributor Author

ladeak commented Jan 31, 2024

Yes, I changed from end to count because I realized the current Range method uses count and given this is an overload I was worried if it is going to be confusing or not. Let me update the proposal as suggested and change it back to end (Option 2).

@eiriktsarpalis eiriktsarpalis added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Jan 31, 2024
@viceroypenguin
Copy link

To share additional information from community implementations, there is:

  • Sequence (docs), which is equivalent of what has been described here, with start, end, and step, with end being inclusive.
  • Range (docs), which keeps the start, count, and step parameters, which would generate count number of items, adding step for each one.

@Clockwork-Muse
Copy link
Contributor

.... given that end is likely to be practically exclusive for most floating-point inputs (may be surprising when it ends up being inclusive), I would prefer making it explicitly exclusive.

Note that we should make it safe to use negative steps, where end is negative...

@tannergooding
Copy link
Member

I would expect end to be exclusive so that M(start: 0, end: Length, step: 1) works as "expected" and so that it matches the vast majority of other decisions .NET has made with regards to using an inclusive start and exclusive end.

Exclusive end, overall, makes the code simpler and more intuitive for a large number of scenarios and makes it work more naturally with arrays, ranges, indexing, and other expressions that are typical for .NET developers. It also typically makes the code cheaper to actually execute and to do other common computations such as determining the length (which is simply end - start)

It could also simply be start, count, step (in which case its then consistent, parameter-wise, with the existing Range API) and that would also work for many of the same purposes.

@svick
Copy link
Contributor

svick commented Feb 1, 2024

Should the constraint be where T : INumber<T> or the potentially more general where T : IAdditionOperators<T, T, T>, IComparisonOperators<T, T, bool>?

@eiriktsarpalis
Copy link
Member

It could also simply be start, count, step (in which case its then consistent, parameter-wise, with the existing Range API) and that would also work for many of the same purposes.

That seems a bit counterintuitive though, in most cases I'd have to run a division in my head to determine the appropriate count value.

Should the constraint be where T : INumber or the potentially more general where T : IAdditionOperators<T, T, T>, IComparisonOperators<T, T, bool>?

We could, but it in my view complicates the API signature without a clear use case. Can you define intervals in types that do not conform with the INumber<T> abstraction?

@viceroypenguin
Copy link

It could also simply be start, count, step (in which case its then consistent, parameter-wise, with the existing Range API) and that would also work for many of the same purposes.

That seems a bit counterintuitive though, in most cases I'd have to run a division in my head to determine the appropriate count value.

It may be counter-intuitive to what's desired, but I think the alternative is counter-intuitive to the current API.

For example, if I look at the following code:

var seq1 = Enumerable.Range(2, 3);
var seq2 = Enumerable.Range(2, 3, 1);

What reason do I have to expect that seq1.SequenceEqual(seq2) == false? From a readability standpoint, I think Enumerable.Range(int, int) should have the same behavior as Enumerable.Range(int, int, 1);

Perhaps a different name would be helpful if [start..end) is the desired sequence of values?

@eiriktsarpalis
Copy link
Member

Correct, which is why using a different name might be necessary.

@tannergooding
Copy link
Member

That seems a bit counterintuitive though, in most cases I'd have to run a division in my head to determine the appropriate count value.

You're likely going to end up doing some mental arithmetic no matter what you do and I imagine that many usages will include some attached comment like // [2, 3). There isn't one true answer here, with many languages picking different options; so I believe it is better for .NET to be consistent with how .NET typically does things (which is either taking a count or exclusive end).

The inclusive start and exclusive end is also how many other of the most widely used/popular languages have opted to do this, including python's range(), C/C++ iterators, Rust's step_by on their Range<Idx> type, etc. It is also how I would expect C#'s existing range syntax would end up working if it was ever extended (as some languages support) to allow a step to be provided. The general consistency with other C family of languages and languages like Python which are widely popular for ML/AI also makes it a good choice for people looking to integrate their code with .NET

@viceroypenguin
Copy link

viceroypenguin commented Feb 1, 2024

Proposal

maybe this?

public static class Enumerable
{
    public static IEnumerable<T> Range<T>(T start, int count) where T : INumber<T>;
    public static IEnumerable<T> Range<T>(T start, int count, T step) where T : INumber<T>;
    public static IEnumerable<T> Sequence<T>(T start, T end, T step) where T : INumber<T>;
}

@ladeak
Copy link
Contributor Author

ladeak commented Feb 1, 2024

If you have an int count you are limiting the range for example for long values.

@ladeak
Copy link
Contributor Author

ladeak commented Feb 1, 2024

Correct, which is why using a different name might be necessary.

I do agree, maybe Sequence could be a better name, but it would probably need 2 overloads (the one proposed and one without the increment parameter) which could be increment by T.One similly as the existing Range today.

@tannergooding
Copy link
Member

The only thing I don't like about the two methods approach, is that intuitively I would expect that Range is the one taking start, end and Sequence is the one taking start, count (especially given how range expressions work in most languages). But, it's too late to fix that now.

@terrajobst
Copy link
Member

terrajobst commented Aug 6, 2024

Video

  • Might result in a source breaking change for cases where the first parameter isn't typed as int but implicitly convert to int, such as char and short. With this API, the return value changes from IEnumerable<int> to IEnumerable<TypeOfFirstParameter>
    • We believe it's an acceptable source breaking change
  • Enumerable.Range with a step is also interesting but has nuance that would need further discussion
    • As proposed it's not as expressive as Range(T start, int count, T step) would be
public static class Enumerable
{
    public static IEnumerable<T> Range<T>(T start, int count) where T : IBinaryInteger<T>
}

@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Aug 6, 2024
@julealgon
Copy link

This isn't going to make it to .NET9, right?

@eiriktsarpalis eiriktsarpalis modified the milestones: Future, 10.0.0 Aug 7, 2024
@eiriktsarpalis
Copy link
Member

That's right, this now falls into the .NET 10 bucket.

@alrz
Copy link
Member

alrz commented Aug 8, 2024

Any chance this could be considered alongside #64031?

@eiriktsarpalis
Copy link
Member

I think they could be kept separate.

@terrajobst
Copy link
Member

Any chance this could be considered alongside #64031?

For consistency it's sometimes better to lump things together in order to ensure we produce a solution that makes sense across the board. For cases like this, where we can add things piecemeal without complexity explosion, keeping them separate helps to get easy stuff done without being held hostage by harder work.

@AlexRadch AlexRadch linked a pull request Aug 24, 2024 that will close this issue
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Aug 24, 2024
@AmrAlSayed0
Copy link

AmrAlSayed0 commented Aug 25, 2024

I'm late to the party but technically public static IEnumerable<T> Range<T>(T start, int count) should only need IIncrementOperators<T> as a constraint and public static IEnumerable<T> Range<T>(T start, int count, T step) should only need System.Numerics.IAdditionOperators<T,T,T>. This would make it much more generic and will allow it work with types that are not numbers.

@AlexRadch
Copy link
Contributor

AlexRadch commented Aug 25, 2024

I'm late to the party but technically public static IEnumerable<T> Range<T>(T start, int count) should only need IIncrementOperators<T> as a constraint and public static IEnumerable<T> Range<T>(T start, int count, T step) should only need System.Numerics.IAdditionOperators<T,T,T>. This would make it much more generic and will allow it work with types that are not numbers.

Yes! I can rewrite the implementation to such a public API. It must first be changed and approved.

@tannergooding
Copy link
Member

It is not necessarily always ideal to make it the least constrained possible.

It is largely intentional that the approved shape is only:

public static IEnumerable<T> Range<T>(T start, int count) where T : IBinaryInteger<T>

This is in part because what this does can be non-sensible for other types, such as floating-point where increment may not mutate the value.

API review discussed exposing additional less constrained APIs in the future, such as some Sequence API that allows more types/control and which does not have any particular expectations like Range might have.

@AlexRadch
Copy link
Contributor

This is in part because what this does can be non-sensible for other types, such as floating-point where increment may not mutate the value.

If one is multiplied by the number of steps and added to the start, then the implementation will count correctly for any numbers.

API review discussed exposing additional less constrained APIs in the future, such as some Sequence API that allows more types/control and which does not have any particular expectations like Range might have.

The only additional thing the current implementation does is check for no range overflow.
We can move on to the general case for all numbers if we remove this check.

@tannergooding
Copy link
Member

If one is multiplied by the number of steps and added to the start, then the implementation will count correctly for any numbers.

This will not do the right thing in all cases and it may be unexpected in others. It may be unexpected for a user to see 16777216.0, 16777216.0, 16777218.0 for Range(16777216.0f, 3) for example

It was very explicitly intentional that the API was approved in the shape it was. We can relax this in the future if we decide that is the best case scenario. But that would only occur after fully considering the rest of the scenarios, what it means for arbitrary types, etc.

@AlexRadch
Copy link
Contributor

It was very explicitly intentional that the API was approved in the shape it was. We can relax this in the future if we decide that is the best case scenario. But that would only occur after fully considering the rest of the scenarios, what it means for arbitrary types, etc.

I fully support an extension to more general types and a step argument. This will allow the method to be used for decimal numbers, there are no problems with rounding.
This will also allow you to create descending sequences and repeat the same number the required number of times.

@tannergooding
Copy link
Member

This will allow the method to be used for decimal numbers, there are no problems with rounding.

decimal numbers notably still have the same rounding issues that float and double have. They are still a value with finite precision and range trying to represent data that can have infinite precision and unbounded range.

(1.0m / 3.0m) * 3.0m != 1.0m being one of the simplest examples.

Likewise you can have 1_000_000_000_000_000.0m + 0.000_000_000_000_000_1m == 1_000_000_000_000_000.0m (no change) because it would require more than 28 significant digits to represent.

The only real difference between decimal and double is that representable values are multiples of powers of 10, rather than multiples of powers of 2, and so when you write a literal, that is likely the exact literal you're using in a given computation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-approved API was approved in API review, it can be implemented area-System.Linq in-pr There is an active PR which will close this issue when it is merged
Projects
None yet
Development

Successfully merging a pull request may close this issue.