CS8509 - Consider Exhaustiveness For Algebraic Data Types #7854

MelbourneDeveloper · 2024-01-16T20:19:36Z

Analyzer

Diagnostic ID: CS8509

Background

C# doesn't have direct support for Algebraic Data Types (ADTs). But, ADTs are a fundamental aspect of Functional Programming and the lack of exhaustiveness weakens pattern matching based on type in C#. Given that C# is adopting more and more FP constructs, C# should have support for ADTs. However, there is a workaround that achieves ADTs with the existing C# constructs. Here is an example.

namespace Shapes;

public abstract class Shape
{
    /// <summary>
    /// Private constructor, which means nothing outside this class can inherit from this class
    /// </summary>
    private Shape()
    {
    }

    public sealed class Circle(double radius) : Shape
    {
        public double Radius { get; } = radius;
        public double Area() => Math.PI * Radius * Radius;
    }

    public sealed class Rectangle(double width, double height) : Shape
    {
        public double Width { get; } = width;
        public double Height { get; } = height;
        public double Area() => Width * Height;
    }

    public sealed class Triangle(double @base, double height) : Shape
    {
        public double Base { get; } = @base;
        public double Height { get; } = height;
        public double Area() => 0.5 * Base * Height;
    }
}

Note that it impossible to inherit from this class because the constructor is private. That means that matching on subclass is exhaustive (unless I've missed something).

So, the language itself supports ADTs in this way, but CS8509 doesn't recognize that pattern matching can actually do exhaustive checking without the fall back case. Here is what we see.

Furthermore, ADTs massively improve the reliability of code. Consider this default arm. It always results in an exception, even though we may be dealing with a fixed number of subclasses. ADTs promote the ability to cover all possible code paths without throwing Exceptions.

    public double CalculateArea(Shape shape) =>
        shape switch
        {
            Shape.Circle c => Math.PI * c.Radius * c.Radius,
            Shape.Rectangle r => r.Width * r.Height,
            Shape.Triangle t => 0.5 * t.Base * t.Height,
            _ => throw new ArgumentOutOfRangeException(nameof(shape), shape, null)
        };

Describe the improvement

The analyzer should evaluate whether or not it is actually possible to subclass the type. If the type has a private constructor, it is not possible to further subclass the type, so exhaustiveness is achieved. There is no need for a _ case. Also, the _ case is impossible so perhaps there should be a new rule that prohibits the _ when there is already exhaustiveness.

Describe suggestions on how to achieve the rule

Check the type to see if the constructor is private.
Check that all sub types are sealed

If so, the analyzer can assume that the only available type matches are those that exist on the class currently.

Additional context

See the Wikipedia article on ADTs for more information.

The text was updated successfully, but these errors were encountered:

jaredpar · 2024-01-18T03:17:43Z

Note that it impossible to inherit from this class because the constructor is private. That means that matching on subclass is exhaustive (unless I've missed something).

One item to keep in mind is that the assembly you compile against can be very different than the assembly that you run against. Consider that the code pasted in this issue represents 1.0.0.0 of an assembly. It's perfectly valid that in a later version, say 2.0.0.0, that a new subtype of Shape will be added. This is part of the reason such checks are not considered exhaustive. At runtime there is always the possibility that another type could be available even if it wasn't visible at compile time.

The other issue is that there is no affirmative declaration from the developer here that the intent of this construct is to be exhaustive. It could be that the developer intends to evolve this type in the future and add more subtypes at which point considering matches exhaustive is counterproductive. ADT should be declarative, not inferred from existing structure.

These may seem like esoteric concerns but versioning concerns are a very real problem when it comes to ADT.

Overall though this is a language design issue, not a compiler one so moving to csharplang.

MelbourneDeveloper · 2024-01-18T05:33:50Z

Hi @jaredpar

One item to keep in mind is that the assembly you compile against can be very different than the assembly that you run against. Consider that the code pasted in this issue represents 1.0.0.0 of an assembly. It's perfectly valid that in a later version, say 2.0.0.0, that a new subtype of Shape will be added. This is part of the reason such checks are not considered exhaustive. At runtime there is always the possibility that another type could be available even if it wasn't visible at compile time.

That is an interesting point. Yes, it is possible to load or unload assemblies dynamically in .NET, so I can see your point.

The other issue is that there is no affirmative declaration from the developer here that the intent of this construct is to be exhaustive. It could be that the developer intends to evolve this type in the future and add more subtypes at which point considering matches exhaustive is counterproductive. ADT should be declarative, not inferred from existing structure.

🤔

I can see an argument for that, but this isn't the case with other languages. The reason I noticed this in C# was from recent experience with Dart exhaustiveness checking on pattern matching was recently added.

Both these points make me wonder how F# deals with this. Exhaustiveness checking is a core part of pattern matching in F# and FP languages in general. But, now that you mention this, I wonder what happens when the assembly changes at runtime with F#?

jaredpar · 2024-01-18T05:51:45Z

HI @MelbourneDeveloper,

I can see an argument for that, but this isn't the case with other languages.

Depends on the language but in general I find they're declarative. Take F# as an example, their union type is very much declarative.

type Option<'a> =
    | Some of 'a
    | None

The | is a declarative syntax for defining a discriminated union.

Rust likewise uses a declarative syntax

union MyUnion {
    f1: u32,
    f2: f32,
}

There have been several union proposals / discussions for C# that follow roughly the pattern you have here. Essentially a way of marking a type as "this assembly only" and then treating derived types as union members. Those proposals though generally have a specific action that mark the types as being part of a union.

Both these points make me wonder how F# deals with this.

F# actually recommends that you avoid unions in public APIs for these very reasons, unless your quite certain they will never evolve. Once a union is public it can be hard to evolve because of the binary and source compat implications. There have been some recent proposals to make this easier to deal with that are an interesting read.

MelbourneDeveloper · 2024-01-18T06:10:50Z

Yes, F#'s approach is great because it has good support for discriminated unions

But Dart also introduced exhaustiveness checks to implement algebraic data types in this way

https://dart.dev/language/patterns#algebraic-data-types

My opinion is that the C# analyzer should have an option, or there should be multiple different rules, depending on how you want to use them. The problem with the current situation is that there is no rule to tell you when you've implemented the basics.

If I know I've implemented all the cases I intend, it should compile, but if I've left off something, it should tell me

MelbourneDeveloper · 2024-01-18T06:21:00Z

The issue is not limited to type checking. It's also an issue for enums, and always has been.

@jaredpar

I do realize that you can cast 0 to TestEnum and pass it around, which is why the default case is necessary, but still, the existing rules don't give me any indication of whether I've actually supplied all the cases for the enum.

I think it would make more sense if there was another less strict rule that doesn't come with a guarantee that the case is covered, but comes with a reasonable first pass check. This is why I though the Roslyn analyzers were the right place for this.

Ultimately, people are throwing exceptions in these default cases, and this is the problem that we should be avoiding. It doesn't make a lot of sense to throw an exception when we actually know all the reasonable cases.

The worst thing that can happen if the cases are different at runtime is an exception, so it doesn't seem to me that we're risking anything by depending on a rule that detects exhaustiveness - even in a slightly loose sense.

MelbourneDeveloper · 2024-01-18T06:31:53Z

@jaredpar

The point is not ultimately so much about ADTs. It's that the compiler doesn't give you any help with covering all cases.

It's actually quite dangerous because adding a new case doesn't break the build because you're always forced to add a default case.

To your point about the definition evolving...

That's kind of the point. If the possible cases evolve right now, you won't see any compiler errors. You'll just hit the default case, which is often an exception and could wreck your app.

MelbourneDeveloper · 2024-01-18T08:04:38Z

@jaredpar one more slightly related point...

If .NET doesn't already have it, it should have some kind of automatic version compatibility constraint

At the moment, we can specify which versions are compatible at the NuGet package level, but this is extremely fallible. There are a lot of cases where versions won't be compatible - and that can be detected with static analysis

One example would be cases where an exhaustiveness check fails. It couldn't cover the case where the assembly is loaded at runtime, but it would solve most of the problems.

If you add an incompatible version to your project, the NuGet pull would fail. It would mean that the process would need to do further checks after pulling the packages.

Dart does this with null soundness. If I am using a modern version of Dart and I try to pull a package with a Dart version that doesn't have null soundness, it just fails. This is necessary for the integrity of null soundness in recent Dart versions.

HaloFour · 2024-01-18T12:45:27Z

@MelbourneDeveloper

The compiler does provide exhaustiveness checks for some cases, like bool or covering the full range of integral values, etc. It just doesn't make assumptions around the shape of type hierarchies, and it assumes that enums can and will be used as flags (or in other ways).

If you've noticed, the compiler does emit two separate warnings depending on whether a switch over an enum doesn't covers all known values (CS8509) vs. doesn't cover the full range of the underlying integral type (CS8524). You can treat those diagnostics differently. Maybe one can make the argument that these pseudo-ADTs could do the same thing.

I'm with @jaredpar though that I think that ADTs should be an intentional feature. Here's hoping that there's movement on that proposal soon.

333fred · 2024-01-18T15:59:22Z

Closing as a duplicate of #113.

jaredpar · 2024-01-18T16:20:44Z

@MelbourneDeveloper

My opinion is that the C# analyzer should have an option, or there should be multiple different rules, depending on how you want to use them.

There is nothing stopping you from writing an analyzer that does this type of analysis. If you want to exclude the exhaustiveness warning you could add a diagnostic suppressor to your analyzer as well. That should give you the experience that you're looking for here.

.. the existing rules don't give me any indication of whether I've actually supplied all the cases for the enum

Don't agree with that. Consider this example:

public class C {
    public bool M(E e) =>
        e switch {
          E.V1 => true,
                
        };
}

public enum E { V1, V2 }

Here the compiler both warns there is a missing value and gives examples of the values it thinks are missing:

warning CS8509: The switch expression does not handle all possible values of its input type (it is not exhaustive). For example, the pattern 'E.V2' is not covered.

Ultimately, people are throwing exceptions in these default cases, and this is the problem that we should be avoiding. It doesn't make a lot of sense to throw an exception when we actually know all the reasonable cases.

The compiler forces a default case because it recognizes that more cases are possible. The reality here is that an enum has as many cases as the underlying numeric type they are based on. That means even the simple E type I defined above has 2^32 possible values even though I only defined 2. That is just a reality of the way the .NET runtime works and the C# language recognizes that.

F# takes a very similar approach here. Consider this example:

type E =
    | V1 = 1
    | V2 = 2
    
let f x = 
    match x with 
    | E.V1 -> "one"
    | E.V2 -> "two"
    
printfn "%s" (f E.V2)

F# issues a warning here because it recognizes that the match x expression is not exhaustive due to the nature of .NET enum.

That's kind of the point. If the possible cases evolve right now, you won't see any compiler errors. You'll just hit the default case, which is often an exception and could wreck your app.

That is part of the reason why the compiler provides two different warnings here. When using a switch expression over an enum there are two kinds of non-exhaustive:

Did not exhaust all defined values: this issues CS8509
Did not exhaust all possible values: this issues CS8524

If you are just concerned with defined values and don't want to deal with all possible values then disable CS8524 in the project.

MelbourneDeveloper · 2024-01-18T18:58:43Z

@333fred

I am not asking for discriminated unions here, and this is not a duplicate of #113. I originally logged this in the Roslyn analyzers repo.

Closing as a duplicate of #113.

@jaredpar

That is part of the reason why the compiler provides two different warnings here. When using a switch expression over an enum there are two kinds of non-exhaustive:

Did not exhaust all defined values: this issues CS8509
Did not exhaust all possible values: this issues CS8524

I wish I'd known about CS8509 right at the start of the thread, and I wish this discussion hadn't been moved out of the Roslyn repo, because it was never a language design discussion in the first place.

The CS8509 is exactly what I'm looking for and this analyzer rule would only require a slight tweak to achieve what I'm talking about. It already handles missing enums:

However, it doesn't cover cases where type checking, for most intents and purposes is exhaustive:

I understand all the points you've raised here, and not only do I agree, I also see that it's clear that others have already thought and worked through this before I even mentioned it. That's why there are two separate analyzer rules, which is what I actually recommended.

My only request for this issue is that CS8509 be optionally expanded so that it's possible to achieve pseudo ADTs without waiting for the full feature to land in C#. For the record, I would have added an option to CS8509 to ignore exhaustiveness by type, and add a 3rd rule that does a check based on pseudo ADTs. This may have been something the community could build, but seeing this topic got moved here, and is closed now, that ship seems to have sailed.

jaredpar · 2024-01-18T19:24:38Z

My only request for this issue is that CS8509 be optionally expanded so that it's possible to achieve pseudo ADTs without waiting for the full feature to land in C#.

That is probably unlikely to happen. The specific example you have there is pretty clear about exhaustion. But the more general case is not. What if it's an internal ctor and there is no IVT in the system? What about private protected ctors? Or more interestingly what happens when a reference assembly excludes the private ctor? Whether private members do / don't need to be in reference assemblies is a bit of a gray area and some tools exclude it.

This is yet another reason why we prefer this be declarative. Types that want ADT behavior should be declarative about it. Doesn't have to be a language change, could be as simple as an attribute.

At the same time though, there is nothing stopping the community from building an analyzer that does this. The analyzer could detect what it believes are ADT and then suppress the CS8524 diagnostics in cases that ADT exhaustion has occurred.

333fred · 2024-01-18T19:27:34Z

@MelbourneDeveloper I would consider a duplicate of both the request for discriminated unions (#113) and this one: #4032. Perhaps 4032 is the better match I should have linked to start with.

MelbourneDeveloper · 2024-01-18T19:28:50Z

Doesn't have to be a language change, could be as simple as an attribute.

@jaredpar

Just for the record, enums have an attribute for flags, but the analyzer ignores the absence of that.

This is yet another reason why we prefer this be declarative. Types that want ADT behavior should be declarative about it.

Anyway, I understand, and I'd like to see a declarative approach to ADTs in future, but I still think that an extra analyzer, or tweak to the existing one would go a long way.

jaredpar · 2024-01-18T19:35:32Z

Just for the record, enums have an attribute for flags, but the analyzer ignores the absence of that.

Correct. That is because it's hard to reason out which combination of flags are valid and thus create a reasonable exhaustive list. Instead it's treated more like a value which doesn't have a defined set of known values.

shuebner · 2024-01-23T07:58:43Z

@MelbourneDeveloper
FYI A while ago I wrote a diagnostic suppressor for the closed type hierarchy case in your initial post:
https://github.com/shuebner/ClosedTypeHierarchyDiagnosticSuppressor

MelbourneDeveloper · 2024-01-23T08:08:55Z

@shuebner amazing!

I will check it out when I get a chance

MelbourneDeveloper mentioned this issue Jan 16, 2024

Analyzer Support for Exhaustiveness in Pattern Matching Type Checking messerli-informatik-ag/code-style#91

Open

mavasani transferred this issue from dotnet/roslyn-analyzers Jan 17, 2024

dotnet-issue-labeler bot added Area-Compilers untriaged labels Jan 17, 2024

jaredpar transferred this issue from dotnet/roslyn Jan 18, 2024

jaredpar removed untriaged Area-Compilers labels Jan 18, 2024

333fred closed this as not planned Won't fix, can't repro, duplicate, stale Jan 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CS8509 - Consider Exhaustiveness For Algebraic Data Types #7854

CS8509 - Consider Exhaustiveness For Algebraic Data Types #7854

MelbourneDeveloper commented Jan 16, 2024 •

edited

Loading

jaredpar commented Jan 18, 2024 •

edited

Loading

MelbourneDeveloper commented Jan 18, 2024

jaredpar commented Jan 18, 2024 •

edited

Loading

MelbourneDeveloper commented Jan 18, 2024

MelbourneDeveloper commented Jan 18, 2024 •

edited

Loading

MelbourneDeveloper commented Jan 18, 2024 •

edited

Loading

MelbourneDeveloper commented Jan 18, 2024 •

edited

Loading

HaloFour commented Jan 18, 2024

333fred commented Jan 18, 2024

jaredpar commented Jan 18, 2024

MelbourneDeveloper commented Jan 18, 2024

jaredpar commented Jan 18, 2024

333fred commented Jan 18, 2024

MelbourneDeveloper commented Jan 18, 2024

jaredpar commented Jan 18, 2024

shuebner commented Jan 23, 2024

MelbourneDeveloper commented Jan 23, 2024

CS8509 - Consider Exhaustiveness For Algebraic Data Types #7854

CS8509 - Consider Exhaustiveness For Algebraic Data Types #7854

Comments

MelbourneDeveloper commented Jan 16, 2024 • edited Loading

Analyzer

Background

Describe the improvement

Describe suggestions on how to achieve the rule

Additional context

jaredpar commented Jan 18, 2024 • edited Loading

MelbourneDeveloper commented Jan 18, 2024

jaredpar commented Jan 18, 2024 • edited Loading

MelbourneDeveloper commented Jan 18, 2024

MelbourneDeveloper commented Jan 18, 2024 • edited Loading

MelbourneDeveloper commented Jan 18, 2024 • edited Loading

MelbourneDeveloper commented Jan 18, 2024 • edited Loading

HaloFour commented Jan 18, 2024

333fred commented Jan 18, 2024

jaredpar commented Jan 18, 2024

MelbourneDeveloper commented Jan 18, 2024

jaredpar commented Jan 18, 2024

333fred commented Jan 18, 2024

MelbourneDeveloper commented Jan 18, 2024

jaredpar commented Jan 18, 2024

shuebner commented Jan 23, 2024

MelbourneDeveloper commented Jan 23, 2024

MelbourneDeveloper commented Jan 16, 2024 •

edited

Loading

jaredpar commented Jan 18, 2024 •

edited

Loading

jaredpar commented Jan 18, 2024 •

edited

Loading

MelbourneDeveloper commented Jan 18, 2024 •

edited

Loading

MelbourneDeveloper commented Jan 18, 2024 •

edited

Loading

MelbourneDeveloper commented Jan 18, 2024 •

edited

Loading