Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Dialyzer and ETC cross checks #429

Merged
merged 6 commits into from
Jul 14, 2022

Conversation

erszcz
Copy link
Collaborator

@erszcz erszcz commented Jul 13, 2022

This adds a few changes to facilitate comparisons between existing Erlang type checkers:

  • a new Make rule to enable running Dialyzer on Gradualizer tests
  • renames test modules to avoid name clashes - when running Dialyzer on multiple modules, their names have to be unique; it was not the case across should_pass / should_fail / ...
  • ignores *.erltypes files generated by ETC

All in all, there are no functional changes to the type checker itself.

Intermediate comparison results are available at https://gist.github.com/erszcz/4d43a77464c87a514e71eecf2811af63.

@erszcz erszcz requested a review from zuiderkwast July 13, 2022 14:51
@erszcz erszcz changed the title Dialyzer and ETC cross checks Enable Dialyzer and ETC cross checks Jul 13, 2022
Copy link
Collaborator

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! What's the results? Any dialyzer errors found?

Comment on lines +215 to +218
.PHONY: dialyze-tests
dialyze-tests: app $(DIALYZER_PLT)
dialyzer $(DIALYZER_OPTS) $(test_data_erls)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we run this in CI to maintain expected results? I.e. to prevent name clashes from being introduced again.

Are there any dialyzer errors in the tests?

Copy link
Collaborator Author

@erszcz erszcz Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, we probably could, but the errors do not overlap completely, so we would need to come up with some should-pass/should-fail/known-problems mapping mechanism. I don't think we can run it just to detect name clashes, we would also have to check the error code and determine which is returned why.

Below is an interesting case:

-spec f1() -> boolean().
f1() ->
true andalso g1().
-spec g1() -> any().
g1() -> 3.

Dialyzer reports:

andalso_any.erl:5:2: Invalid type specification for function andalso_any:f1/0. The success typing is
          () -> 3

Whereas Gradualizer is fine with it, because any() is compatible with every other type. AFAIU a full-blown gradual typing system according to Siek and Taha would inject a dynamic check in such a place.

Dialyzer infers the type of an integer literal, but it couldn't do so if the value was not a literal.
Gradualizer just believes the spec, even with --infer, and checks against it, so it assumes everything is fine.

I imagine there are more cases like this, but haven't analysed all of them yet. The current summary is available at https://gist.github.com/erszcz/4d43a77464c87a514e71eecf2811af63#file-check-2022-07-13_170833-tsv.

Copy link
Collaborator

@zuiderkwast zuiderkwast Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Yes, dializer does a full inference, while gradualizer is silent on purpose in this case.

Regarding Siek & Taha: Since Erlang/BEAM is a dynamic language, there are already runtime checks for everything anyway. The runtime catches it and raises an exception such as badarg. The paper speaks about the case where a dynamic type is added to a statically language without runtime type checks. This is my interpretation anyway.

Could we check for unique module filenames in another way then? Just some simple shell script?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Erlang/BEAM is a dynamic language, there are already runtime checks for everything anyway.

I agree. However, I think the mentioned example shows unsoundness of the type system. Let's consider this (and the following) paragraph from https://papl.cs.brown.edu/2014/safety-soundness.html#%28part._type-soundness%29:

The central result we wish to have for a given type-system is called soundness. It says this. Suppose we are given an expression (or program) e. We type-check it and conclude that its type is t. When we run e, let us say we obtain the value v. Then v will also have type t.

In our case the type checker will pretend that f1() -> boolean(), but in fact it turns out to be f1() -> integer() upon evaluation.

Could we check for unique module filenames in another way then? Just some simple shell script?

I think we can use some find | sort | uniq magic to do it. I'll come up with something tomorrow.

Copy link
Collaborator

@zuiderkwast zuiderkwast Jul 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying Gradualizer (or any sound gradual type system) should insert an assertion to make sure g1() in your example above returns a boolean? This is to ensure we get a runtime type error rather than a value of the wrong type...? So f1 above is translated into something like this:

-spec f1() -> boolean(). 
f1() -> 
    true andalso (begin G1 = g1(), ?assert(is_boolean(G1)), G1 end).

This is interesting. I didn't think about it in this way before.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is explained in a simple way in the introduction of Max S. New, Dustin Jamner, and Amal Ahmed. 2020. Graduality and Parametricity: Together Again for the First
Time. Proc. ACM Program. Lang. 4, POPL, Article 46 (January 2020), 32 pages. https://doi.org/10.1145/3371114

image

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying Gradualizer (or any sound gradual type system) should insert an assertion to make sure g1() in your example above returns a boolean?

I think so. The fragment you quote is aligned with what I remembered from the paper 👍

-spec f1() -> boolean(). 
f1() -> 
    true andalso (begin G1 = g1(), ?assert(is_boolean(G1)), G1 end).

So f1 above is translated into something like this [...]

Yes, more or less. Their paper didn't cover Erlang, obviously, and the semantics of the language start to matter at this point. I'd type the Erlang andalso as andalso(boolean(), any()) -> any() based on the below:

1> 3 andalso 5.
** exception error: bad argument: 3
2> false andalso 5.
false
3> true andalso 5.
5

So I think the inserted assertion, due to the spec we declare for f1, should enclose the application of andalso:

-spec f1() -> boolean(). 
f1() ->
  R = true andalso g1(),
  ?assert(is_boolean(R)),
  R.

However, I don't think we can consider this a goal for Gradualizer given it's not part of a compiler. Perhaps one day it could rewrite .beam files... but that's a far fetched goal. I think there are more tangible and important milestones before that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@josefs What do you think about our conclusions above? The example functions f1 and g1...

However, I don't think we can consider this a goal for Gradualizer given it's not part of a compiler. Perhaps one day it could rewrite .beam files... but that's a far fetched goal.

True, but without this I don't believe we have soundness and the types can't really be trusted. I always regarded gradualizer an experiment. The next experiment could be "Gradual Erlang", a language which looks like Erlang and compiles to Erlang with assertions inserted....

Copy link
Collaborator Author

@erszcz erszcz Aug 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting comment that A andalso B actually is syntactic sugar for case A of true -> B ... - erlang/otp#5456 (comment).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you noted @zuiderkwast, inserting checks when the typechecker coerces to/from any() is referred to as "sound gradual typing" in the literature. My intention with Gradualizer was to refrain from doing it, simply because I thought that it would be too much work. Adding coercions has the advantage that runtime type errors would be reported earlier and in places that are often easier to understand.

In some implementations is required to insert these coercions because not doing so would lead to a crash/segfault/undefined behaviour. But since we're relying on executing things on top of BEAM we don't have such problems.

There are quite a few papers on measuring the cost of these coercions and there are example programs for which they are quite expensive. Those examples typically involve having any() embedded in some structure so that the coercions have to traverse the structure.

I don't have anything in principle against adding these kinds of coercions. But from an efficiency point of view it might be preferrable to have this not be the default behaviour and have a flag that enables it.

@erszcz
Copy link
Collaborator Author

erszcz commented Jul 14, 2022

Ok, we have a check for name clashes in place. Unless you have some comments, @zuiderkwast, I'd be happy to merge this.

Copy link
Collaborator

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, looks good!

@erszcz erszcz merged commit 8d7fbf4 into josefs:master Jul 14, 2022
@erszcz erszcz deleted the dialyzer-cross-check branch July 14, 2022 15:28
@erszcz
Copy link
Collaborator Author

erszcz commented Jul 14, 2022

FYI, https://gist.github.com/erszcz/4d43a77464c87a514e71eecf2811af63#some-examples has summary of the discrepancies between Dialyzer and Gradualizer in should_pass tests. Two of them are already fixed: one here, one in #430. One or two of them need creating bug tickets, which I'll do at some point.

I'm keen on learning about the differences in should_fail tests, though. Dialyzer seems to be way more permissive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants