-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable Dialyzer and ETC cross checks #429
Conversation
1d159ea
to
21451ed
Compare
21451ed
to
748cbf8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! What's the results? Any dialyzer errors found?
.PHONY: dialyze-tests | ||
dialyze-tests: app $(DIALYZER_PLT) | ||
dialyzer $(DIALYZER_OPTS) $(test_data_erls) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we run this in CI to maintain expected results? I.e. to prevent name clashes from being introduced again.
Are there any dialyzer errors in the tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we probably could, but the errors do not overlap completely, so we would need to come up with some should-pass/should-fail/known-problems mapping mechanism. I don't think we can run it just to detect name clashes, we would also have to check the error code and determine which is returned why.
Below is an interesting case:
Gradualizer/test/should_pass/andalso_any.erl
Lines 5 to 10 in 51a8029
-spec f1() -> boolean(). | |
f1() -> | |
true andalso g1(). | |
-spec g1() -> any(). | |
g1() -> 3. |
Dialyzer reports:
andalso_any.erl:5:2: Invalid type specification for function andalso_any:f1/0. The success typing is
() -> 3
Whereas Gradualizer is fine with it, because any()
is compatible with every other type. AFAIU a full-blown gradual typing system according to Siek and Taha would inject a dynamic check in such a place.
Dialyzer infers the type of an integer literal, but it couldn't do so if the value was not a literal.
Gradualizer just believes the spec, even with --infer
, and checks against it, so it assumes everything is fine.
I imagine there are more cases like this, but haven't analysed all of them yet. The current summary is available at https://gist.github.com/erszcz/4d43a77464c87a514e71eecf2811af63#file-check-2022-07-13_170833-tsv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Yes, dializer does a full inference, while gradualizer is silent on purpose in this case.
Regarding Siek & Taha: Since Erlang/BEAM is a dynamic language, there are already runtime checks for everything anyway. The runtime catches it and raises an exception such as badarg. The paper speaks about the case where a dynamic type is added to a statically language without runtime type checks. This is my interpretation anyway.
Could we check for unique module filenames in another way then? Just some simple shell script?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since Erlang/BEAM is a dynamic language, there are already runtime checks for everything anyway.
I agree. However, I think the mentioned example shows unsoundness of the type system. Let's consider this (and the following) paragraph from https://papl.cs.brown.edu/2014/safety-soundness.html#%28part._type-soundness%29:
The central result we wish to have for a given type-system is called soundness. It says this. Suppose we are given an expression (or program) e. We type-check it and conclude that its type is t. When we run e, let us say we obtain the value v. Then v will also have type t.
In our case the type checker will pretend that f1() -> boolean()
, but in fact it turns out to be f1() -> integer()
upon evaluation.
Could we check for unique module filenames in another way then? Just some simple shell script?
I think we can use some find | sort | uniq
magic to do it. I'll come up with something tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying Gradualizer (or any sound gradual type system) should insert an assertion to make sure g1()
in your example above returns a boolean? This is to ensure we get a runtime type error rather than a value of the wrong type...? So f1 above is translated into something like this:
-spec f1() -> boolean().
f1() ->
true andalso (begin G1 = g1(), ?assert(is_boolean(G1)), G1 end).
This is interesting. I didn't think about it in this way before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is explained in a simple way in the introduction of Max S. New, Dustin Jamner, and Amal Ahmed. 2020. Graduality and Parametricity: Together Again for the First
Time. Proc. ACM Program. Lang. 4, POPL, Article 46 (January 2020), 32 pages. https://doi.org/10.1145/3371114
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying Gradualizer (or any sound gradual type system) should insert an assertion to make sure
g1()
in your example above returns a boolean?
I think so. The fragment you quote is aligned with what I remembered from the paper 👍
-spec f1() -> boolean().
f1() ->
true andalso (begin G1 = g1(), ?assert(is_boolean(G1)), G1 end).
So f1 above is translated into something like this [...]
Yes, more or less. Their paper didn't cover Erlang, obviously, and the semantics of the language start to matter at this point. I'd type the Erlang andalso
as andalso(boolean(), any()) -> any()
based on the below:
1> 3 andalso 5.
** exception error: bad argument: 3
2> false andalso 5.
false
3> true andalso 5.
5
So I think the inserted assertion, due to the spec we declare for f1
, should enclose the application of andalso
:
-spec f1() -> boolean().
f1() ->
R = true andalso g1(),
?assert(is_boolean(R)),
R.
However, I don't think we can consider this a goal for Gradualizer given it's not part of a compiler. Perhaps one day it could rewrite .beam
files... but that's a far fetched goal. I think there are more tangible and important milestones before that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@josefs What do you think about our conclusions above? The example functions f1
and g1
...
However, I don't think we can consider this a goal for Gradualizer given it's not part of a compiler. Perhaps one day it could rewrite .beam files... but that's a far fetched goal.
True, but without this I don't believe we have soundness and the types can't really be trusted. I always regarded gradualizer an experiment. The next experiment could be "Gradual Erlang", a language which looks like Erlang and compiles to Erlang with assertions inserted....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting comment that A andalso B
actually is syntactic sugar for case A of true -> B ...
- erlang/otp#5456 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you noted @zuiderkwast, inserting checks when the typechecker coerces to/from any()
is referred to as "sound gradual typing" in the literature. My intention with Gradualizer was to refrain from doing it, simply because I thought that it would be too much work. Adding coercions has the advantage that runtime type errors would be reported earlier and in places that are often easier to understand.
In some implementations is required to insert these coercions because not doing so would lead to a crash/segfault/undefined behaviour. But since we're relying on executing things on top of BEAM we don't have such problems.
There are quite a few papers on measuring the cost of these coercions and there are example programs for which they are quite expensive. Those examples typically involve having any()
embedded in some structure so that the coercions have to traverse the structure.
I don't have anything in principle against adding these kinds of coercions. But from an efficiency point of view it might be preferrable to have this not be the default behaviour and have a flag that enables it.
1572dbb
to
82e3ea9
Compare
Ok, we have a check for name clashes in place. Unless you have some comments, @zuiderkwast, I'd be happy to merge this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, looks good!
FYI, https://gist.github.com/erszcz/4d43a77464c87a514e71eecf2811af63#some-examples has summary of the discrepancies between Dialyzer and Gradualizer in I'm keen on learning about the differences in |
This adds a few changes to facilitate comparisons between existing Erlang type checkers:
should_pass
/should_fail
/ ...*.erltypes
files generated by ETCAll in all, there are no functional changes to the type checker itself.
Intermediate comparison results are available at https://gist.github.com/erszcz/4d43a77464c87a514e71eecf2811af63.