Skip to content

JamesHertz/mini-ml

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ICL-Project

In this repository you will find an haskell's implementation of the portion of the micro-ml programming language proposed for the first part of the pratical assignment of ICL. You will find a typechecker, a parser, an interpreter and a compiler for the language.

Due to circumstances we've decided to write the report here in this README file.

Build & Run

To build you need to have ghc (haskell compiler), gnu make and a version of java installed (for the compiler). After making sure you have both installed you can just run make, at the root of the repo, and it will drop out of some box an executable which full name from the root the project will be bin/main.

Its one executable and not two. The professor can run just the executable providing no arguments and it will display what it expects to receive. But to sum it up if you run bin/main -i <filename> it will run the interpreter and print the results. If you run bin/main -c <filename> (this one really needs to executed in the root of the project as shown here) it will generate a jar file with the same base name as <filename> but ending with .jar. Which means if you run bin/main -c path/to/file.mml it will generate a file name file.jar on the current directory. (To run the jar file use java -jar <generated-jar-filename>). You can also run a repl but using bin/main repl and a life editor (which unfortunately only interprets and doesn't compile) by using bin/main editor and it will allow you to write code and when you hit CTRL+D it will run the interpreter of the text you provide.

Note: The compiler will use the jar file stored in the bin/ folder to run jasmine that one of the reasons you will need to run the executable from the root of the project when using the compiler. It will also compile the java files in the folder stdlib/ and add to the final jar.

When compiling one can also see the code jasmine assembler generated by the compiler by providing an environment variable named DEBUG, for example DEBUG=1 bin/main -c path/to/file will print on the terminal the code for each jasmine file generated.

Syntax & Ast

Here below we list the context-free grammar of our parser:

<program>    ::=  <decl> EOF
<decl>       ::=  <sequence> | <letBlock> | <whileExpr> | <ifExpr> | <funDecl>
<sequence>   ::= <assigment> (";" <sequence>)*
<assigment>  ::= <expr> (":=" <assigment>)?
<expr>       ::= <logicalOr>  ( "&&" <logicalOr> )*
<logicalOr>  ::= <comparison> ( "||" <comparison>)*
<comparison> ::= <term>  (( ">" | "<" | "==" | "!=" | ">=" | "<=" ) <term> )*
<term>       ::= <factor> (( "+" | "-" ) <term>  )*
<factor>     ::= <primary> (( "*" | "/" ) <factor> )*
<unary>      ::= ("-"|"~"|"!"|"new"|"print" | "println") <unary> | <call>
<call>       ::= <primary> (<primary>)*
<primary>    ::= "true" | "false" | Num | "()" | "(" <decl> ")" | ID 
                  
<letBlock>   ::= "let" ( Id (":"<type>)? "=" <decl> ";;" )+ "in" <decl> "end" 
<funDecl>    ::= "fun" <func-pars> "->" <decl> "end"
<ifExpr>     ::= "if" <expr> "then" <decl> ("else" <decl>)? "end"
<whileExpr>  ::= "while" <expr> "do" <decl> "end"
<func-pars>  ::= "()" | ( "(" Id ":" <type> ")" | Id )+

<type>       ::=  "int" | "bool" | "unit" | "ref" <type> | <type> "->" <type>

A few things are worth noting:

  • You will need to provide ;; at the end of each variable attribution this is due to what is explained in the next bullet point.
  • You don't need parenthesis neither commas to do function calls. Its much like you do in Haskell which means you can't do f -1 since function calls have the highest priority, it has to be f (-1) etc...
  • Function declaration is pretty much like you do in Ocaml
  • There no actual way to on the function declaration specify what it returns unless if you : after the variable name (that will hold the function)
  • Also as you might already see in the type definition (which is pretty much what the professor required) it doesn't make difference of types like int -> int -> bool -> bool and int -> (int -> bool) -> bool, which might seen to be the same but aren't since the latter expects a function as the 2nd argument which the first an integer. This is just a parser inconvenience, one can always do fun (x : int) (f : int -> bool) (b : bool) -> ... end and it will work as expected.

Ast

About the AST, we tried to be clever and spare as much as possible, not to endup with tons of nodes so the full extend of it might not be easy to understand. But in the beginning of of src/Types.hs you will find a list of all the official nodes. I don't know if you will even peek at the code, but I left this message just in case.

Extra features

As extra features we've done:

  • Type inference and polymorphism
  • Partial application of functions
  • We have some tests on the folder tests/ (the ones that have error on their name should fail on the type checker with the sole exception of tests/func-partial-hard.mml - explained below)
  • Our error messages show the line where it occur (it also shows the position but its not great since we started to do unifications ... One can be sure that the error is on the token that starts at where it points but not exactly at the position is pointing when it is reporting a typing error)

Regarding the partial application, it works as one would expect but due to technical problems and lack of time it doesn't support polymorphism. One example that illustrates that is the one found in the file tests/func-partial-hard.mml file:

let 
  print2 = fun x y -> println x ; println y end ;;
  print1 = print2 10 ;;
in
  print1 () ;
  print1 true
end

One would expect that it would work, but the truth is that it won't since the type of print1 becomes int -> unit -> unit after the line print1 (). This is due to let-polymorphism technique, that in this case would need to see which variables should still be free after doing print2 10. I don't know if was supposed to work or not, I didn't manage to get enough time to think about it.

Project organization

Fortunately or unfortunately all the program files are in the src/ dir which might confuse one that tries to see how the usage of each one. I thought in using subfolders to but delayed it and in the end couldn't do it due to time. But if I was to create a folder it would look something like this:

.
└── src
    ├── compiler
    │   ├── Compiler.hs
    │   └── Serializer.hs
    ├── core
    │   ├── Core.hs
    │   └── Types.hs
    ├── error
    │   └── Errors.hs
    ├── frontend
    │   ├── Parser.hs
    │   └── Scanner.hs
    ├── interpreter
    │   └── Interpreter.hs
    ├── Main.hs
    └── typecheck
        ├── TypeChecker.hs
        ├── Unification.hs
        └── UnionFind.hs

Hopefully this gives some insight into how everything is organized and what does what.

Standard Library & Decision

As the professor can see there is a folder named stdlib/ which contains some java classes and functions that is used on the run time by our compiler. For things like partial application, references and functions definition. They are included in the final jar that will also the rest of the program.

Polymorphism

In order to support this, we needed to use a uniform representation of functions, and so we did. All our functions implements the interface found in stdlib/Func.java which basically has the method apply that has the following signature:

Object apply(Object[] args);

This means that we do wrap and unwrapping of values to and from Object in order to support polymorphism.

Partial Application

In order to support partial application we made a function that can be found on stdlib/Utils.java that basically receives an instance of stdlib/Func.java and an array of Object and does some trick. Its pretty simple java code, and when an partial application is needed that function is called.

References

We also wanted to have a single references so all the references are actually object of the class found in stdlib/Ref.java which has a field name value that has the type Object. So we also do wrap and unwrapping when storing and restoring what the reference store.

Unit Type

To represent a unit we used the class found in stdlib/Unit.java which has the static final field named SINGLE this is loaded to the stack wherever the language produces a unit type.

Type Checker Detail

One small detail I wanted to point out is that although a branch of the sum type that makes the AstNode named PartialApplication. This one is not produced by the compiler but, later when doing the substitutions of the types in the end of typecheck function to produced the final TypedAst when a call is made with less than the number of arguments expected, that one is turned into a ParatialApplication node. Furthermore there was made a small change on the unification algorithm part found in the unify function in src/Unification.hs file. Which basically when is unifying two functions type if one has more types/expects more types than the other, it will basically match up until the last of the smaller and then unify the result of the smallest with a function that is the rest that wasn't matched. If this explanation sounds weird, just go and check it out. The code is pretty simple.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published