Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement tableswitch and flexible control flow #153

Closed
wants to merge 13 commits into from
40 changes: 15 additions & 25 deletions ml-proto/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -82,24 +82,11 @@ Note however that the REPL currently is too dumb to allow multi-line input. :)
See `wasm -h` for (the few) options.


## Language

For most part, the language understood by the interpreter is based on Ben's V8 prototype, but I took the liberty to try out a few simplifications and generalisations:

* *Expression Language.* There is no distinction between statements and expressions, everything is an expression. Some have an empty return type. Consequently, there is no need for a comma operator or ternary operator.

* *Simple Loops*. Like in Ben's prototype, there is only one sort of loop, the infinite one, which can only be terminated by an explicit `break`. In such a language, a `continue` statement actually is completely redundant, because it equivalent to a `break` to a label on the loop's *body*. So I dropped `continue`.

* *Break with Arguments.* In the spirit of a true expression language, `break` can carry arguments, which then become the result of the labelled expression it cuts to.

* *Switch with Explicit Fallthru*. By default, a switch arm is well-behaved in that it does *not* fall through to the next case. However, it can be marked as fallthru explicitly.


## Core Language vs External Language

The implementation tries to separate the concern of what is the language (and its semantics) from what is its external encoding. In that spirit, the actual AST is regular and minimal, while certain abbreviations are considered "syntactic sugar" of an external representation optimised for compactness.

For example, `if` always has an else-branch in the AST, but in the external format an else-less conditional is allowed as an abbreviation for one with `nop`. Similarly, blocks can sometimes be left implicit in sub-expressions. Furthermore, fallthru is a flag on each `switch` arm in the AST, but an explicit "opcode" in the external form.
For example, `if` always has an else-branch in the AST, but in the external format an else-less conditional is allowed as an abbreviation for one with `nop`. Similarly, blocks can sometimes be left implicit in sub-expressions.

Here, the external format is S-expressions, but similar considerations would apply to a binary encoding. That is, there would be codes for certain abbreviations, but these are just a matter of the encoding.

Expand Down Expand Up @@ -132,18 +119,19 @@ expr:
( nop )
( block <expr>+ )
( block <var> <expr>+ ) ;; = (label <var> (block <expr>+))
( if <expr> <expr> <expr> )
( if <expr> <expr> ) ;; = (if <expr> <expr> (nop))
( loop <expr>* ) ;; = (loop (block <expr>*))
( loop <var> <var>? <expr>* ) ;; = (label <var> (loop (block <var>? <expr>*)))
( if_else <expr> <expr> <expr> )
( if <expr> <expr> ) ;; = (if_else <expr> <expr> (nop))
( br_if <expr> <var> ) ;; = (if_else <expr> (br <var>) (nop))
( loop <var>? <expr>* ) ;; = (loop <var>? (block <expr>*))
( loop <var> <var> <expr>* ) ;; = (label <var> (loop <var> (block <expr>*)))
( label <var>? <expr> )
( break <var> <expr>? )
( <type>.switch <expr> <case>* <expr> )
( <type>.switch <var> <expr> <case>* <expr> ) ;; = (label <var> (<type>.switch <expr> <case>* <expr>))
( br <var> <expr>? )
( return <expr>? ) ;; = (br <current_depth> <expr>?)
( tableswitch <expr> <case>+ )
( tableswitch <var> <expr> <case>+ ) ;; = (label <var> (tableswitch <expr> <case>+))
( call <var> <expr>* )
( call_import <var> <expr>* )
( call_indirect <var> <expr> <expr>* )
( return <expr>? ) ;; = (break <current_depth> <expr>?)
( get_local <var> )
( set_local <var> <expr> )
( <type>.load((8|16)_<sign>)? <offset>? <align>? <expr> )
Expand All @@ -158,8 +146,10 @@ expr:
( grow_memory <expr> )

case:
( case <value> <expr>* fallthrough? ) ;; = (case <int> (block <expr>*) fallthrough?)
( case <value> ) ;; = (case <int> (nop) fallthrough)
( case <int> <expr>* ) ;; = (case <int> (block <expr>*))
( case_br <int> <var> ) ;; = (case <int> (br <var>))
( default <expr>* ) ;; = (default (block <expr>*))
( default_br <var> ) ;; = (default (br <var>))

func: ( func <name>? <type>? <param>* <result>? <local>* <expr>* )
type: ( type <var> )
Expand Down Expand Up @@ -225,7 +215,7 @@ The implementation consists of the following parts:

* *Validator* (`check.ml[i]`). Does a recursive walk of the AST, passing down the *expected* type for expressions (or rather, a list thereof, because of multi-values), and checking each expression against that. An expected empty list of types can be matched by any result, corresponding to implicit dropping of unused values (e.g. in a block).

* *Evaluator* (`eval.ml[i]`, `values.ml`, `arithmetic.ml[i]`, `memory.ml[i]`). Evaluation of control transfer (`break` and `return`) is implemented using local exceptions as "labels". While these are allocated dynamically in the code and addressed via a stack, that is merely to simplify the code. In reality, these would be static jumps.
* *Evaluator* (`eval.ml[i]`, `values.ml`, `arithmetic.ml[i]`, `memory.ml[i]`). Evaluation of control transfer (`br` and `return`) is implemented using local exceptions as "labels". While these are allocated dynamically in the code and addressed via a stack, that is merely to simplify the code. In reality, these would be static jumps.

* *Driver* (`main.ml`, `script.ml[i]`, `error.ml`, `print.ml[i]`, `flags.ml`). Executes scripts, reports results or errors, etc.

Expand Down
7 changes: 7 additions & 0 deletions ml-proto/given/lib.ml
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,13 @@ struct
let app f = function
| Some x -> f x
| None -> ()

let compare cmp_a o1 o2 =
match o1, o2 with
| None, None -> 0
| None, Some _ -> -1
| Some _, None -> 1
| Some x1, Some x2 -> cmp_a x1 x2
end

module Int =
Expand Down
1 change: 1 addition & 0 deletions ml-proto/given/lib.mli
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ sig
val get : 'a option -> 'a -> 'a
val map : ('a -> 'b) -> 'a option -> 'b option
val app : ('a -> unit) -> 'a option -> unit
val compare : ('a -> 'a -> int) -> 'a option -> 'a option -> int
end

module Int :
Expand Down
11 changes: 8 additions & 3 deletions ml-proto/host/lexer.mll
Original file line number Diff line number Diff line change
Expand Up @@ -134,16 +134,22 @@ rule token = parse
| "nop" { NOP }
| "block" { BLOCK }
| "if" { IF }
| "if_else" { IF_ELSE }
| "loop" { LOOP }
| "label" { LABEL }
| "break" { BREAK }
| "br" { BR }
| "tableswitch" { TABLESWITCH }
| "case" { CASE }
| "fallthrough" { FALLTHROUGH }
| "default" { DEFAULT }
| "call" { CALL }
| "call_import" { CALL_IMPORT }
| "call_indirect" { CALL_INDIRECT }
| "return" { RETURN }

| "br_if" { BR_IF }
| "case_br" { CASE_BR }
| "default_br" { DEFAULT_BR }

| "get_local" { GET_LOCAL }
| "set_local" { SET_LOCAL }

Expand All @@ -158,7 +164,6 @@ rule token = parse
| "offset="(digits as s) { OFFSET (Int64.of_string s) }
| "align="(digits as s) { ALIGN (int_of_string s) }

| (nxx as t)".switch" { SWITCH (value_type t) }
| (nxx as t)".const" { CONST (value_type t) }

| (ixx as t)".clz" { UNARY (intop t Int32Op.Clz Int64Op.Clz) }
Expand Down
55 changes: 33 additions & 22 deletions ml-proto/host/parser.mly
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,11 @@ let literal s t =
| Failure msg -> error s.at ("constant out of range: " ^ msg)
| _ -> error s.at "constant out of range"

let int32 s =
try I32.of_string s.it with
| Failure reason -> error s.at ("constant out of range: " ^ reason)
| _ -> error s.at "constant out of range"


(* Memory operands *)

Expand Down Expand Up @@ -153,7 +158,8 @@ let implicit_decl c t at =
%}

%token INT FLOAT TEXT VAR VALUE_TYPE LPAR RPAR
%token NOP BLOCK IF LOOP LABEL BREAK SWITCH CASE FALLTHROUGH
%token NOP BLOCK IF IF_ELSE LOOP LABEL BR TABLESWITCH CASE DEFAULT
%token BR_IF CASE_BR DEFAULT_BR
%token CALL CALL_IMPORT CALL_INDIRECT RETURN
%token GET_LOCAL SET_LOCAL LOAD STORE LOAD_EXTEND STORE_WRAP OFFSET ALIGN
%token CONST UNARY BINARY COMPARE CONVERT
Expand All @@ -169,7 +175,6 @@ let implicit_decl c t at =
%token<string> VAR
%token<Types.value_type> VALUE_TYPE
%token<Types.value_type> CONST
%token<Types.value_type> SWITCH
%token<Ast.unop> UNARY
%token<Ast.binop> BINARY
%token<Ast.relop> COMPARE
Expand Down Expand Up @@ -247,21 +252,23 @@ expr1 :
| NOP { fun c -> nop }
| BLOCK labeling expr expr_list
{ fun c -> let c', l = $2 c in block (l, $3 c' :: $4 c') }
| IF expr expr expr_opt { fun c -> if_ ($2 c, $3 c, $4 c) }
| IF_ELSE expr expr expr { fun c -> if_else ($2 c, $3 c, $4 c) }
| IF expr expr { fun c -> if_ ($2 c, $3 c) }
| BR_IF expr var { fun c -> br_if ($2 c, $3 c label) }
| LOOP labeling labeling expr_list
{ fun c -> let c', l1 = $2 c in let c'', l2 = $3 c' in
loop (l1, l2, $4 c'') }
let c''' = if l1.it = Unlabelled then anon_label c'' else c'' in
loop (l1, l2, $4 c''') }
| LABEL labeling expr
{ fun c -> let c', l = $2 c in
let c'' = if l.it = Unlabelled then anon_label c' else c' in
Sugar.label ($3 c'') }
| BREAK var expr_opt { fun c -> break ($2 c label, $3 c) }
| BR var expr_opt { fun c -> br ($2 c label, $3 c) }
| RETURN expr_opt
{ let at1 = ati 1 in
fun c -> return (label c ("return" @@ at1) @@ at1, $2 c) }
| SWITCH labeling expr cases
{ fun c -> let c', l = $2 c in let cs, e = $4 c' in
switch (l, $1, $3 c', List.map (fun a -> a $1) cs, e) }
| TABLESWITCH labeling expr case_list
{ fun c -> let c', l = $2 c in tableswitch (l, $3 c', $4 c') }
| CALL var expr_list { fun c -> call ($2 c func, $3 c) }
| CALL_IMPORT var expr_list { fun c -> call_import ($2 c import, $3 c) }
| CALL_INDIRECT var expr expr_list
Expand Down Expand Up @@ -295,23 +302,27 @@ expr_list :
| expr expr_list { fun c -> $1 c :: $2 c }
;

fallthrough :
| /* empty */ { false }
| FALLTHROUGH { true }
;

case :
| LPAR case1 RPAR { let at = at () in fun c t -> $2 c t @@ at }
| LPAR case1 RPAR { let at = at () in fun c -> $2 c @@ at }
;
casebr :
| LPAR casebr1 RPAR { let at = at () in fun c -> $2 c @@ at }
;
casebr1 :
| CASE_BR literal var { fun c -> case_br (int32 $2, $3 c label) }
| DEFAULT_BR var { fun c -> default_br ($2 c label) }
;
case1 :
| CASE literal expr expr_list fallthrough
{ fun c t -> case (literal $2 t, Some ($3 c :: $4 c, $5)) }
| CASE literal
{ fun c t -> case (literal $2 t, None) }
;
cases :
| expr { fun c -> [], $1 c }
| case cases { fun c -> let x, y = $2 c in $1 c :: x, y }
| CASE literal expr_list { fun c -> case (int32 $2, $3 c) }
| DEFAULT expr_list { fun c -> default ($2 c) }
;
case_list :
| case_list1 { $1 }
| casebr case_list { fun c -> $1 c :: $2 c }
;
case_list1 :
| /* empty */ { fun c -> [] }
| case case_list1 { fun c -> $1 c :: $2 c }
;


Expand Down
47 changes: 23 additions & 24 deletions ml-proto/spec/ast.ml
Original file line number Diff line number Diff line change
Expand Up @@ -79,35 +79,34 @@ type literal = value Source.phrase

type expr = expr' Source.phrase
and expr' =
| Nop (* do nothing *)
| Block of expr list (* execute in sequence *)
| If of expr * expr * expr (* conditional *)
| Loop of expr (* infinite loop *)
| Label of expr (* labelled expression *)
| Break of var * expr option (* break to n-th surrounding label *)
| Switch of value_type * expr * case list * expr (* switch, latter expr is default *)
| Call of var * expr list (* call function *)
| CallImport of var * expr list (* call imported function *)
| CallIndirect of var * expr * expr list (* call function through table *)
| GetLocal of var (* read local variable *)
| SetLocal of var * expr (* write local variable *)
| Load of memop * expr (* read memory at address *)
| Store of memop * expr * expr (* write memory at address *)
| LoadExtend of extop * expr (* read memory at address and extend *)
| StoreWrap of wrapop * expr * expr (* wrap and write to memory at address *)
| Const of literal (* constant *)
| Unary of unop * expr (* unary arithmetic operator *)
| Binary of binop * expr * expr (* binary arithmetic operator *)
| Compare of relop * expr * expr (* arithmetic comparison *)
| Convert of cvt * expr (* conversion *)
| Host of hostop * expr list (* host interaction *)
| Nop (* do nothing *)
| Block of expr list (* execute in sequence *)
| If of expr * expr * expr (* conditional *)
| Loop of expr (* infinite loop *)
| Label of expr (* labelled expression *)
| Branch of var * expr option (* branch to n-th surrounding label *)
| Switch of expr * case list (* table switch *)
| Call of var * expr list (* call function *)
| CallImport of var * expr list (* call imported function *)
| CallIndirect of var * expr * expr list (* call function through table *)
| GetLocal of var (* read local variable *)
| SetLocal of var * expr (* write local variable *)
| Load of memop * expr (* read memory at address *)
| Store of memop * expr * expr (* write memory at address *)
| LoadExtend of extop * expr (* read memory at address and extend *)
| StoreWrap of wrapop * expr * expr (* wrap and write to memory at address *)
| Const of literal (* constant *)
| Unary of unop * expr (* unary arithmetic operator *)
| Binary of binop * expr * expr (* binary arithmetic operator *)
| Compare of relop * expr * expr (* arithmetic comparison *)
| Convert of cvt * expr (* conversion *)
| Host of hostop * expr list (* host interaction *)

and case = case' Source.phrase
and case' =
{
value : literal;
value : I32.t option;
expr : expr;
fallthru : bool
}


Expand Down
33 changes: 21 additions & 12 deletions ml-proto/spec/check.ml
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,12 @@ let import c x = lookup "import" c.imports x
let local c x = lookup "local" c.locals x
let label c x = lookup "label" c.labels x

module CaseSet = Set.Make(
struct
type t = I32.t option
let compare = Lib.Option.compare I32.compare_u
end)


(* Type comparison *)

Expand Down Expand Up @@ -132,21 +138,23 @@ let rec check_expr c et e =
check_expr c et e3

| Loop e1 ->
check_expr c None e1
let c' = {c with labels = None :: c.labels} in
check_expr c' et e1

| Label e1 ->
let c' = {c with labels = et :: c.labels} in
check_expr c' et e1

| Break (x, eo) ->
| Branch (x, eo) ->
check_expr_opt c (label c x) eo e.at

| Switch (t, e1, cs, e2) ->
require (t = Int32Type || t = Int64Type) e.at "invalid switch type";
(* TODO: Check that cases are unique. *)
check_expr c (Some t) e1;
List.iter (check_case c t et) cs;
check_expr c et e2
| Switch (e1, cs) ->
check_expr c (Some Int32Type) e1;
let cc, _ = List.fold_right (check_case c) cs (CaseSet.empty, et) in
require (CaseSet.mem None cc) e.at "switch is missing default case";
let max = CaseSet.max_elt cc in
require (max = None || max = Some (I32.of_int (CaseSet.cardinal cc - 2)))
e.at "switch is not dense"

| Call (x, es) ->
let {ins; out} = func c x in
Expand Down Expand Up @@ -231,10 +239,11 @@ and check_expr_opt c et eo at =
and check_literal c et l =
check_type (Some (type_value l.it)) et l.at

and check_case c t et case =
let {value = l; expr = e; fallthru} = case.it in
check_literal c (Some t) l;
check_expr c (if fallthru then None else et) e
and check_case c case (cc, et) =
let {value = l; expr = e} = case.it in
require (not (CaseSet.mem l cc)) case.at "duplicate case";
check_expr c et e;
(CaseSet.add l cc, None)

and check_load c et memop e1 at =
check_has_memory c at;
Expand Down
Loading