From ffc5f1ccd8b0c7ca07414c324729ecac97f47e6a Mon Sep 17 00:00:00 2001 From: Ignacio Corderi Date: Wed, 26 Nov 2014 16:40:56 -0800 Subject: [PATCH 1/2] Added src/doc/grammar.md to hold Rust grammar --- src/doc/grammar.md | 1514 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 1514 insertions(+) create mode 100644 src/doc/grammar.md diff --git a/src/doc/grammar.md b/src/doc/grammar.md new file mode 100644 index 0000000000000..6b700210201d4 --- /dev/null +++ b/src/doc/grammar.md @@ -0,0 +1,1514 @@ +# **This is a work in progress** + +% The Rust Grammar + +# Introduction + +This document is the primary reference for the Rust programming language grammar. It +provides only one kind of material: + + - Chapters that formally define the language grammar and, for each + construct. + +This document does not serve as an introduction to the language. Background +familiarity with the language is assumed. A separate [guide] is available to +help acquire such background familiarity. + +This document also does not serve as a reference to the [standard] library +included in the language distribution. Those libraries are documented +separately by extracting documentation attributes from their source code. Many +of the features that one might expect to be language features are library +features in Rust, so what you're looking for may be there, not here. + +[guide]: guide.html +[standard]: std/index.html + +# Notation + +Rust's grammar is defined over Unicode codepoints, each conventionally denoted +`U+XXXX`, for 4 or more hexadecimal digits `X`. _Most_ of Rust's grammar is +confined to the ASCII range of Unicode, and is described in this document by a +dialect of Extended Backus-Naur Form (EBNF), specifically a dialect of EBNF +supported by common automated LL(k) parsing tools such as `llgen`, rather than +the dialect given in ISO 14977. 
The dialect can be defined self-referentially +as follows: + +```antlr +grammar : rule + ; +rule : nonterminal ':' productionrule ';' ; +productionrule : production [ '|' production ] * ; +production : term * ; +term : element repeats ; +element : LITERAL | IDENTIFIER | '[' productionrule ']' ; +repeats : [ '*' | '+' ] NUMBER ? | NUMBER ? | '?' ; +``` + +Where: + +- Whitespace in the grammar is ignored. +- Square brackets are used to group rules. +- `LITERAL` is a single printable ASCII character, or an escaped hexadecimal + ASCII code of the form `\xQQ`, in single quotes, denoting the corresponding + Unicode codepoint `U+00QQ`. +- `IDENTIFIER` is a nonempty string of ASCII letters and underscores. +- The `repeat` forms apply to the adjacent `element`, and are as follows: + - `?` means zero or one repetition + - `*` means zero or more repetitions + - `+` means one or more repetitions + - NUMBER trailing a repeat symbol gives a maximum repetition count + - NUMBER on its own gives an exact repetition count + +This EBNF dialect should hopefully be familiar to many readers. + +## Unicode productions + +A few productions in Rust's grammar permit Unicode codepoints outside the ASCII +range. We define these productions in terms of character properties specified +in the Unicode standard, rather than in terms of ASCII-range codepoints. The +section [Special Unicode Productions](#special-unicode-productions) lists these +productions. + +## String table productions + +Some rules in the grammar — notably [unary +operators](#unary-operator-expressions), [binary +operators](#binary-operator-expressions), and [keywords](#keywords) — are +given in a simplified form: as a listing of a table of unquoted, printable +whitespace-separated strings. These cases form a subset of the rules regarding +the [token](#tokens) rule, and are assumed to be the result of a +lexical-analysis phase feeding the parser, driven by a DFA, operating over the +disjunction of all such string table entries. 
+ +When such a string enclosed in double-quotes (`"`) occurs inside the grammar, +it is an implicit reference to a single member of such a string table +production. See [tokens](#tokens) for more information. + +# Lexical structure + +## Input format + +Rust input is interpreted as a sequence of Unicode codepoints encoded in UTF-8. +Most Rust grammar rules are defined in terms of printable ASCII-range +codepoints, but a small number are defined in terms of Unicode properties or +explicit codepoint lists. [^inputformat] + +[^inputformat]: Substitute definitions for the special Unicode productions are + provided to the grammar verifier, restricted to ASCII range, when verifying the + grammar in this document. + +## Special Unicode Productions + +The following productions in the Rust grammar are defined in terms of Unicode +properties: `ident`, `non_null`, `non_star`, `non_eol`, `non_slash_or_star`, +`non_single_quote` and `non_double_quote`. + +### Identifiers + +The `ident` production is any nonempty Unicode string of the following form: + +- The first character has property `XID_start` +- The remaining characters have property `XID_continue` + +that does _not_ occur in the set of [keywords](#keywords). + +> **Note**: `XID_start` and `XID_continue` as character properties cover the +> character ranges used to form the more familiar C and Java language-family +> identifiers. 
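As a rough illustration of the `ident` rule, the check below restricts both character properties to ASCII, in the spirit of the substitute definitions mentioned in the input-format footnote. It is only a sketch in present-day Rust, not the verifier's actual definition: the name `is_ascii_ident` is hypothetical, `XID_start` is approximated by ASCII letters, and `XID_continue` by ASCII letters, digits and `_`. The keyword list is copied from the table in the Keywords section.

```rust
/// Keywords listed in the Keywords section of this document.
const KEYWORDS: &[&str] = &[
    "abstract", "alignof", "as", "be", "box", "break", "const", "continue",
    "crate", "do", "else", "enum", "extern", "false", "final", "fn", "for",
    "if", "impl", "in", "let", "loop", "match", "mod", "move", "mut",
    "offsetof", "once", "override", "priv", "proc", "pub", "pure", "ref",
    "return", "sizeof", "static", "self", "struct", "super", "true", "trait",
    "type", "typeof", "unsafe", "unsized", "use", "virtual", "where", "while",
    "yield",
];

/// ASCII-only approximation of the `ident` production (an assumption of this
/// sketch): the first character must be a letter, the rest letters, digits or
/// `_`, and the whole string must not be a keyword.
fn is_ascii_ident(s: &str) -> bool {
    let mut chars = s.chars();
    let first_ok = match chars.next() {
        Some(c) => c.is_ascii_alphabetic(), // stand-in for `XID_start`
        None => false,                      // `ident` must be nonempty
    };
    first_ok
        && chars.all(|c| c.is_ascii_alphanumeric() || c == '_') // `XID_continue`
        && !KEYWORDS.contains(&s)
}

fn main() {
    for candidate in ["foo_bar1", "match", "1abc", "_x"] {
        println!("{:?} -> {}", candidate, is_ascii_ident(candidate));
    }
}
```

Read literally, the rule rejects a leading underscore, because `_` has `XID_continue` but not `XID_start`.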
+ +### Delimiter-restricted productions + +Some productions are defined by exclusion of particular Unicode characters: + +- `non_null` is any single Unicode character aside from `U+0000` (null) +- `non_eol` is `non_null` restricted to exclude `U+000A` (`'\n'`) +- `non_star` is `non_null` restricted to exclude `U+002A` (`*`) +- `non_slash_or_star` is `non_null` restricted to exclude `U+002F` (`/`) and `U+002A` (`*`) +- `non_single_quote` is `non_null` restricted to exclude `U+0027` (`'`) +- `non_double_quote` is `non_null` restricted to exclude `U+0022` (`"`) + +## Comments + +```antlr +comment : block_comment | line_comment ; +block_comment : "/*" block_comment_body * "*/" ; +block_comment_body : [block_comment | character] * ; +line_comment : "//" non_eol * ; +``` + +**FIXME:** add doc grammar? + +## Whitespace + +```antlr +whitespace_char : '\x20' | '\x09' | '\x0a' | '\x0d' ; +whitespace : [ whitespace_char | comment ] + ; +``` + +## Tokens + +```antlr +simple_token : keyword | unop | binop ; +token : simple_token | ident | literal | symbol | whitespace token ; +``` + +### Keywords + +

+ +| | | | | | +|----------|----------|----------|----------|--------| +| abstract | alignof | as | be | box | +| break | const | continue | crate | do | +| else | enum | extern | false | final | +| fn | for | if | impl | in | +| let | loop | match | mod | move | +| mut | offsetof | once | override | priv | +| proc | pub | pure | ref | return | +| sizeof | static | self | struct | super | +| true | trait | type | typeof | unsafe | +| unsized | use | virtual | where | while | +| yield | | | | | + + +Each of these keywords has special meaning in the grammar, and all of them are +excluded from the `ident` rule. + +### Literals + +```antlr +lit_suffix : ident; +literal : [ string_lit | char_lit | byte_string_lit | byte_lit | num_lit ] lit_suffix ?; +``` + +#### Character and string literals + +```antlr +char_lit : '\x27' char_body '\x27' ; +string_lit : '"' string_body * '"' | 'r' raw_string ; + +char_body : non_single_quote + | '\x5c' [ '\x27' | common_escape | unicode_escape ] ; + +string_body : non_double_quote + | '\x5c' [ '\x22' | common_escape | unicode_escape ] ; +raw_string : '"' raw_string_body '"' | '#' raw_string '#' ; + +common_escape : '\x5c' + | 'n' | 'r' | 't' | '0' + | 'x' hex_digit 2 ; +unicode_escape : 'u' hex_digit 4 + | 'U' hex_digit 8 ; + +hex_digit : 'a' | 'b' | 'c' | 'd' | 'e' | 'f' + | 'A' | 'B' | 'C' | 'D' | 'E' | 'F' + | dec_digit ; +oct_digit : '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' ; +dec_digit : '0' | nonzero_dec ; +nonzero_dec: '1' | '2' | '3' | '4' + | '5' | '6' | '7' | '8' | '9' ; +``` + +#### Byte and byte string literals + +```antlr +byte_lit : "b\x27" byte_body '\x27' ; +byte_string_lit : "b\x22" byte_string_body * '\x22' | "br" raw_byte_string ; + +byte_body : ascii_non_single_quote + | '\x5c' [ '\x27' | common_escape ] ; + +byte_string_body : ascii_non_double_quote + | '\x5c' [ '\x22' | common_escape ] ; +raw_byte_string : '"' raw_byte_string_body '"' | '#' raw_byte_string '#' ; + +``` + +#### Number literals + +```antlr +num_lit : 
nonzero_dec [ dec_digit | '_' ] * float_suffix ? + | '0' [ [ dec_digit | '_' ] * float_suffix ? + | 'b' [ '1' | '0' | '_' ] + + | 'o' [ oct_digit | '_' ] + + | 'x' [ hex_digit | '_' ] + ] ; + +float_suffix : [ exponent | '.' dec_lit exponent ? ] ? ; + +exponent : ['E' | 'e'] ['-' | '+' ] ? dec_lit ; +dec_lit : [ dec_digit | '_' ] + ; +``` + +#### Boolean literals + +**FIXME:** write grammar + +The two values of the boolean type are written `true` and `false`. + +### Symbols + +```antlr +symbol : "::" | "->" + | '#' | '[' | ']' | '(' | ')' | '{' | '}' + | ',' | ';' ; +``` + +Symbols are a general class of printable [token](#tokens) that play structural +roles in a variety of grammar productions. They are catalogued here for +completeness as the set of remaining miscellaneous printable tokens that do not +otherwise appear as [unary operators](#unary-operator-expressions), [binary +operators](#binary-operator-expressions), or [keywords](#keywords). + +## Paths + +```antlr +expr_path : [ "::" ] ident [ "::" expr_path_tail ] + ; +expr_path_tail : '<' type_expr [ ',' type_expr ] + '>' + | expr_path ; + +type_path : ident [ type_path_tail ] + ; +type_path_tail : '<' type_expr [ ',' type_expr ] + '>' + | "::" type_path ; +``` + +# Syntax extensions + +## Macros + +```antlr +expr_macro_rules : "macro_rules" '!' ident '(' macro_rule * ')' ; +macro_rule : '(' matcher * ')' "=>" '(' transcriber * ')' ';' ; +matcher : '(' matcher * ')' | '[' matcher * ']' + | '{' matcher * '}' | '$' ident ':' ident + | '$' '(' matcher * ')' sep_token? [ '*' | '+' ] + | non_special_token ; +transcriber : '(' transcriber * ')' | '[' transcriber * ']' + | '{' transcriber * '}' | '$' ident + | '$' '(' transcriber * ')' sep_token? [ '*' | '+' ] + | non_special_token ; +``` + +# Crates and source files + +**FIXME:** grammar? What production covers #![crate_id = "foo"] ? + +# Items and attributes + +**FIXME:** grammar? 
+ +## Items + +```antlr +item : mod_item | fn_item | type_item | struct_item | enum_item + | static_item | trait_item | impl_item | extern_block ; +``` + +### Type Parameters + +**FIXME:** grammar? + +### Modules + +```antlr +mod_item : "mod" ident ( ';' | '{' mod '}' ); +mod : [ view_item | item ] * ; +``` + +#### View items + +```antlr +view_item : extern_crate_decl | use_decl ; +``` + +##### Extern crate declarations + +```antlr +extern_crate_decl : "extern" "crate" crate_name +crate_name: ident | ( string_lit as ident ) +``` + +##### Use declarations + +```antlr +use_decl : "pub" ? "use" [ path "as" ident + | path_glob ] ; + +path_glob : ident [ "::" [ path_glob + | '*' ] ] ? + | '{' path_item [ ',' path_item ] * '}' ; + +path_item : ident | "mod" ; +``` + +### Functions + +**FIXME:** grammar? + +#### Generic functions + +**FIXME:** grammar? + +#### Unsafety + +**FIXME:** grammar? + +##### Unsafe functions + +**FIXME:** grammar? + +##### Unsafe blocks + +**FIXME:** grammar? + +#### Diverging functions + +**FIXME:** grammar? + +### Type definitions + +**FIXME:** grammar? + +### Structures + +**FIXME:** grammar? + +### Constant items + +```antlr +const_item : "const" ident ':' type '=' expr ';' ; +``` + +### Static items + +```antlr +static_item : "static" ident ':' type '=' expr ';' ; +``` + +#### Mutable statics + +**FIXME:** grammar? + +### Traits + +**FIXME:** grammar? + +### Implementations + +**FIXME:** grammar? + +### External blocks + +```antlr +extern_block_item : "extern" '{' extern_block '}' ; +extern_block : [ foreign_fn ] * ; +``` + +## Visibility and Privacy + +**FIXME:** grammar? + +### Re-exporting and Visibility + +**FIXME:** grammar? + +## Attributes + +```antlr +attribute : "#!" ? '[' meta_item ']' ; +meta_item : ident [ '=' literal + | '(' meta_seq ')' ] ? ; +meta_seq : meta_item [ ',' meta_seq ] ? ; +``` + +# Statements and expressions + +## Statements + +**FIXME:** grammar? + +### Declaration statements + +**FIXME:** grammar? 
+ +A _declaration statement_ is one that introduces one or more *names* into the +enclosing statement block. The declared names may denote new slots or new +items. + +#### Item declarations + +**FIXME:** grammar? + +An _item declaration statement_ has a syntactic form identical to an +[item](#items) declaration within a module. Declaring an item — a +function, enumeration, structure, type, static, trait, implementation or module +— locally within a statement block is simply a way of restricting its +scope to a narrow region containing all of its uses; it is otherwise identical +in meaning to declaring the item outside the statement block. + +#### Slot declarations + +```antlr +let_decl : "let" pat [':' type ] ? [ init ] ? ';' ; +init : [ '=' ] expr ; +``` + +### Expression statements + +**FIXME:** grammar? + +## Expressions + +**FIXME:** grammar? + +#### Lvalues, rvalues and temporaries + +**FIXME:** grammar? + +#### Moved and copied types + +**FIXME:** Do we want to capture this in the grammar as different productions? + +### Literal expressions + +**FIXME:** grammar? + +### Path expressions + +**FIXME:** grammar? + +### Tuple expressions + +**FIXME:** grammar? + +### Unit expressions + +**FIXME:** grammar? + +### Structure expressions + +```antlr +struct_expr : expr_path '{' ident ':' expr + [ ',' ident ':' expr ] * + [ ".." expr ] '}' | + expr_path '(' expr + [ ',' expr ] * ')' | + expr_path ; +``` + +### Block expressions + +```antlr +block_expr : '{' [ view_item ] * + [ stmt ';' | item ] * + [ expr ] '}' ; +``` + +### Method-call expressions + +```antlr +method_call_expr : expr '.' ident paren_expr_list ; +``` + +### Field expressions + +```antlr +field_expr : expr '.' ident ; +``` + +### Array expressions + +```antlr +array_expr : '[' "mut" ? vec_elems? ']' ; + +array_elems : [expr [',' expr]*] | [expr ',' ".." expr] ; +``` + +### Index expressions + +```antlr +idx_expr : expr '[' expr ']' ; +``` + +### Unary operator expressions + +**FIXME:** grammar? 
+ +### Binary operator expressions + +```antlr +binop_expr : expr binop expr ; +``` + +#### Arithmetic operators + +**FIXME:** grammar? + +#### Bitwise operators + +**FIXME:** grammar? + +#### Lazy boolean operators + +**FIXME:** grammar? + +#### Comparison operators + +**FIXME:** grammar? + +#### Type cast expressions + +**FIXME:** grammar? + +#### Assignment expressions + +**FIXME:** grammar? + +#### Compound assignment expressions + +**FIXME:** grammar? + +#### Operator precedence + +The precedence of Rust binary operators is ordered as follows, going from +strong to weak: + +``` +* / % +as ++ - +<< >> +& +^ +| +< > <= >= +== != +&& +|| += +``` + +Operators at the same precedence level are evaluated left-to-right. [Unary +operators](#unary-operator-expressions) have the same precedence level and it +is stronger than any of the binary operators'. + +### Grouped expressions + +```antlr +paren_expr : '(' expr ')' ; +``` + +### Call expressions + +```antlr +expr_list : [ expr [ ',' expr ]* ] ? ; +paren_expr_list : '(' expr_list ')' ; +call_expr : expr paren_expr_list ; +``` + +### Lambda expressions + +```antlr +ident_list : [ ident [ ',' ident ]* ] ? ; +lambda_expr : '|' ident_list '|' expr ; +``` + +### While loops + +```antlr +while_expr : "while" no_struct_literal_expr '{' block '}' ; +``` + +### Infinite loops + +```antlr +loop_expr : [ lifetime ':' ] "loop" '{' block '}'; +``` + +### Break expressions + +```antlr +break_expr : "break" [ lifetime ]; +``` + +### Continue expressions + +```antlr +continue_expr : "continue" [ lifetime ]; +``` + +### For expressions + +```antlr +for_expr : "for" pat "in" no_struct_literal_expr '{' block '}' ; +``` + +### If expressions + +```antlr +if_expr : "if" no_struct_literal_expr '{' block '}' + else_tail ? 
; + +else_tail : "else" [ if_expr | if_let_expr + | '{' block '}' ] ; +``` + +### Match expressions + +```antlr +match_expr : "match" no_struct_literal_expr '{' match_arm * '}' ; + +match_arm : attribute * match_pat "=>" [ expr "," | '{' block '}' ] ; + +match_pat : pat [ '|' pat ] * [ "if" expr ] ? ; +``` + +### If let expressions + +```antlr +if_let_expr : "if" "let" pat '=' expr '{' block '}' + else_tail ? ; +else_tail : "else" [ if_expr | if_let_expr | '{' block '}' ] ; +``` + +### While let loops + +```antlr +while_let_expr : "while" "let" pat '=' expr '{' block '}' ; +``` + +### Return expressions + +```antlr +return_expr : "return" expr ? ; +``` + +# Type system + +## Types + +Every slot, item and value in a Rust program has a type. The _type_ of a +*value* defines the interpretation of the memory holding it. + +Built-in types and type-constructors are tightly integrated into the language, +in nontrivial ways that are not possible to emulate in user-defined types. +User-defined types have limited capabilities. + +### Primitive types + +The primitive types are the following: + +* The "unit" type `()`, having the single "unit" value `()` (occasionally called + "nil"). [^unittype] +* The boolean type `bool` with values `true` and `false`. +* The machine types. +* The machine-dependent integer and floating-point types. + +[^unittype]: The "unit" value `()` is *not* a sentinel "null pointer" value for + reference slots; the "unit" type is the implicit return type from functions + otherwise lacking a return type, and can be used in other contexts (such as + message-sending or type-parametric code) as a zero-size type.] + +#### Machine types + +The machine types are the following: + +* The unsigned word types `u8`, `u16`, `u32` and `u64`, with values drawn from + the integer intervals [0, 2^8 - 1], [0, 2^16 - 1], [0, 2^32 - 1] and + [0, 2^64 - 1] respectively. 
+ +* The signed two's complement word types `i8`, `i16`, `i32` and `i64`, with + values drawn from the integer intervals [-(2^(7)), 2^7 - 1], + [-(2^(15)), 2^15 - 1], [-(2^(31)), 2^31 - 1], [-(2^(63)), 2^63 - 1] + respectively. + +* The IEEE 754-2008 `binary32` and `binary64` floating-point types: `f32` and + `f64`, respectively. + +#### Machine-dependent integer types + +The `uint` type is an unsigned integer type with the same number of bits as the +platform's pointer type. It can represent every memory address in the process. + +The `int` type is a signed integer type with the same number of bits as the +platform's pointer type. The theoretical upper bound on object and array size +is the maximum `int` value. This ensures that `int` can be used to calculate +differences between pointers into an object or array and can address every byte +within an object along with one byte past the end. + +### Textual types + +The types `char` and `str` hold textual data. + +A value of type `char` is a [Unicode scalar value]( +http://www.unicode.org/glossary/#unicode_scalar_value) (i.e., a code point that +is not a surrogate), represented as a 32-bit unsigned word in the 0x0000 to +0xD7FF or 0xE000 to 0x10FFFF range. A `[char]` array is effectively a UCS-4 / +UTF-32 string. + +A value of type `str` is a Unicode string, represented as an array of 8-bit +unsigned bytes holding a sequence of UTF-8 encoded code points. Since `str` is of +unknown size, it is not a _first class_ type, but can only be instantiated +through a pointer type, such as `&str` or `String`. + +### Tuple types + +A tuple *type* is a heterogeneous product of other types, called the *elements* +of the tuple. It has no nominal name and is instead structurally typed. + +Tuple types and values are denoted by listing the types or values of their +elements, respectively, in a parenthesized, comma-separated list. + +Because tuple elements don't have a name, they can only be accessed by +pattern-matching. 
+ +The members of a tuple are laid out in memory contiguously, in the order specified +by the tuple type. + +An example of a tuple type and its use: + +``` +type Pair<'a> = (int, &'a str); +let p: Pair<'static> = (10, "hello"); +let (a, b) = p; +assert!(b != "world"); +``` + +### Array, and Slice types + +Rust has two different types for a list of items: + +* `[T, ..N]`, an 'array' +* `&[T]`, a 'slice'. + +An array has a fixed size, and can be allocated on either the stack or the +heap. + +A slice is a 'view' into an array. It doesn't own the data it points +to; it borrows it. + +An example of each kind: + +```{rust} +let vec: Vec<int> = vec![1, 2, 3]; +let arr: [int, ..3] = [1, 2, 3]; +let s: &[int] = vec.as_slice(); +``` + +As you can see, the `vec!` macro allows you to create a `Vec<T>` easily. The +`vec!` macro is also part of the standard library, rather than the language. + +All in-bounds elements of arrays and slices are always initialized, and access +to an array or slice is always bounds-checked. + +### Structure types + +A `struct` *type* is a heterogeneous product of other types, called the +*fields* of the type.[^structtype] + +[^structtype]: `struct` types are analogous to `struct` types in C, + the *record* types of the ML family, + or the *structure* types of the Lisp family. + +New instances of a `struct` can be constructed with a [struct +expression](#structure-expressions). + +The memory layout of a `struct` is undefined by default to allow for compiler +optimizations like field reordering, but it can be fixed with the +`#[repr(...)]` attribute. In either case, fields may be given in any order in +a corresponding struct *expression*; the resulting `struct` value will always +have the same memory layout. + +The fields of a `struct` may be qualified by [visibility +modifiers](#re-exporting-and-visibility), to allow access to data in a +structure outside a module. + +A _tuple struct_ type is just like a structure type, except that the fields are +anonymous. 
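The points above about struct expressions and tuple structs can be sketched in present-day Rust as follows. This is an illustration under assumptions, not part of the grammar: `Point` and `Pair` are hypothetical types, and `i32` stands in for the `int` type used elsewhere in this document.

```rust
/// A structure type with two named fields.
#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// A tuple struct: the same idea, but the fields are anonymous.
#[derive(Debug, PartialEq)]
struct Pair(i32, i32);

/// Builds the same value twice, listing the fields in different orders.
fn make_points() -> (Point, Point) {
    let a = Point { x: 1, y: 2 };
    let b = Point { y: 2, x: 1 }; // field order in the expression is free
    (a, b)
}

/// Reads the anonymous fields of a tuple struct by pattern matching.
fn pair_sum(p: &Pair) -> i32 {
    let &Pair(first, second) = p;
    first + second
}

fn main() {
    let (a, b) = make_points();
    println!("{:?} == {:?}: {}", a, b, a == b);
    println!("sum = {}", pair_sum(&Pair(3, 4)));
}
```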
+ +A _unit-like struct_ type is like a structure type, except that it has no +fields. The one value constructed by the associated [structure +expression](#structure-expressions) is the only value that inhabits such a +type. + +### Enumerated types + +An *enumerated type* is a nominal, heterogeneous disjoint union type, denoted +by the name of an [`enum` item](#enumerations). [^enumtype] + +[^enumtype]: The `enum` type is analogous to a `data` constructor declaration in + ML, or a *pick ADT* in Limbo. + +An [`enum` item](#enumerations) declares both the type and a number of *variant +constructors*, each of which is independently named and takes an optional tuple +of arguments. + +New instances of an `enum` can be constructed by calling one of the variant +constructors, in a [call expression](#call-expressions). + +Any `enum` value consumes as much memory as the largest variant constructor for +its corresponding `enum` type. + +Enum types cannot be denoted *structurally* as types, but must be denoted by +named reference to an [`enum` item](#enumerations). + +### Recursive types + +Nominal types — [enumerations](#enumerated-types) and +[structures](#structure-types) — may be recursive. That is, each `enum` +constructor or `struct` field may refer, directly or indirectly, to the +enclosing `enum` or `struct` type itself. Such recursion has restrictions: + +* Recursive types must include a nominal type in the recursion + (not mere [type definitions](#type-definitions), + or other structural types such as [arrays](#array,-and-slice-types) or [tuples](#tuple-types)). +* A recursive `enum` item must have at least one non-recursive constructor + (in order to give the recursion a basis case). +* The size of a recursive type must be finite; + in other words the recursive fields of the type must be [pointer types](#pointer-types). 
+* Recursive type definitions can cross module boundaries, but not module *visibility* boundaries, + or crate boundaries (in order to simplify the module system and type checker). + +An example of a *recursive* type and its use: + +``` +enum List<T> { + Nil, + Cons(T, Box<List<T>>) +} + +let a: List<int> = List::Cons(7, box List::Cons(13, box List::Nil)); +``` + +### Pointer types + +All pointers in Rust are explicit first-class values. They can be copied, +stored into data structures, and returned from functions. There are two +varieties of pointer in Rust: + +* References (`&`) + : These point to memory _owned by some other value_. + A reference type is written `&type`, + or `&'a type` for some lifetime variable `'a` when you need an explicit lifetime. + Copying a reference is a "shallow" operation: + it involves only copying the pointer itself. + Releasing a reference typically has no effect on the value it points to, + with the exception of temporary values, which are released when the last + reference to them is released. + +* Raw pointers (`*`) + : Raw pointers are pointers without safety or liveness guarantees. + Raw pointers are written as `*const T` or `*mut T`; + for example, `*const int` means a raw pointer to an integer. + Copying or dropping a raw pointer has no effect on the lifecycle of any + other value. Dereferencing a raw pointer or converting it to any other + pointer type is an [`unsafe` operation](#unsafe-functions). + Raw pointers are generally discouraged in Rust code; + they exist to support interoperability with foreign code, + and writing performance-critical or low-level functions. + +The standard library contains additional 'smart pointer' types beyond references +and raw pointers. + +### Function types + +The function type constructor `fn` forms new function types. A function type +consists of a possibly-empty set of function-type modifiers (such as `unsafe` +or `extern`), a sequence of input types and an output type. 
+ +An example of a `fn` type: + +``` +fn add(x: int, y: int) -> int { + return x + y; +} + +let mut x = add(5,7); + +type Binop<'a> = |int,int|: 'a -> int; +let bo: Binop = add; +x = bo(5,7); +``` + +### Closure types + +```{.ebnf .notation} +closure_type := [ 'unsafe' ] [ '<' lifetime-list '>' ] '|' arg-list '|' + [ ':' bound-list ] [ '->' type ] +procedure_type := 'proc' [ '<' lifetime-list '>' ] '(' arg-list ')' + [ ':' bound-list ] [ '->' type ] +lifetime-list := lifetime | lifetime ',' lifetime-list +arg-list := ident ':' type | ident ':' type ',' arg-list +bound-list := bound | bound '+' bound-list +bound := path | lifetime +``` + +The type of a closure mapping an input of type `A` to an output of type `B` is +`|A| -> B`. A closure with no arguments and no return value has type `||`. +Similarly, a procedure mapping `A` to `B` is `proc(A) -> B` and a no-argument +and no-return value procedure has type `proc()`. + +An example of creating and calling a closure: + +```rust +let captured_var = 10i; + +let closure_no_args = || println!("captured_var={}", captured_var); + +let closure_args = |arg: int| -> int { + println!("captured_var={}, arg={}", captured_var, arg); + arg // Note lack of semicolon after 'arg' +}; + +fn call_closure(c1: ||, c2: |int| -> int) { + c1(); + c2(2); +} + +call_closure(closure_no_args, closure_args); + +``` + +Unlike closures, procedures may only be invoked once, but own their +environment, and are allowed to move out of their environment. Procedures are +allocated on the heap (unlike closures). An example of creating and calling a +procedure: + +```rust +let string = "Hello".to_string(); + +// Creates a new procedure, passing it to the `spawn` function. +spawn(proc() { + println!("{} world!", string); +}); + +// the variable `string` has been moved into the previous procedure, so it is +// no longer usable. + + +// Create and invoke a procedure. Note that the procedure is *moved* when +// invoked, so it cannot be invoked again. 
+let f = proc(n: int) { n + 22 }; +println!("answer: {}", f(20)); + +``` + +### Object types + +Every trait item (see [traits](#traits)) defines a type with the same name as +the trait. This type is called the _object type_ of the trait. Object types +permit "late binding" of methods, dispatched using _virtual method tables_ +("vtables"). Whereas most calls to trait methods are "early bound" (statically +resolved) to specific implementations at compile time, a call to a method on an +object type is only resolved to a vtable entry at compile time. The actual +implementation for each vtable entry can vary on an object-by-object basis. + +Given a pointer-typed expression `E` of type `&T` or `Box<T>`, where `T` +implements trait `R`, casting `E` to the corresponding pointer type `&R` or +`Box<R>` results in a value of the _object type_ `R`. This result is +represented as a pair of pointers: the vtable pointer for the `T` +implementation of `R`, and the pointer value of `E`. + +An example of an object type: + +``` +trait Printable { + fn stringify(&self) -> String; +} + +impl Printable for int { + fn stringify(&self) -> String { self.to_string() } +} + +fn print(a: Box<Printable>) { + println!("{}", a.stringify()); +} + +fn main() { + print(box 10i as Box<Printable>); +} +``` + +In this example, the trait `Printable` occurs as an object type in both the +type signature of `print`, and the cast expression in `main`. + +### Type parameters + +Within the body of an item that has type parameter declarations, the names of +its type parameters are types: + +```ignore +fn map<A: Clone, B: Clone>(f: |A| -> B, xs: &[A]) -> Vec<B> { + if xs.len() == 0 { + return vec![]; + } + let first: B = f(xs[0].clone()); + let mut rest: Vec<B> = map(f, xs.slice(1, xs.len())); + rest.insert(0, first); + return rest; +} +``` + +Here, `first` has type `B`, referring to `map`'s `B` type parameter; and `rest` +has type `Vec<B>`, a vector type with element type `B`. 
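For readers on a current toolchain, the `map` function above can be approximated in present-day syntax. This is a hedged translation rather than the document's own code: the closure type `|A| -> B` is replaced by a third type parameter `F` bounded by `Fn(A) -> B` (an assumption of this sketch), and the slicing method call is replaced by the range expression `&xs[1..]`.

```rust
/// Applies `f` to every element of `xs`, recursing on the tail as in the
/// original example. `F` is a hypothetical extra type parameter standing in
/// for the old closure type.
fn map<A: Clone, B, F: Fn(A) -> B>(f: &F, xs: &[A]) -> Vec<B> {
    if xs.is_empty() {
        return vec![];
    }
    // Inside the body, `A` and `B` are ordinary type names.
    let first: B = f(xs[0].clone());
    let mut rest: Vec<B> = map(f, &xs[1..]);
    rest.insert(0, first);
    rest
}

fn main() {
    let doubled = map(&|x: i32| x * 2, &[1, 2, 3]);
    println!("{:?}", doubled);
}
```

As in the original, `first` has type `B` and `rest` has type `Vec<B>`; only the way the function argument is typed has changed.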
+ +### Self types + +The special type `self` has a meaning within methods inside an impl item. It +refers to the type of the implicit `self` argument. For example, in: + +``` +trait Printable { + fn make_string(&self) -> String; +} + +impl Printable for String { + fn make_string(&self) -> String { + (*self).clone() + } +} +``` + +`self` refers to the value of type `String` that is the receiver for a call to +the method `make_string`. + +## Type kinds + +Types in Rust are categorized into kinds, based on various properties of the +components of the type. The kinds are: + +* `Send` + : Types of this kind can be safely sent between tasks. + This kind includes scalars, boxes, procs, and + structural types containing only other owned types. + All `Send` types are `'static`. +* `Copy` + : Types of this kind consist of "Plain Old Data" + which can be copied by simply moving bits. + All values of this kind can be implicitly copied. + This kind includes scalars and immutable references, + as well as structural types containing other `Copy` types. +* `'static` + : Types of this kind do not contain any references (except for + references with the `'static` lifetime, which are allowed). + This can be a useful guarantee for code + that breaks borrowing assumptions + using [`unsafe` operations](#unsafe-functions). +* `Drop` + : This is not strictly a kind, + but its presence interacts with kinds: + the `Drop` trait provides a single method `drop` + that takes no parameters, + and is run when values of the type are dropped. + Such a method is called a "destructor", + and destructors are always executed in "top-down" order: + a value is completely destroyed + before any of the values it owns run their destructors. + Only `Send` types can implement `Drop`. + +* _Default_ + : Types with destructors, closure environments, + and various other _non-first-class_ types, + are not copyable at all. 
+ Such types can usually only be accessed through pointers, + or in some cases, moved between mutable locations. + +Kinds can be supplied as _bounds_ on type parameters, like traits, in which +case the parameter is constrained to types satisfying that kind. + +By default, type parameters do not carry any assumed kind-bounds at all. When +instantiating a type parameter, the kind bounds on the parameter are checked to +be the same or narrower than the kind of the type that it is instantiated with. + +Sending operations are not part of the Rust language, but are implemented in +the library. Generic functions that send values bound the kind of these values +to sendable. + +# Memory and concurrency models + +Rust has a memory model centered around concurrently-executing _tasks_. Thus +its memory model and its concurrency model are best discussed simultaneously, +as parts of each only make sense when considered from the perspective of the +other. + +When reading about the memory model, keep in mind that it is partitioned in +order to support tasks; and when reading about tasks, keep in mind that their +isolation and communication mechanisms are only possible due to the ownership +and lifetime semantics of the memory model. + +## Memory model + +A Rust program's memory consists of a static set of *items*, a set of +[tasks](#tasks) each with its own *stack*, and a *heap*. Immutable portions of +the heap may be shared between tasks, mutable portions may not. + +Allocations in the stack consist of *slots*, and allocations in the heap +consist of *boxes*. + +### Memory allocation and lifetime + +The _items_ of a program are those functions, modules and types that have their +value calculated at compile-time and stored uniquely in the memory image of the +rust process. Items are neither dynamically allocated nor freed. + +A task's _stack_ consists of activation frames automatically allocated on entry +to each function as the task executes. 
+ A stack allocation is reclaimed when +control leaves the frame containing it. + +The _heap_ is a general term that describes boxes. The lifetime of an +allocation in the heap depends on the lifetime of the box values pointing to +it. Since box values may themselves be passed in and out of frames, or stored +in the heap, heap allocations may outlive the frame they are allocated within. + +### Memory ownership + +A task owns all memory it can *safely* reach through local variables, as well +as boxes and references. + +When a task sends a value that has the `Send` trait to another task, it loses +ownership of the value sent and can no longer refer to it. This is statically +guaranteed by the combined use of "move semantics", and the compiler-checked +_meaning_ of the `Send` trait: it is only instantiated for (transitively) +sendable kinds of data constructor and pointers, never including references. + +When a stack frame is exited, its local allocations are all released, and its +references to boxes are dropped. + +When a task finishes, its stack is necessarily empty and it therefore has no +references to any boxes; the remainder of its heap is immediately freed. + +### Memory slots + +A task's stack contains slots. + +A _slot_ is a component of a stack frame, either a function parameter, a +[temporary](#lvalues,-rvalues-and-temporaries), or a local variable. + +A _local variable_ (or *stack-local* allocation) holds a value directly, +allocated within the stack's memory. The value is a part of the stack frame. + +Local variables are immutable unless declared otherwise, as in `let mut x = ...`. + +Function parameters are immutable unless declared with `mut`. The `mut` keyword +applies only to the following parameter (so `|mut x, y|` and `fn f(mut x: +Box<int>, y: Box<int>)` declare one mutable variable `x` and one immutable +variable `y`).
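The per-parameter scope of `mut` can be sketched as follows. This is only an illustration written in present-day Rust syntax, which is an assumption relative to this patch: `int` becomes `i32` and `box 10` becomes `Box::new(10)`. The function name `bump` is hypothetical and does not appear in the grammar document.

```rust
// Illustrative sketch only: `bump` is a made-up name, and the syntax is
// modern Rust rather than the 2014 dialect this patch documents.
fn bump(mut x: Box<i32>, y: Box<i32>) -> i32 {
    // `x` was declared with `mut`, so its slot is mutable and the boxed
    // value can be updated in place.
    *x += *y;
    // `y` was not declared with `mut`; writing `*y += 1;` here would be
    // rejected by the compiler because the slot is immutable.
    *x
}

fn main() {
    let total = bump(Box::new(10), Box::new(5));
    println!("total = {}", total);
}
```

Only the slot for `x` is mutable; the `mut` keyword does not carry over to `y`, which matches the rule stated above.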
+ +Methods that take either `self` or `Box<Self>` can optionally place them in a +mutable slot by prefixing them with `mut` (similar to regular arguments): + +``` +trait Changer { + fn change(mut self) -> Self; + fn modify(mut self: Box<Self>) -> Box<Self>; +} +``` + +Local variables are not initialized when allocated; the entire frame's worth of +local variables is allocated at once, on frame-entry, in an uninitialized +state. Subsequent statements within a function may or may not initialize the +local variables. Local variables can be used only after they have been +initialized; this is enforced by the compiler. + +### Boxes + +A _box_ is a reference to a heap allocation holding another value, which is +constructed by the prefix operator `box`. When the standard library is in use, +the type of a box is `std::owned::Box<T>`. + +An example of a box type and value: + +``` +let x: Box<int> = box 10; +``` + +Box values exist in 1:1 correspondence with their heap allocation; copying a +box value makes a shallow copy of the pointer. Rust will consider a shallow +copy of a box to move ownership of the value. After a value has been moved, +the source location cannot be used unless it is reinitialized. + +``` +let x: Box<int> = box 10; +let y = x; +// attempting to use `x` will result in an error here +``` + +## Tasks + +An executing Rust program consists of a tree of tasks. A Rust _task_ consists +of an entry function, a stack, a set of outgoing communication channels and +incoming communication ports, and ownership of some portion of the heap of a +single operating-system process. + +### Communication between tasks + +Rust tasks are isolated and generally unable to interfere with one another's +memory directly, except through [`unsafe` code](#unsafe-functions). All +contact between tasks is mediated by safe forms of ownership transfer, and data +races on memory are prohibited by the type system.
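A minimal sketch of ownership transfer between two concurrent units of execution follows. It assumes present-day Rust, where the "tasks" described here correspond roughly to `std::thread` threads and channels live in `std::sync::mpsc`; neither name comes from this patch. The helper `send_and_double` is hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper, for illustration only: moves a boxed value to a
// worker thread over a channel and returns what the worker computed.
fn send_and_double(value: Box<i32>) -> i32 {
    let (tx, rx) = mpsc::channel::<Box<i32>>();

    // The receiving end is moved into the worker, which then owns it.
    let worker = thread::spawn(move || {
        let received = rx.recv().expect("sender dropped before sending");
        *received * 2
    });

    // Sending moves `value` into the channel; the sender gives up ownership.
    tx.send(value).expect("receiver hung up");
    // Using `value` after this point would be a compile-time error.

    worker.join().expect("worker panicked")
}

fn main() {
    println!("{}", send_and_double(Box::new(21)));
}
```

Because the box is moved rather than shared, the sender cannot touch the allocation after the send, which is the property the paragraph above relies on to rule out data races.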
+ +When you wish to send data between tasks, the values are restricted to the +[`Send` type-kind](#type-kinds). Restricting communication interfaces to this +kind ensures that no references move between tasks. Thus access to an entire +data structure can be mediated through its owning "root" value; no further +locking or copying is required to avoid data races within the substructure of +such a value. + +### Task lifecycle + +The _lifecycle_ of a task consists of a finite set of states and events that +cause transitions between the states. The lifecycle states of a task are: + +* running +* blocked +* panicked +* dead + +A task begins its lifecycle — once it has been spawned — in the +*running* state. In this state it executes the statements of its entry +function, and any functions called by the entry function. + +A task may transition from the *running* state to the *blocked* state any time +it makes a blocking communication call. When the call can be completed — +when a message arrives at a sender, or a buffer opens to receive a message +— then the blocked task will unblock and transition back to *running*. + +A task may transition to the *panicked* state at any time, due to being killed by +some external event or internally, from the evaluation of a `panic!()` macro. +Once *panicking*, a task unwinds its stack and transitions to the *dead* state. +Unwinding the stack of a task is done by the task itself, on its own control +stack. If a value with a destructor is freed during unwinding, the code for the +destructor is run, also on the task's control stack. Running the destructor +code causes a temporary transition to a *running* state, and allows the +destructor code to cause any subsequent state transitions. The original task +of unwinding and panicking thereby may suspend temporarily, and may involve +(recursive) unwinding of the stack of a failed destructor.
+ Nonetheless, the +outermost unwinding activity will continue until the stack is unwound and the +task transitions to the *dead* state. There is no way to "recover" from task +panics. Once a task has temporarily suspended its unwinding in the *panicking* +state, a panic occurring from within this destructor results in a *hard* panic. +A hard panic currently results in the process aborting. + +A task in the *dead* state cannot transition to other states; it exists only to +have its termination status inspected by other tasks, and/or to await +reclamation when the last reference to it drops. + +# Runtime services, linkage and debugging + +The Rust _runtime_ is a relatively compact collection of Rust code that +provides fundamental services and datatypes to all Rust tasks at run-time. It +is smaller and simpler than many modern language runtimes. It is tightly +integrated into the language's execution model of memory, tasks, communication +and logging. + +### Memory allocation + +The runtime memory-management system is based on a _service-provider +interface_, through which the runtime requests blocks of memory from its +environment and releases them back to its environment when they are no longer +needed. The default implementation of the service-provider interface consists +of the C runtime functions `malloc` and `free`. + +The runtime memory-management system, in turn, supplies Rust tasks with +facilities for allocating and releasing stacks, as well as allocating and freeing +heap data. + +### Built-in types + +The runtime provides C and Rust code to assist with various built-in types, +such as arrays, strings, and the low-level communication system (ports, +channels, tasks). + +Support for other built-in types such as simple types, tuples and enums is +open-coded by the Rust compiler. + +### Task scheduling and communication + +The runtime provides code to manage inter-task communication.
+ This includes +the system of task-lifecycle state transitions depending on the contents of +queues, as well as code to copy values between queues and their recipients and +to serialize values for transmission over operating-system inter-process +communication facilities. + +### Linkage + +The Rust compiler supports various methods to link crates together both +statically and dynamically. This section will explore the various methods to +link Rust crates together, and more information about native libraries can be +found in the [ffi guide][ffi]. + +In one session of compilation, the compiler can generate multiple artifacts +through the usage of either command line flags or the `crate_type` attribute. +If one or more command line flags are specified, all `crate_type` attributes will +be ignored in favor of only building the artifacts specified by command line. + +* `--crate-type=bin`, `#[crate_type = "bin"]` - A runnable executable will be + produced. This requires that there is a `main` function in the crate which + will be run when the program begins executing. This will link in all Rust and + native dependencies, producing a distributable binary. + +* `--crate-type=lib`, `#[crate_type = "lib"]` - A Rust library will be produced. + This is an ambiguous concept as to what exactly is produced because a library + can manifest itself in several forms. The purpose of this generic `lib` option + is to generate the "compiler recommended" style of library. The output library + will always be usable by `rustc`, but the actual type of library may change from + time to time. The remaining output types are all different flavors of + libraries, and the `lib` type can be seen as an alias for one of them (but the + actual one is compiler-defined). + +* `--crate-type=dylib`, `#[crate_type = "dylib"]` - A dynamic Rust library will + be produced. This is different from the `lib` output type in that this forces + dynamic library generation.
+ The resulting dynamic library can be used as a + dependency for other libraries and/or executables. This output type will + create `*.so` files on Linux, `*.dylib` files on OS X, and `*.dll` files on + Windows. + +* `--crate-type=staticlib`, `#[crate_type = "staticlib"]` - A static system + library will be produced. This is different from other library outputs in that + the Rust compiler will never attempt to link to `staticlib` outputs. The + purpose of this output type is to create a static library containing all of + the local crate's code along with all upstream dependencies. The static + library is actually a `*.a` archive on Linux and OS X and a `*.lib` file on + Windows. This format is recommended for use in situations such as linking + Rust code into an existing non-Rust application because it will not have + dynamic dependencies on other Rust code. + +* `--crate-type=rlib`, `#[crate_type = "rlib"]` - A "Rust library" file will be + produced. This is used as an intermediate artifact and can be thought of as a + "static Rust library". These `rlib` files, unlike `staticlib` files, are + interpreted by the Rust compiler in future linkage. This essentially means + that `rustc` will look for metadata in `rlib` files like it looks for metadata + in dynamic libraries. This form of output is used to produce statically linked + executables as well as `staticlib` outputs. + +Note that these outputs are stackable in the sense that if multiple are +specified, then the compiler will produce each form of output at once without +having to recompile. However, this only applies for outputs specified by the +same method. If only `crate_type` attributes are specified, then they will all +be built, but if one or more `--crate-type` command line flags are specified, +then only those outputs will be built. + +With all these different kinds of outputs, if crate A depends on crate B, then +the compiler could find B in various different forms throughout the system.
+ The +only forms looked for by the compiler, however, are the `rlib` format and the +dynamic library format. With these two options for a dependent library, the +compiler must at some point make a choice between these two formats. With this +in mind, the compiler follows these rules when determining what format of +dependencies will be used: + +1. If a static library is being produced, all upstream dependencies are + required to be available in the `rlib` format. This requirement stems from the + fact that a dynamic library cannot be converted into a static format. + + Note that it is impossible to link in native dynamic dependencies to a static + library, and in this case warnings will be printed about all unlinked native + dynamic dependencies. + +2. If an `rlib` file is being produced, then there are no restrictions on what + format the upstream dependencies are available in. It is simply required that + all upstream dependencies be available for reading metadata from. + + The reason for this is that `rlib` files do not contain any of their upstream + dependencies. It wouldn't be very efficient for all `rlib` files to contain a + copy of `libstd.rlib`! + +3. If an executable is being produced and the `-C prefer-dynamic` flag is not + specified, then the compiler first attempts to find dependencies in the `rlib` + format. If some dependencies are not available in the `rlib` format, then + dynamic linking is attempted (see below). + +4. If a dynamic library or an executable that is being dynamically linked is + being produced, then the compiler will attempt to reconcile the available + dependencies in either the rlib or dylib format to create a final product. + + A major goal of the compiler is to ensure that a library never appears more + than once in any artifact. For example, if dynamic libraries B and C were + each statically linked to library A, then a crate could not link to B and C + together because there would be two copies of A.
The compiler allows mixing + the rlib and dylib formats, but this restriction must be satisfied. + + The compiler currently implements no method of hinting what format a library + should be linked with. When dynamically linking, the compiler will attempt to + maximize dynamic dependencies while still allowing some dependencies to be + linked in via an rlib. + + For most situations, having all libraries available as a dylib is recommended + if dynamically linking. For other situations, the compiler will emit a + warning if it is unable to determine which formats to link each library with. + +In general, `--crate-type=bin` or `--crate-type=lib` should be sufficient for +all compilation needs, and the other options are just available if more +fine-grained control is desired over the output format of a Rust crate. + +# Appendix: Rationales and design tradeoffs + +*TODO*. + +# Appendix: Influences and further references + +## Influences + +> The essential problem that must be solved in making a fault-tolerant +> software system is therefore that of fault-isolation. Different programmers +> will write different modules, some modules will be correct, others will have +> errors. We do not want the errors in one module to adversely affect the +> behaviour of a module which does not have any errors. +> +> — Joe Armstrong + +> In our approach, all data is private to some process, and processes can +> only communicate through communications channels. *Security*, as used +> in this paper, is the property which guarantees that processes in a system +> cannot affect each other except by explicit communication. +> +> When security is absent, nothing which can be proven about a single module +> in isolation can be guaranteed to hold when that module is embedded in a +> system [...] +> +> — Robert Strom and Shaula Yemini + +> Concurrent and applicative programming complement each other. 
+ The +> ability to send messages on channels provides I/O without side effects, +> while the avoidance of shared data helps keep concurrent processes from +> colliding. +> +> — Rob Pike + +Rust is not a particularly original language. It may, however, appear unusual by +contemporary standards, as its design elements are drawn from a number of +"historical" languages that have, with a few exceptions, fallen out of favour. +Five prominent lineages contribute the most, though their influences have come +and gone during the course of Rust's development: + +* The NIL (1981) and Hermes (1990) family. These languages were developed by + Robert Strom, Shaula Yemini, David Bacon and others in their group at IBM + Watson Research Center (Yorktown Heights, NY, USA). + +* The Erlang (1987) language, developed by Joe Armstrong, Robert Virding, Claes + Wikström, Mike Williams and others in their group at the Ericsson Computer + Science Laboratory (Älvsjö, Stockholm, Sweden). + +* The Sather (1990) language, developed by Stephen Omohundro, Chu-Cheow Lim, + Heinz Schmidt and others in their group at The International Computer + Science Institute of the University of California, Berkeley (Berkeley, CA, + USA). + +* The Newsqueak (1988), Alef (1995), and Limbo (1996) family. These + languages were developed by Rob Pike, Phil Winterbottom, Sean Dorward and + others in their group at Bell Labs Computing Sciences Research Center + (Murray Hill, NJ, USA). + +* The Napier (1985) and Napier88 (1988) family. These languages were + developed by Malcolm Atkinson, Ron Morrison and others in their group at + the University of St. Andrews (St. Andrews, Fife, UK). + +Additional specific influences can be seen from the following languages: + +* The structural algebraic types and compilation manager of SML. +* The attribute and assembly systems of C#. +* The references and deterministic destructor system of C++. +* The memory region systems of the ML Kit and Cyclone.
+* The typeclass system of Haskell. +* The lexical identifier rule of Python. +* The block syntax of Ruby. + +[ffi]: guide-ffi.html +[plugin]: guide-plugin.html From ab24ffe21a742287ee12d5992d4c90e83abb374d Mon Sep 17 00:00:00 2001 From: Ignacio Corderi Date: Wed, 26 Nov 2014 16:52:36 -0800 Subject: [PATCH 2/2] Copied all the grammar productions from reference.md to grammar.md --- src/doc/grammar.md | 773 ++------------------------------------------- 1 file changed, 18 insertions(+), 755 deletions(-) diff --git a/src/doc/grammar.md b/src/doc/grammar.md index 6b700210201d4..c2cbb3ae3fb2f 100644 --- a/src/doc/grammar.md +++ b/src/doc/grammar.md @@ -683,254 +683,53 @@ return_expr : "return" expr ? ; # Type system -## Types - -Every slot, item and value in a Rust program has a type. The _type_ of a -*value* defines the interpretation of the memory holding it. +**FIXME:** is this entire chapter relevant here? Or should it all have been covered by some production already? -Built-in types and type-constructors are tightly integrated into the language, -in nontrivial ways that are not possible to emulate in user-defined types. -User-defined types have limited capabilities. +## Types ### Primitive types -The primitive types are the following: - -* The "unit" type `()`, having the single "unit" value `()` (occasionally called - "nil"). [^unittype] -* The boolean type `bool` with values `true` and `false`. -* The machine types. -* The machine-dependent integer and floating-point types. - -[^unittype]: The "unit" value `()` is *not* a sentinel "null pointer" value for - reference slots; the "unit" type is the implicit return type from functions - otherwise lacking a return type, and can be used in other contexts (such as - message-sending or type-parametric code) as a zero-size type.] +**FIXME:** grammar? 
#### Machine types -The machine types are the following: - -* The unsigned word types `u8`, `u16`, `u32` and `u64`, with values drawn from - the integer intervals [0, 2^8 - 1], [0, 2^16 - 1], [0, 2^32 - 1] and - [0, 2^64 - 1] respectively. - -* The signed two's complement word types `i8`, `i16`, `i32` and `i64`, with - values drawn from the integer intervals [-(2^(7)), 2^7 - 1], - [-(2^(15)), 2^15 - 1], [-(2^(31)), 2^31 - 1], [-(2^(63)), 2^63 - 1] - respectively. - -* The IEEE 754-2008 `binary32` and `binary64` floating-point types: `f32` and - `f64`, respectively. +**FIXME:** grammar? #### Machine-dependent integer types -The `uint` type is an unsigned integer type with the same number of bits as the -platform's pointer type. It can represent every memory address in the process. - -The `int` type is a signed integer type with the same number of bits as the -platform's pointer type. The theoretical upper bound on object and array size -is the maximum `int` value. This ensures that `int` can be used to calculate -differences between pointers into an object or array and can address every byte -within an object along with one byte past the end. +**FIXME:** grammar? ### Textual types -The types `char` and `str` hold textual data. - -A value of type `char` is a [Unicode scalar value]( -http://www.unicode.org/glossary/#unicode_scalar_value) (ie. a code point that -is not a surrogate), represented as a 32-bit unsigned word in the 0x0000 to -0xD7FF or 0xE000 to 0x10FFFF range. A `[char]` array is effectively an UCS-4 / -UTF-32 string. - -A value of type `str` is a Unicode string, represented as an array of 8-bit -unsigned bytes holding a sequence of UTF-8 codepoints. Since `str` is of -unknown size, it is not a _first class_ type, but can only be instantiated -through a pointer type, such as `&str` or `String`. +**FIXME:** grammar? ### Tuple types -A tuple *type* is a heterogeneous product of other types, called the *elements* -of the tuple. 
It has no nominal name and is instead structurally typed. - -Tuple types and values are denoted by listing the types or values of their -elements, respectively, in a parenthesized, comma-separated list. - -Because tuple elements don't have a name, they can only be accessed by -pattern-matching. - -The members of a tuple are laid out in memory contiguously, in order specified -by the tuple type. - -An example of a tuple type and its use: - -``` -type Pair<'a> = (int, &'a str); -let p: Pair<'static> = (10, "hello"); -let (a, b) = p; -assert!(b != "world"); -``` +**FIXME:** grammar? ### Array, and Slice types -Rust has two different types for a list of items: - -* `[T ..N]`, an 'array' -* `&[T]`, a 'slice'. - -An array has a fixed size, and can be allocated on either the stack or the -heap. - -A slice is a 'view' into an array. It doesn't own the data it points -to, it borrows it. - -An example of each kind: - -```{rust} -let vec: Vec = vec![1, 2, 3]; -let arr: [int, ..3] = [1, 2, 3]; -let s: &[int] = vec.as_slice(); -``` - -As you can see, the `vec!` macro allows you to create a `Vec` easily. The -`vec!` macro is also part of the standard library, rather than the language. - -All in-bounds elements of arrays, and slices are always initialized, and access -to an array or slice is always bounds-checked. +**FIXME:** grammar? ### Structure types -A `struct` *type* is a heterogeneous product of other types, called the -*fields* of the type.[^structtype] - -[^structtype]: `struct` types are analogous `struct` types in C, - the *record* types of the ML family, - or the *structure* types of the Lisp family. - -New instances of a `struct` can be constructed with a [struct -expression](#structure-expressions). - -The memory layout of a `struct` is undefined by default to allow for compiler -optimizations like field reordering, but it can be fixed with the -`#[repr(...)]` attribute. 
In either case, fields may be given in any order in -a corresponding struct *expression*; the resulting `struct` value will always -have the same memory layout. - -The fields of a `struct` may be qualified by [visibility -modifiers](#re-exporting-and-visibility), to allow access to data in a -structure outside a module. - -A _tuple struct_ type is just like a structure type, except that the fields are -anonymous. - -A _unit-like struct_ type is like a structure type, except that it has no -fields. The one value constructed by the associated [structure -expression](#structure-expressions) is the only value that inhabits such a -type. +**FIXME:** grammar? ### Enumerated types -An *enumerated type* is a nominal, heterogeneous disjoint union type, denoted -by the name of an [`enum` item](#enumerations). [^enumtype] - -[^enumtype]: The `enum` type is analogous to a `data` constructor declaration in - ML, or a *pick ADT* in Limbo. - -An [`enum` item](#enumerations) declares both the type and a number of *variant -constructors*, each of which is independently named and takes an optional tuple -of arguments. - -New instances of an `enum` can be constructed by calling one of the variant -constructors, in a [call expression](#call-expressions). - -Any `enum` value consumes as much memory as the largest variant constructor for -its corresponding `enum` type. - -Enum types cannot be denoted *structurally* as types, but must be denoted by -named reference to an [`enum` item](#enumerations). - -### Recursive types - -Nominal types — [enumerations](#enumerated-types) and -[structures](#structure-types) — may be recursive. That is, each `enum` -constructor or `struct` field may refer, directly or indirectly, to the -enclosing `enum` or `struct` type itself. 
Such recursion has restrictions: - -* Recursive types must include a nominal type in the recursion - (not mere [type definitions](#type-definitions), - or other structural types such as [arrays](#array,-and-slice-types) or [tuples](#tuple-types)). -* A recursive `enum` item must have at least one non-recursive constructor - (in order to give the recursion a basis case). -* The size of a recursive type must be finite; - in other words the recursive fields of the type must be [pointer types](#pointer-types). -* Recursive type definitions can cross module boundaries, but not module *visibility* boundaries, - or crate boundaries (in order to simplify the module system and type checker). - -An example of a *recursive* type and its use: - -``` -enum List { - Nil, - Cons(T, Box>) -} - -let a: List = List::Cons(7, box List::Cons(13, box List::Nil)); -``` +**FIXME:** grammar? ### Pointer types -All pointers in Rust are explicit first-class values. They can be copied, -stored into data structures, and returned from functions. There are two -varieties of pointer in Rust: - -* References (`&`) - : These point to memory _owned by some other value_. - A reference type is written `&type` for some lifetime-variable `f`, - or just `&'a type` when you need an explicit lifetime. - Copying a reference is a "shallow" operation: - it involves only copying the pointer itself. - Releasing a reference typically has no effect on the value it points to, - with the exception of temporary values, which are released when the last - reference to them is released. - -* Raw pointers (`*`) - : Raw pointers are pointers without safety or liveness guarantees. - Raw pointers are written as `*const T` or `*mut T`, - for example `*const int` means a raw pointer to an integer. - Copying or dropping a raw pointer has no effect on the lifecycle of any - other value. Dereferencing a raw pointer or converting it to any other - pointer type is an [`unsafe` operation](#unsafe-functions). 
- Raw pointers are generally discouraged in Rust code; - they exist to support interoperability with foreign code, - and writing performance-critical or low-level functions. - -The standard library contains additional 'smart pointer' types beyond references -and raw pointers. +**FIXME:** grammar? ### Function types -The function type constructor `fn` forms new function types. A function type -consists of a possibly-empty set of function-type modifiers (such as `unsafe` -or `extern`), a sequence of input types and an output type. - -An example of a `fn` type: - -``` -fn add(x: int, y: int) -> int { - return x + y; -} - -let mut x = add(5,7); - -type Binop<'a> = |int,int|: 'a -> int; -let bo: Binop = add; -x = bo(5,7); -``` +**FIXME:** grammar? ### Closure types -```{.ebnf .notation} +```antlr closure_type := [ 'unsafe' ] [ '<' lifetime-list '>' ] '|' arg-list '|' [ ':' bound-list ] [ '->' type ] procedure_type := 'proc' [ '<' lifetime-list '>' ] '(' arg-list ')' @@ -941,574 +740,38 @@ bound-list := bound | bound '+' bound-list bound := path | lifetime ``` -The type of a closure mapping an input of type `A` to an output of type `B` is -`|A| -> B`. A closure with no arguments or return values has type `||`. -Similarly, a procedure mapping `A` to `B` is `proc(A) -> B` and a no-argument -and no-return value closure has type `proc()`. - -An example of creating and calling a closure: - -```rust -let captured_var = 10i; - -let closure_no_args = || println!("captured_var={}", captured_var); - -let closure_args = |arg: int| -> int { - println!("captured_var={}, arg={}", captured_var, arg); - arg // Note lack of semicolon after 'arg' -}; - -fn call_closure(c1: ||, c2: |int| -> int) { - c1(); - c2(2); -} - -call_closure(closure_no_args, closure_args); - -``` - -Unlike closures, procedures may only be invoked once, but own their -environment, and are allowed to move out of their environment. Procedures are -allocated on the heap (unlike closures). 
An example of creating and calling a -procedure: - -```rust -let string = "Hello".to_string(); - -// Creates a new procedure, passing it to the `spawn` function. -spawn(proc() { - println!("{} world!", string); -}); - -// the variable `string` has been moved into the previous procedure, so it is -// no longer usable. - - -// Create an invoke a procedure. Note that the procedure is *moved* when -// invoked, so it cannot be invoked again. -let f = proc(n: int) { n + 22 }; -println!("answer: {}", f(20)); - -``` - ### Object types -Every trait item (see [traits](#traits)) defines a type with the same name as -the trait. This type is called the _object type_ of the trait. Object types -permit "late binding" of methods, dispatched using _virtual method tables_ -("vtables"). Whereas most calls to trait methods are "early bound" (statically -resolved) to specific implementations at compile time, a call to a method on an -object type is only resolved to a vtable entry at compile time. The actual -implementation for each vtable entry can vary on an object-by-object basis. - -Given a pointer-typed expression `E` of type `&T` or `Box`, where `T` -implements trait `R`, casting `E` to the corresponding pointer type `&R` or -`Box` results in a value of the _object type_ `R`. This result is -represented as a pair of pointers: the vtable pointer for the `T` -implementation of `R`, and the pointer value of `E`. - -An example of an object type: - -``` -trait Printable { - fn stringify(&self) -> String; -} - -impl Printable for int { - fn stringify(&self) -> String { self.to_string() } -} - -fn print(a: Box) { - println!("{}", a.stringify()); -} - -fn main() { - print(box 10i as Box); -} -``` - -In this example, the trait `Printable` occurs as an object type in both the -type signature of `print`, and the cast expression in `main`. +**FIXME:** grammar? 
### Type parameters -Within the body of an item that has type parameter declarations, the names of -its type parameters are types: - -```ignore -fn map(f: |A| -> B, xs: &[A]) -> Vec { - if xs.len() == 0 { - return vec![]; - } - let first: B = f(xs[0].clone()); - let mut rest: Vec = map(f, xs.slice(1, xs.len())); - rest.insert(0, first); - return rest; -} -``` - -Here, `first` has type `B`, referring to `map`'s `B` type parameter; and `rest` -has type `Vec`, a vector type with element type `B`. +**FIXME:** grammar? ### Self types -The special type `self` has a meaning within methods inside an impl item. It -refers to the type of the implicit `self` argument. For example, in: - -``` -trait Printable { - fn make_string(&self) -> String; -} - -impl Printable for String { - fn make_string(&self) -> String { - (*self).clone() - } -} -``` - -`self` refers to the value of type `String` that is the receiver for a call to -the method `make_string`. +**FIXME:** grammar? ## Type kinds -Types in Rust are categorized into kinds, based on various properties of the -components of the type. The kinds are: - -* `Send` - : Types of this kind can be safely sent between tasks. - This kind includes scalars, boxes, procs, and - structural types containing only other owned types. - All `Send` types are `'static`. -* `Copy` - : Types of this kind consist of "Plain Old Data" - which can be copied by simply moving bits. - All values of this kind can be implicitly copied. - This kind includes scalars and immutable references, - as well as structural types containing other `Copy` types. -* `'static` - : Types of this kind do not contain any references (except for - references with the `static` lifetime, which are allowed). - This can be a useful guarantee for code - that breaks borrowing assumptions - using [`unsafe` operations](#unsafe-functions). 
-* `Drop` - : This is not strictly a kind, - but its presence interacts with kinds: - the `Drop` trait provides a single method `drop` - that takes no parameters, - and is run when values of the type are dropped. - Such a method is called a "destructor", - and are always executed in "top-down" order: - a value is completely destroyed - before any of the values it owns run their destructors. - Only `Send` types can implement `Drop`. - -* _Default_ - : Types with destructors, closure environments, - and various other _non-first-class_ types, - are not copyable at all. - Such types can usually only be accessed through pointers, - or in some cases, moved between mutable locations. - -Kinds can be supplied as _bounds_ on type parameters, like traits, in which -case the parameter is constrained to types satisfying that kind. - -By default, type parameters do not carry any assumed kind-bounds at all. When -instantiating a type parameter, the kind bounds on the parameter are checked to -be the same or narrower than the kind of the type that it is instantiated with. - -Sending operations are not part of the Rust language, but are implemented in -the library. Generic functions that send values bound the kind of these values -to sendable. +**FIXME:** this is probably not relevant to the grammar... # Memory and concurrency models -Rust has a memory model centered around concurrently-executing _tasks_. Thus -its memory model and its concurrency model are best discussed simultaneously, -as parts of each only make sense when considered from the perspective of the -other. - -When reading about the memory model, keep in mind that it is partitioned in -order to support tasks; and when reading about tasks, keep in mind that their -isolation and communication mechanisms are only possible due to the ownership -and lifetime semantics of the memory model. +**FIXME:** is this entire chapter relevant here? Or should it all have been covered by some production already? 
## Memory model -A Rust program's memory consists of a static set of *items*, a set of -[tasks](#tasks) each with its own *stack*, and a *heap*. Immutable portions of -the heap may be shared between tasks, mutable portions may not. - -Allocations in the stack consist of *slots*, and allocations in the heap -consist of *boxes*. - ### Memory allocation and lifetime -The _items_ of a program are those functions, modules and types that have their -value calculated at compile-time and stored uniquely in the memory image of the -rust process. Items are neither dynamically allocated nor freed. - -A task's _stack_ consists of activation frames automatically allocated on entry -to each function as the task executes. A stack allocation is reclaimed when -control leaves the frame containing it. - -The _heap_ is a general term that describes boxes. The lifetime of an -allocation in the heap depends on the lifetime of the box values pointing to -it. Since box values may themselves be passed in and out of frames, or stored -in the heap, heap allocations may outlive the frame they are allocated within. - ### Memory ownership -A task owns all memory it can *safely* reach through local variables, as well -as boxes and references. - -When a task sends a value that has the `Send` trait to another task, it loses -ownership of the value sent and can no longer refer to it. This is statically -guaranteed by the combined use of "move semantics", and the compiler-checked -_meaning_ of the `Send` trait: it is only instantiated for (transitively) -sendable kinds of data constructor and pointers, never including references. - -When a stack frame is exited, its local allocations are all released, and its -references to boxes are dropped. - -When a task finishes, its stack is necessarily empty and it therefore has no -references to any boxes; the remainder of its heap is immediately freed. - ### Memory slots -A task's stack contains slots. 
- -A _slot_ is a component of a stack frame, either a function parameter, a -[temporary](#lvalues,-rvalues-and-temporaries), or a local variable. - -A _local variable_ (or *stack-local* allocation) holds a value directly, -allocated within the stack's memory. The value is a part of the stack frame. - -Local variables are immutable unless declared otherwise like: `let mut x = ...`. - -Function parameters are immutable unless declared with `mut`. The `mut` keyword -applies only to the following parameter (so `|mut x, y|` and `fn f(mut x: -Box<int>, y: Box<int>)` declare one mutable variable `x` and one immutable -variable `y`). - -Methods that take either `self` or `Box<Self>` can optionally place them in a -mutable slot by prefixing them with `mut` (similar to regular arguments): - -``` -trait Changer { - fn change(mut self) -> Self; - fn modify(mut self: Box<Self>) -> Box<Self>; -} -``` - -Local variables are not initialized when allocated; the entire frame worth of -local variables are allocated at once, on frame-entry, in an uninitialized -state. Subsequent statements within a function may or may not initialize the -local variables. Local variables can be used only after they have been -initialized; this is enforced by the compiler. - ### Boxes -A _box_ is a reference to a heap allocation holding another value, which is -constructed by the prefix operator `box`. When the standard library is in use, -the type of a box is `std::owned::Box<T>`. - -An example of a box type and value: - -``` -let x: Box<int> = box 10; -``` - -Box values exist in 1:1 correspondence with their heap allocation, copying a -box value makes a shallow copy of the pointer. Rust will consider a shallow -copy of a box to move ownership of the value. After a value has been moved, -the source location cannot be used unless it is reinitialized. - -``` -let x: Box<int> = box 10; -let y = x; -// attempting to use `x` will result in an error here -``` - ## Tasks -An executing Rust program consists of a tree of tasks. 
A Rust _task_ consists -of an entry function, a stack, a set of outgoing communication channels and -incoming communication ports, and ownership of some portion of the heap of a -single operating-system process. - ### Communication between tasks -Rust tasks are isolated and generally unable to interfere with one another's -memory directly, except through [`unsafe` code](#unsafe-functions). All -contact between tasks is mediated by safe forms of ownership transfer, and data -races on memory are prohibited by the type system. - -When you wish to send data between tasks, the values are restricted to the -[`Send` type-kind](#type-kinds). Restricting communication interfaces to this -kind ensures that no references move between tasks. Thus access to an entire -data structure can be mediated through its owning "root" value; no further -locking or copying is required to avoid data races within the substructure of -such a value. - ### Task lifecycle - -The _lifecycle_ of a task consists of a finite set of states and events that -cause transitions between the states. The lifecycle states of a task are: - -* running -* blocked -* panicked -* dead - -A task begins its lifecycle — once it has been spawned — in the -*running* state. In this state it executes the statements of its entry -function, and any functions called by the entry function. - -A task may transition from the *running* state to the *blocked* state any time -it makes a blocking communication call. When the call can be completed — -when a message arrives at a sender, or a buffer opens to receive a message -— then the blocked task will unblock and transition back to *running*. - -A task may transition to the *panicked* state at any time, due being killed by -some external event or internally, from the evaluation of a `panic!()` macro. -Once *panicking*, a task unwinds its stack and transitions to the *dead* state. -Unwinding the stack of a task is done by the task itself, on its own control -stack. 
If a value with a destructor is freed during unwinding, the code for the -destructor is run, also on the task's control stack. Running the destructor -code causes a temporary transition to a *running* state, and allows the -destructor code to cause any subsequent state transitions. The original task -of unwinding and panicking thereby may suspend temporarily, and may involve -(recursive) unwinding of the stack of a failed destructor. Nonetheless, the -outermost unwinding activity will continue until the stack is unwound and the -task transitions to the *dead* state. There is no way to "recover" from task -panics. Once a task has temporarily suspended its unwinding in the *panicking* -state, a panic occurring from within this destructor results in *hard* panic. -A hard panic currently results in the process aborting. - -A task in the *dead* state cannot transition to other states; it exists only to -have its termination status inspected by other tasks, and/or to await -reclamation when the last reference to it drops. - -# Runtime services, linkage and debugging - -The Rust _runtime_ is a relatively compact collection of Rust code that -provides fundamental services and datatypes to all Rust tasks at run-time. It -is smaller and simpler than many modern language runtimes. It is tightly -integrated into the language's execution model of memory, tasks, communication -and logging. - -### Memory allocation - -The runtime memory-management system is based on a _service-provider -interface_, through which the runtime requests blocks of memory from its -environment and releases them back to its environment when they are no longer -needed. The default implementation of the service-provider interface consists -of the C runtime functions `malloc` and `free`. - -The runtime memory-management system, in turn, supplies Rust tasks with -facilities for allocating releasing stacks, as well as allocating and freeing -heap data. 
- -### Built in types - -The runtime provides C and Rust code to assist with various built-in types, -such as arrays, strings, and the low level communication system (ports, -channels, tasks). - -Support for other built-in types such as simple types, tuples and enums is -open-coded by the Rust compiler. - -### Task scheduling and communication - -The runtime provides code to manage inter-task communication. This includes -the system of task-lifecycle state transitions depending on the contents of -queues, as well as code to copy values between queues and their recipients and -to serialize values for transmission over operating-system inter-process -communication facilities. - -### Linkage - -The Rust compiler supports various methods to link crates together both -statically and dynamically. This section will explore the various methods to -link Rust crates together, and more information about native libraries can be -found in the [ffi guide][ffi]. - -In one session of compilation, the compiler can generate multiple artifacts -through the usage of either command line flags or the `crate_type` attribute. -If one or more command line flag is specified, all `crate_type` attributes will -be ignored in favor of only building the artifacts specified by command line. - -* `--crate-type=bin`, `#[crate_type = "bin"]` - A runnable executable will be - produced. This requires that there is a `main` function in the crate which - will be run when the program begins executing. This will link in all Rust and - native dependencies, producing a distributable binary. - -* `--crate-type=lib`, `#[crate_type = "lib"]` - A Rust library will be produced. - This is an ambiguous concept as to what exactly is produced because a library - can manifest itself in several forms. The purpose of this generic `lib` option - is to generate the "compiler recommended" style of library. The output library - will always be usable by rustc, but the actual type of library may change from - time-to-time. 
The remaining output types are all different flavors of - libraries, and the `lib` type can be seen as an alias for one of them (but the - actual one is compiler-defined). - -* `--crate-type=dylib`, `#[crate_type = "dylib"]` - A dynamic Rust library will - be produced. This is different from the `lib` output type in that this forces - dynamic library generation. The resulting dynamic library can be used as a - dependency for other libraries and/or executables. This output type will - create `*.so` files on linux, `*.dylib` files on osx, and `*.dll` files on - windows. - -* `--crate-type=staticlib`, `#[crate_type = "staticlib"]` - A static system - library will be produced. This is different from other library outputs in that - the Rust compiler will never attempt to link to `staticlib` outputs. The - purpose of this output type is to create a static library containing all of - the local crate's code along with all upstream dependencies. The static - library is actually a `*.a` archive on linux and osx and a `*.lib` file on - windows. This format is recommended for use in situations such as linking - Rust code into an existing non-Rust application because it will not have - dynamic dependencies on other Rust code. - -* `--crate-type=rlib`, `#[crate_type = "rlib"]` - A "Rust library" file will be - produced. This is used as an intermediate artifact and can be thought of as a - "static Rust library". These `rlib` files, unlike `staticlib` files, are - interpreted by the Rust compiler in future linkage. This essentially means - that `rustc` will look for metadata in `rlib` files like it looks for metadata - in dynamic libraries. This form of output is used to produce statically linked - executables as well as `staticlib` outputs. - -Note that these outputs are stackable in the sense that if multiple are -specified, then the compiler will produce each form of output at once without -having to recompile. 
However, this only applies for outputs specified by the -same method. If only `crate_type` attributes are specified, then they will all -be built, but if one or more `--crate-type` command line flag is specified, -then only those outputs will be built. - -With all these different kinds of outputs, if crate A depends on crate B, then -the compiler could find B in various different forms throughout the system. The -only forms looked for by the compiler, however, are the `rlib` format and the -dynamic library format. With these two options for a dependent library, the -compiler must at some point make a choice between these two formats. With this -in mind, the compiler follows these rules when determining what format of -dependencies will be used: - -1. If a static library is being produced, all upstream dependencies are - required to be available in `rlib` formats. This requirement stems from the - reason that a dynamic library cannot be converted into a static format. - - Note that it is impossible to link in native dynamic dependencies to a static - library, and in this case warnings will be printed about all unlinked native - dynamic dependencies. - -2. If an `rlib` file is being produced, then there are no restrictions on what - format the upstream dependencies are available in. It is simply required that - all upstream dependencies be available for reading metadata from. - - The reason for this is that `rlib` files do not contain any of their upstream - dependencies. It wouldn't be very efficient for all `rlib` files to contain a - copy of `libstd.rlib`! - -3. If an executable is being produced and the `-C prefer-dynamic` flag is not - specified, then dependencies are first attempted to be found in the `rlib` - format. If some dependencies are not available in an rlib format, then - dynamic linking is attempted (see below). - -4. 
If a dynamic library or an executable that is being dynamically linked is - being produced, then the compiler will attempt to reconcile the available - dependencies in either the rlib or dylib format to create a final product. - - A major goal of the compiler is to ensure that a library never appears more - than once in any artifact. For example, if dynamic libraries B and C were - each statically linked to library A, then a crate could not link to B and C - together because there would be two copies of A. The compiler allows mixing - the rlib and dylib formats, but this restriction must be satisfied. - - The compiler currently implements no method of hinting what format a library - should be linked with. When dynamically linking, the compiler will attempt to - maximize dynamic dependencies while still allowing some dependencies to be - linked in via an rlib. - - For most situations, having all libraries available as a dylib is recommended - if dynamically linking. For other situations, the compiler will emit a - warning if it is unable to determine which formats to link each library with. - -In general, `--crate-type=bin` or `--crate-type=lib` should be sufficient for -all compilation needs, and the other options are just available if more -fine-grained control is desired over the output format of a Rust crate. - -# Appendix: Rationales and design tradeoffs - -*TODO*. - -# Appendix: Influences and further references - -## Influences - -> The essential problem that must be solved in making a fault-tolerant -> software system is therefore that of fault-isolation. Different programmers -> will write different modules, some modules will be correct, others will have -> errors. We do not want the errors in one module to adversely affect the -> behaviour of a module which does not have any errors. -> -> — Joe Armstrong - -> In our approach, all data is private to some process, and processes can -> only communicate through communications channels. 
*Security*, as used -> in this paper, is the property which guarantees that processes in a system -> cannot affect each other except by explicit communication. -> -> When security is absent, nothing which can be proven about a single module -> in isolation can be guaranteed to hold when that module is embedded in a -> system [...] -> -> — Robert Strom and Shaula Yemini - -> Concurrent and applicative programming complement each other. The -> ability to send messages on channels provides I/O without side effects, -> while the avoidance of shared data helps keep concurrent processes from -> colliding. -> -> — Rob Pike - -Rust is not a particularly original language. It may however appear unusual by -contemporary standards, as its design elements are drawn from a number of -"historical" languages that have, with a few exceptions, fallen out of favour. -Five prominent lineages contribute the most, though their influences have come -and gone during the course of Rust's development: - -* The NIL (1981) and Hermes (1990) family. These languages were developed by - Robert Strom, Shaula Yemini, David Bacon and others in their group at IBM - Watson Research Center (Yorktown Heights, NY, USA). - -* The Erlang (1987) language, developed by Joe Armstrong, Robert Virding, Claes - Wikström, Mike Williams and others in their group at the Ericsson Computer - Science Laboratory (Älvsjö, Stockholm, Sweden) . - -* The Sather (1990) language, developed by Stephen Omohundro, Chu-Cheow Lim, - Heinz Schmidt and others in their group at The International Computer - Science Institute of the University of California, Berkeley (Berkeley, CA, - USA). - -* The Newsqueak (1988), Alef (1995), and Limbo (1996) family. These - languages were developed by Rob Pike, Phil Winterbottom, Sean Dorward and - others in their group at Bell Labs Computing Sciences Research Center - (Murray Hill, NJ, USA). - -* The Napier (1985) and Napier88 (1988) family. 
These languages were - developed by Malcolm Atkinson, Ron Morrison and others in their group at - the University of St. Andrews (St. Andrews, Fife, UK). - -Additional specific influences can be seen from the following languages: - -* The structural algebraic types and compilation manager of SML. -* The attribute and assembly systems of C#. -* The references and deterministic destructor system of C++. -* The memory region systems of the ML Kit and Cyclone. -* The typeclass system of Haskell. -* The lexical identifier rule of Python. -* The block syntax of Ruby. - -[ffi]: guide-ffi.html -[plugin]: guide-plugin.html