[Refactor] Add a lower-level IR for C++ code generation #1233

Merged
merged 36 commits into from
Dec 20, 2022
b2753b8
Add C++ IR
WardBrian Jul 29, 2022
6950b57
Re-implement codegen using IR
WardBrian Jul 29, 2022
cb7b9c3
Remove former codegen files
WardBrian Jul 29, 2022
d09ec12
Update test output
WardBrian Jul 29, 2022
0b0711a
Clean up
WardBrian Jul 29, 2022
eb52cb8
Fix whitespace sensitive js test
WardBrian Jul 29, 2022
00caf29
Dune promote whitespace changes
WardBrian Jul 29, 2022
d85e651
Fix newline issue in strings
WardBrian Jul 30, 2022
37875d9
Fix missing complex type and overloading templates
WardBrian Jul 30, 2022
922523d
Tweak output formatting, clean up name code
WardBrian Aug 1, 2022
47dc485
Doc comments
WardBrian Aug 2, 2022
83bd5ec
Merge branch 'master' into refactor/cpp-ir
WardBrian Sep 22, 2022
22038c9
Rename backend helper str_array
WardBrian Sep 29, 2022
0f88296
Merge branch 'master' into refactor/cpp-ir
WardBrian Sep 29, 2022
c40cf8e
Merge branch 'master' into refactor/cpp-ir
WardBrian Oct 12, 2022
eb1f2ce
Documentation
WardBrian Oct 13, 2022
1154271
Merge branch 'master' into refactor/cpp-ir
WardBrian Oct 14, 2022
298a65e
Merge branch 'master' into refactor/cpp-ir
WardBrian Oct 20, 2022
7ffde75
Changes per first round of review
WardBrian Oct 21, 2022
4c60589
Merge branch 'master' into refactor/cpp-ir
WardBrian Oct 24, 2022
2af9803
Merge branch 'master'
WardBrian Oct 24, 2022
8aa1188
Merge branch 'master' into refactor/cpp-ir
WardBrian Oct 24, 2022
667ec62
Dune promote
WardBrian Oct 24, 2022
0cff4a9
Fix typo
WardBrian Oct 24, 2022
45fdbb3
Minor renames
WardBrian Oct 25, 2022
e3e0772
Merge branch 'master' into refactor/cpp-ir
WardBrian Nov 2, 2022
9038698
Merge branch 'master' into refactor/cpp-ir
WardBrian Nov 4, 2022
e656722
Only rename in one place
WardBrian Nov 7, 2022
8aac0dd
Merge branch 'master' into refactor/cpp-ir
WardBrian Nov 7, 2022
59f0f12
Merge branch 'master' into refactor/cpp-ir
WardBrian Nov 8, 2022
9c21501
Merge branch 'master' into refactor/cpp-ir
WardBrian Dec 12, 2022
5d63be4
Changes per review
WardBrian Dec 15, 2022
48761f0
Move map_rect numbering to Locations
WardBrian Dec 15, 2022
fccc948
Move map_rect registration to Locations, rename Locations pass to Num…
WardBrian Dec 15, 2022
844439e
Changes per review
WardBrian Dec 16, 2022
55eb5dd
Changes per review
WardBrian Dec 16, 2022
10 changes: 8 additions & 2 deletions README.md
@@ -88,7 +88,7 @@ flowchart TB
1. Analyze & optimize (MIR -> MIR)
1. Code generation (MIR -> [C++](src/stan_math_backend/Stan_math_code_gen.ml)) (or other outputs, like [Tensorflow](https://github.com/stan-dev/stan2tfp/)).

### The two central data structures
### The central data structures

1. `src/frontend/Ast.ml` defines the AST. The AST is intended to have a direct 1-1 mapping with the syntax, so there are things like parentheses being kept around.
The pretty-printer in the frontend uses the AST and attempts to keep user syntax the same while just adjusting whitespace.
@@ -97,10 +97,16 @@ The pretty-printer in the frontend uses the AST and attempts to keep user syntax

The AST intends to keep very close to Stan-level semantics and syntax in every way.

2. `src/middle/Program.ml` contains the MIR (Middle Intermediate Language - we're saving room at the bottom for later). `src/frontend/Ast_to_Mir.ml` performs the lowering and attempts to strip out as much Stan-specific semantics and syntax as possible, though this is still something of a work-in-progress.
2. `src/middle/Program.ml` contains the MIR (Middle Intermediate Language). `src/frontend/Ast_to_Mir.ml` performs the lowering and attempts to strip out as much Stan-specific semantics and syntax as possible, though this is still something of a work-in-progress.

The MIR uses the same two-level types idea to add metadata, notably expression types and autodiff levels as well as locations on many things. The MIR is used as the output data type from the frontend and the input for dataflow analysis, optimization (which also outputs MIR), and code generation.


3. `src/stan_math_backend/Cpp.ml` defines a minimal representation of C++ used in code generation.

This is intentionally simpler than both of the structures above and than a true C++ AST;
it is tailored specifically to the C++ generated for our model class.

## Design goals
* **Multiple phases** - each with human-readable intermediate representations for easy debugging and optimization design.
* **Optimizing** - takes advantage of info known at the Stan language level. Minimize information we must teach users for them to write performant code.
43 changes: 43 additions & 0 deletions docs/cpp_ir.mld
@@ -0,0 +1,43 @@
{0 C++ Code Generation}

The main backend of the compiler is the "Stan Math" (C++) backend.
We represent C++ code with the data types and functions found in
{!module-Stan_math_backend.Cpp}.

We also define a sort of miniature embedded domain specific language (DSL)
for using these types. These helper functions and operators are all in sub-modules
of {!module-Stan_math_backend.Cpp}, for example {!module-Stan_math_backend.Cpp.Expression_syntax}.

These allow writing OCaml code which looks or feels more like the C++ it will generate. These
constructs should be used when they improve clarity, and avoided when they make the code harder to
read. When combined with good variable names, this can lead to code like
[ lp_accum__.@?(("add", [Var "lp__"])) ], which hopefully reads quite clearly as equivalent to
the C++ [lp_accum__.add(lp__)].


After a Stan program is lowered to this type, it can be printed to C++ using
{!module-Stan_math_backend.Cpp.Printing}. This module uses [Fmt], but keeps
the question of how C++ should be formatted separate from the question of what
the generated C++ {e is}.
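
The separation between "what the C++ is" and "how it is formatted" can be sketched with a toy example. The [expr] type and the [to_cpp] function below are hypothetical, heavily simplified stand-ins written for illustration only; the real definitions live in {!module-Stan_math_backend.Cpp} and its [Printing] sub-module, which uses [Fmt] rather than plain string concatenation.

```ocaml
(* Hypothetical, heavily simplified stand-in for the expression type.
   The real type in Stan_math_backend.Cpp has more constructors and
   different arities. *)
type expr =
  | Literal of string
  | Var of string
  | Parens of expr
  | MethodCall of expr * string * expr list

(* The printer only decides layout; it never changes which C++ is
   generated. This is an illustrative function, not the real printer. *)
let rec to_cpp = function
  | Literal s | Var s -> s
  | Parens e -> "(" ^ to_cpp e ^ ")"
  | MethodCall (obj, name, args) ->
      to_cpp obj ^ "." ^ name ^ "("
      ^ String.concat ", " (List.map to_cpp args)
      ^ ")"

(* The structured value is built first and printed afterwards. *)
let example = MethodCall (Var "lp_accum__", "add", [Var "lp__"])
let () = print_endline (to_cpp example)
```

Because the value [example] exists independently of [to_cpp], a different printer (for instance one that inserts line breaks) could be swapped in without touching the code that builds the expression.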


{1 DSL Example}
For example, let's say one wanted to generate the expression
{[
(Eigen::Matrix<double,1,-1>(3) << 1, a, 3).finished()
]}

This could be written down as the literal OCaml type it is:
{[
(MethodCall (Parens (StreamInsertion (Constructor (Matrix (Double, 1, -1), [Literal "3"]), [Literal "1"; Var "a"; Literal "3"])), "finished", [], []))
]}

Or, using the DSL constructs, the same expression could be written

{[
let open Expression_syntax in
let open Types in
let vector = Constructor (row_vector Double, [Literal "3"]) in
let values = [Literal "1"; Var "a"; Literal "3"] in
(vector << values).@!("finished")
]}
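
To see how such operators can be ordinary OCaml definitions, here is a minimal self-contained sketch. Everything in it is an assumption made for illustration: the types are a cut-down miniature of the IR (for example, [MethodCall] here omits template arguments), and the bodies of [row_vector], [( << )] and [( .@!() )] are guesses at plausible definitions, not the ones in [Expression_syntax] or [Types].

```ocaml
(* Hypothetical miniature of the IR. Constructor names follow the example
   above, but arities are simplified (no template arguments). *)
type type_ = Double | Matrix of type_ * int * int

type expr =
  | Literal of string
  | Var of string
  | Parens of expr
  | Constructor of type_ * expr list
  | StreamInsertion of expr * expr list
  | MethodCall of expr * string * expr list

(* Assumed helper: a row vector has one row and a dynamic column count. *)
let row_vector t = Matrix (t, 1, -1)

(* Assumed definition: e << values builds a parenthesized stream insertion. *)
let ( << ) e values = Parens (StreamInsertion (e, values))

(* Assumed definition: e.@!("name") builds a zero-argument method call,
   using OCaml's user-defined indexing operator syntax. *)
let ( .@!() ) e name = MethodCall (e, name, [])

(* The expression written with the operators ... *)
let with_dsl =
  let vector = Constructor (row_vector Double, [Literal "3"]) in
  let values = [Literal "1"; Var "a"; Literal "3"] in
  (vector << values).@!("finished")

(* ... and the same value spelled out constructor by constructor. *)
let by_hand =
  MethodCall
    ( Parens
        (StreamInsertion
           ( Constructor (Matrix (Double, 1, -1), [Literal "3"]),
             [Literal "1"; Var "a"; Literal "3"] )),
      "finished",
      [] )

(* Both spellings produce structurally equal values. *)
let () = assert (with_dsl = by_hand)
```

The operators add no new capability; they only rearrange how the constructors are applied so that the OCaml reads in roughly the same order as the C++ it describes.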
5 changes: 4 additions & 1 deletion docs/index.mld
@@ -28,7 +28,10 @@ to a new developer.

- {{!page-exposing_new_functions}
Page on exposing a new function of the Stan Math library}


- {{!page-cpp_ir}
Page on how we implement C++ code generation with a structured type and mini-DSL}

- {{:https://github.com/stan-dev/stanc3/wiki/Software-Engineering-best-practices}
Wiki on generic software best practices}
