Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Refactor] Add a lower-level IR for C++ code generation #1233

Merged
merged 36 commits into from
Dec 20, 2022

Conversation

WardBrian
Copy link
Member

Closes #850, #354.

This PR introduces a relatively simple1 C++ representation for the code generation backend.

Rather than code generating by creating strings directly, we first lower the MIR to this representation and then print it out.

Benefits:

  1. Code generation is now a mostly-pure computation. With the exception of a few things like functor generation and tracking map-rect calls, there are no side effects in the code gen. Each function takes in some chunk of input and returns some portion of the C++ datatype.
  2. Separation of the code-generation logic and the pretty-printing. This means formatting is more consistent, and makes it impossible for some part of the code gen to miss a semicolon where other parts wouldn't. As an added plus, we don't need to pass printing callbacks around to get reasonable formatting of things like method bodies.
  3. Ability to re-use code and write expressively. For example, we can write OCaml code like
    let open Expression_syntax in
    let open Types in
    let vector = Constructor (row_vector Double, [Literal "3"]) in
    let values = [Literal "1"; Var "a"; Literal "3"] in
    (vector << values).@!("finished")
    which gets turned into the C++
    (Eigen::Matrix<double,1,-1>(3) << 1, a, 3).finished()
    I tried not to go over board on making syntax for this, it is more or less limited to doing method calls (the alternative would be (MethodCall (Parens (StreamInsertion (Constructor (row_vector Double, [Literal "3"]), [Literal "1"; Var "a"; Literal "3"]), "finished", [], []), which moves all the least-important information to the front of the code).

There is a lot of potential to add further processing of this generated C++ before it is stringified and output, but that is not part of this PR.

The code generated by this PR is intended to be semantically identical; whitespace, the ordering of some functions, and other superficial changes are common however. I recommend viewing the diffs with whitespace changes hidden. I have eyeballed all of them, but this PR is the first time some of it will actually be compiled to executable.

Submission Checklist

  • Run unit tests
  • Documentation
    • OR, no user-facing changes were made

Release notes

Refactored how C++ is code-generated in the compiler.

Copyright and Licensing

By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the BSD 3-clause license (https://opensource.org/licenses/BSD-3-Clause)

Footnotes

  1. "Relatively simple" in that it 1) does not try to capture all of C++, just the portion of C++ we currently generate and 2) is a basic recursive data type, no two-level-type reasoning is needed at this level

@WardBrian WardBrian added big-exciting-project Large projects that we're excited about but might take a long time and possibly not end up working o cpp-codegen labels Jul 29, 2022
@WardBrian WardBrian requested a review from SteveBronder July 29, 2022 20:16
@WardBrian WardBrian linked an issue Jul 29, 2022 that may be closed by this pull request
Copy link
Contributor

@SteveBronder SteveBronder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only got about halfway through Cpp.ml today, but so far I like what I see! Mostly seeing a few places it would be nice to have more verbose names / names that more closely align with C++ verbage. But the overall scheme seems very nice so far

src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Nov 4, 2022

Codecov Report

Merging #1233 (8aac0dd) into master (2044426) will decrease coverage by 1.28%.
The diff coverage is 88.10%.

❗ Current head 8aac0dd differs from pull request most recent head 55eb5dd. Consider uploading reports for the commit 55eb5dd to get more accurate results

@@            Coverage Diff             @@
##           master    #1233      +/-   ##
==========================================
- Coverage   88.49%   87.20%   -1.29%     
==========================================
  Files          63       64       +1     
  Lines        9262     9687     +425     
==========================================
+ Hits         8196     8448     +252     
- Misses       1066     1239     +173     
Impacted Files Coverage Δ
src/stan_math_backend/Transform_Mir.ml 96.61% <ø> (+1.29%) ⬆️
src/stan_math_backend/Cpp.ml 66.48% <66.48%> (ø)
src/stan_math_backend/Lower_expr.ml 90.00% <90.00%> (ø)
src/stan_math_backend/Lower_stmt.ml 95.31% <95.31%> (ø)
src/stan_math_backend/Lower_program.ml 98.69% <98.69%> (ø)
src/stan_math_backend/Lower_functions.ml 99.49% <99.49%> (ø)
src/stan_math_backend/Cpp_Json.ml 100.00% <100.00%> (ø)
src/stan_math_backend/Locations.ml 100.00% <100.00%> (ø)
src/stanc/stanc.ml 84.16% <100.00%> (ø)
src/middle/SizedType.ml 65.67% <0.00%> (-13.61%) ⬇️
... and 16 more

@WardBrian
Copy link
Member Author

@nhuurre - I talked to @SteveBronder and he doesn't have the bandwidth to review this soon. Do you feel comfortable reviewing a PR in this area, and would you mind?

Copy link
Collaborator

@nhuurre nhuurre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You say that this closes #354 but that issue is about MIR dump format, something this PR does not touch at all.

With the exception of a few things like functor generation and tracking map-rect calls, there are no side effects in the code gen.

I didn't see functor generation side effects; it's forward declarations that have side effects but those are fully contained in Lower_functions.ml so I wouldn't worry too much.

More annoying is the map_rect tracking and registration.
Do you think it would be reasonable to separate map_rect id assignment from code lowering?
For example, Locations.prepare_prog walks the whole program and collects and relabels all the location metadata. Maybe it could also transform map_rect calls

map_rect(fn, args...) -> map_rect(id, fn, args...)

and return a list of the ids.

README.md Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Lower_expr.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Lower_expr.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Lower_stmt.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Lower_stmt.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Locations.ml Outdated Show resolved Hide resolved
@WardBrian
Copy link
Member Author

You say that this closes #354 but that issue is about MIR dump format, something this PR does not touch at all.

My bad, I thought that was about the (nearly identical) issue we currently have in the C++ code gen for printing a lot of empty cuts.

map_rect

Giving that suggestion a try now

@WardBrian
Copy link
Member Author

@nhuurre I quite like the result of moving map_rect into a pre-processing phase. The end result is that we shouldn't have any global state left in the code gen, it should all be within the scope of one function/module

@WardBrian WardBrian requested a review from nhuurre December 15, 2022 15:55
src/stan_math_backend/Lower_expr.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Lower_expr.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Lower_program.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Numbering.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Numbering.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Numbering.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
@WardBrian WardBrian requested a review from nhuurre December 16, 2022 16:10
Copy link
Collaborator

@nhuurre nhuurre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of nitpicky comments but those are optional so I'll approve now.

src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Cpp.ml Outdated Show resolved Hide resolved
src/stan_math_backend/Lower_functions.ml Outdated Show resolved Hide resolved
@WardBrian
Copy link
Member Author

Thanks for the thorough review @nhuurre!

@SteveBronder - I believe merging will still be blocked so long as you have 'requested changes'. Do you mind clearing that?

Copy link
Contributor

@SteveBronder SteveBronder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@WardBrian WardBrian merged commit c983e15 into master Dec 20, 2022
@WardBrian WardBrian deleted the refactor/cpp-ir branch April 11, 2023 14:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
big-exciting-project Large projects that we're excited about but might take a long time and possibly not end up working o cpp-codegen
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Collate pretty printers for C++ and structure the code gen files
3 participants