Improve ANTLR Parser #2343

jackkoenig · 2021-08-31T21:48:18Z

2 Commits for 2 changes:

Factor references into their own parse rules and remove left-recursion
Add parser Listener that turns CST into AST on the fly and nulls out CST to save memory

Performance Enhancement

Time required to parse and emit CHIRRTL

| design | master (.fir) | this PR (.fir) | .pb on both |
| --- | --- | --- | --- |
| small | 5.3 s | 5.3 s | 2.5 s |
| medium | 35 s | 33 s | 12.7 s |
| large | 170 s | 124 s | 41 s |

Not a huge gain in performance, but it ranges from a wash to a strict improvement. Still much slower than .pb parsing, but this could probably be helped by more (or any) parallelism.
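For a quick sense of scale, the ratios implied by the timing table can be worked out as below. This is only arithmetic on the numbers reported above; the dictionary layout and the `speedup` helper are illustrative names, not part of FIRRTL.

```python
# Parse-and-emit times in seconds, copied from the timing table above.
timings = {
    "small": {"master": 5.3, "pr": 5.3, "pb": 2.5},
    "medium": {"master": 35.0, "pr": 33.0, "pb": 12.7},
    "large": {"master": 170.0, "pr": 124.0, "pb": 41.0},
}


def speedup(before: float, after: float) -> float:
    """Return how many times faster `after` is compared with `before`."""
    return before / after


for design, t in timings.items():
    vs_master = speedup(t["master"], t["pr"])
    vs_pb = t["pr"] / t["pb"]  # how much slower .fir parsing still is than .pb
    print(f"{design}: {vs_master:.2f}x vs master, {vs_pb:.1f}x slower than .pb")
```

On the large design this works out to roughly 1.37x faster than master, while still about 3x slower than the .pb path.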

Memory Use Enhancement

Required heap to parse and emit CHIRRTL

| design | master (.fir) | this PR (.fir) | .pb on both |
| --- | --- | --- | --- |
| small | 800 M | 500 M | 300 M |
| medium | 9 G | 4.6 G | 3.4 G |
| large | 42 G | 13.5 G | 12 G |

Still not quite as good as .pb, but a massive improvement. In profiling, the main objects created during parsing are refs and subrefs, so perhaps we could tweak the parsing rules again to not fully parse them, and instead parse them as part of a 2nd phase or during the CST => AST step?
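One possible reading of the "2nd phase" idea is to keep a reference as a single raw string during the first pass and only break it into its parts when something asks for the structure. The sketch below is a toy model of that idea in Python, not the FIRRTL implementation; `LazyRef` and `path` are hypothetical names, and splitting on `.` stands in for real subreference parsing.

```python
from functools import cached_property


class LazyRef:
    """Hypothetical holder for an unparsed reference such as 'io.out.bits'."""

    def __init__(self, raw: str):
        # First phase: store one string instead of a chain of nested nodes.
        self.raw = raw

    @cached_property
    def path(self) -> tuple:
        # Second phase: split only on first access; the result is then cached.
        return tuple(self.raw.split("."))


ref = LazyRef("io.out.bits")
print("path" in vars(ref))  # nothing has been parsed yet
print(ref.path)
print("path" in vars(ref))  # the parsed form is now cached on the instance
```

The trade-off is that malformed references would only be reported when the second phase runs, which may or may not be acceptable for a parser.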

Contributor Checklist

[None] Did you add Scaladoc to every public function/method?
[NA] Did you update the FIRRTL spec to include every new feature/behavior?
Did you add at least one test demonstrating the PR?
Did you delete any extraneous printlns/debugging code?
Did you specify the type of improvement?
Did you state the API impact?
Did you specify the code generation impact?
Did you request a desired merge strategy?
Did you add text to be included in the Release Notes for this change?

Type of Improvement

performance improvement
new feature/API

API Impact

Technically the generated Listener code (not the one I added, but the stuff generated by ANTLR) is public. I don't think it should be considered part of the FIRRTL public API but it's possible people would use it directly. We should probably move these files to package firrtl.internal.antlr or something to stop users from using them.

Backend Code Generation Impact

No impact

Desired Merge Strategy

Rebase

Release Notes

Improve performance and vastly reduce memory usage of .fir parser

Reviewer Checklist (only modified by reviewer)

Did you add the appropriate labels?
Did you mark the proper milestone (1.2.x, 1.3.0, 1.4.0) ?
Did you review?
Did you check whether all relevant Contributor checkboxes have been checked?
Did you mark as Please Merge?

seldridge

I'm good with anything here as long as it passes regressions. Thanks for hacking on parsing time, Jack!

jackkoenig · 2021-12-01T19:27:13Z

The failures are just build issues in the formal equivalence flow, not real failures. I'm figuring out a fix.

The ANTLR-generated concrete syntax tree (CST) takes up much more memory than the parsed .fir file. By using a Listener, we can construct the FIRRTL AST live with CST construction and null out the CST as we consume pieces of it. Not only does this improve performance, it drastically reduces max memory use for the parser.
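The Listener approach described above can be illustrated with a small toy model. This is a sketch under stated assumptions, not the PR's code and not the ANTLR API: `CstNode`, `AstBuildingListener`, `exit_rule`, and `walk` are all hypothetical names, and the post-order `walk` function merely stands in for the parser firing an exit callback as each rule completes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CstNode:
    """Toy concrete-syntax-tree node: rule name, optional token text, children."""
    rule: str
    text: Optional[str] = None
    children: Optional[List["CstNode"]] = field(default_factory=list)


@dataclass
class Ref:
    name: str


@dataclass
class SubField:
    expr: object
    field_name: str


class AstBuildingListener:
    """Builds AST nodes as each rule is exited, then drops that rule's CST children."""

    def __init__(self):
        self.stack = []  # AST values for rules that have already been exited

    def exit_rule(self, node: CstNode) -> None:
        if node.rule == "ref":
            self.stack.append(Ref(node.text))
        elif node.rule == "subfield":
            # Children are [inner expression, field id]; the inner AST is on the stack.
            inner = self.stack.pop()
            self.stack.append(SubField(inner, node.children[1].text))
        # The subtree has been consumed, so release it for garbage collection.
        node.children = None


def walk(node: CstNode, listener: AstBuildingListener) -> None:
    """Stand-in for the parser: visit children first, then fire the exit callback."""
    for child in node.children or []:
        walk(child, listener)
    listener.exit_rule(node)


# CST for the expression `a.b.c`
cst = CstNode("subfield", children=[
    CstNode("subfield", children=[CstNode("ref", "a"), CstNode("id", "b")]),
    CstNode("id", "c"),
])
listener = AstBuildingListener()
walk(cst, listener)
print(listener.stack[0])
print(cst.children)  # the CST children were dropped once consumed
```

Because each subtree is released as soon as its AST node exists, the full CST never has to be held in memory alongside the full AST, which is the effect the memory table reflects.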

jackkoenig force-pushed the improve-parser branch from 24646c6 to 83ed918 Compare August 31, 2021 23:16

jackkoenig force-pushed the improve-parser branch 2 times, most recently from e9c9393 to 28d687a Compare December 1, 2021 18:55

seldridge approved these changes Dec 1, 2021

jackkoenig added 4 commits December 1, 2021 11:40

Make formal_equiv.sh robust to changes to ANTLR & Protobuf

5ae1d25

Ignore firrtl.antlr package in API doc generation

5a85e21

The classes should not really be part of the firrtl public API to begin with and they cause issues during ScalaDoc generation.

Handle references better in ANTLR Parser

64a0ca2

Tweak the grammar to handle references without left-recursion. Also split references and subreferences out from the regular expression rule to make their parsing more efficient.
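To illustrate what handling references without left-recursion can look like, the sketch below parses a reference as an identifier followed by a loop over suffixes, instead of a rule that begins by recursing into itself. The grammar shape in the comment and all names (`tokenize`, `parse_reference`, the tuple-based AST) are assumptions for illustration; the actual rules in the FIRRTL grammar may differ.

```python
import re
from typing import List, Tuple

# Left-recursive shape (hypothetical):   exp : exp '.' ID | exp '[' INT ']' | ID
# Iterative shape sketched here:         reference : ID ( '.' ID | '[' INT ']' )*
TOKEN = re.compile(
    r"\s*(?:(?P<id>[A-Za-z_][A-Za-z0-9_$]*)|(?P<int>\d+)|(?P<punct>[.\[\]]))"
)


def tokenize(src: str) -> List[Tuple[str, str]]:
    """Split the source into (kind, text) pairs."""
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at {pos}: {src[pos]!r}")
        kind = m.lastgroup
        tokens.append((kind, m.group(kind)))
        pos = m.end()
    return tokens


def parse_reference(src: str):
    """Parse ID followed by any number of '.ID' or '[INT]' suffixes."""
    tokens = tokenize(src)
    if not tokens or tokens[0][0] != "id":
        raise SyntaxError("reference must start with an identifier")
    ast = ("ref", tokens[0][1])
    i = 1
    while i < len(tokens):
        _, text = tokens[i]
        if text == "." and i + 1 < len(tokens) and tokens[i + 1][0] == "id":
            ast = ("subfield", ast, tokens[i + 1][1])
            i += 2
        elif (text == "[" and i + 2 < len(tokens)
              and tokens[i + 1][0] == "int" and tokens[i + 2][1] == "]"):
            ast = ("subindex", ast, int(tokens[i + 1][1]))
            i += 3
        else:
            raise SyntaxError(f"unexpected token {text!r}")
    return ast


print(parse_reference("io.in[3].valid"))
```

The loop builds the same left-nested tree a left-recursive rule would produce, but the parser only ever moves forward through the suffix list.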

jackkoenig force-pushed the improve-parser branch from 28d687a to 17250fb Compare December 1, 2021 19:41

jackkoenig merged commit b14ed79 into master Dec 1, 2021

jackkoenig deleted the improve-parser branch December 1, 2021 20:04

jackkoenig added this to the 1.5.0 milestone Dec 1, 2021

jackkoenig mentioned this pull request May 18, 2022

Add '--dump-protobuf' and DumpProtoBufAnnotation chipsalliance/rocket-chip#2845

Closed

jackkoenig mentioned this pull request Sep 12, 2022

Make the Parser handle errors more gracefully #2549

Merged
