Improve performance by parsing function contents lazily #178

philipc · 2020-06-28T03:16:37Z

This is a significant rebase of the great work done by @mstange in #176 . Closes #176 .

The description from that PR still applies:

There are now two levels of function parsing laziness.

The first level creates only the function ranges for "outer functions" (DW_TAG_subprogram) and writes down each outer function's dw_die_offset. By necessity, when parsing the list, we need to iterate over all abbrevs and attributes of the unit, including the ones for any nested tags for inlined functions, but we try to keep the work we do for them to a minimum.

The second level is done once we look up an address that falls into an outer function's address range. At that point, we parse the entries for that outer function again, starting at the recorded dw_die_offset. We parse the function name and all functions that are inlined into it, with their names and other data.

This two-level laziness automatically makes it so that we don't parse the names of DW_TAG_subprogram instances that don't have address ranges.

Reduced allocations for inline functions: For nested inline functions, we no longer have a Vec of child inlines on each inline function. Instead, we now have one Vec with all descendant inline_address_ranges on the OuterFunction.

We do a binary search when looking up the inline functions for an address within an outer function.

Differences from that PR:

cleaner commit history
new benchmarks
use LazyCell instead of Rc/RefCell, since this makes the code easier to understand
less factoring out of attribute parsing, since this slightly affects performance
cosmetic changes to minimize differences to master

This PR was written by completely rewriting the commits instead of doing a normal rebase. Hopefully I haven't left out anything significant.

@mstange Could you please review this, and check against the benchmarks you were using? And thanks again for your work on this.

bjorn3 · 2020-06-28T06:28:43Z

You can use Co-authored-by: https://help.github.com/en/github/committing-changes-to-your-project/creating-a-commit-with-multiple-authors

We previously had no clean way of indicating that there were no frames, and this would have become harder to support for future changes. Co-authored-by: Markus Stange <[email protected]>

mstange · 2020-07-02T05:27:10Z

I'm reviewing this now.

mstange

Looks great! Just one or two comments that need adjusting for the different variable names.

Thanks!

src/lib.rs

mstange · 2020-07-02T06:20:25Z

src/lib.rs

            .as_ref()
-            .map_err(Error::clone)
+            .map_err(Error::clone)?


Oh good, this is better than what I did because you store the error, rather than re-parsing errored functions on every call.

mstange · 2020-07-02T06:34:07Z

src/lib.rs

+                    gimli::DW_TAG_subprogram => {
+                        Function::skip(entries, abbrev, next_depth)?;
+                    }


Interesting. Yeah, I think you're handling this more correctly than I was. I think my PR was siphoning inlines from nested subprograms into the outer subprogram.

And use only a single list of inlined function addresses per function. This reduces the overall number of allocations, and will allow lazy parsing of inlined functions in future. Since we now need to search through a larger list, we must use a binary search to retain query performance. Co-authored-by: Markus Stange <[email protected]>

Co-authored-by: Markus Stange <[email protected]>

Currently the same as function parsing.

Initially only parse the function address ranges. Delay parsing other details (name and inlined functions) until a probe matches. name before ns/iter after ns/iter diff ns/iter diff % speedup context_new_and_query_location_rc 3,289,458 2,797,820 -491,638 -14.95% x 1.18 context_new_and_query_location_slice 1,387,033 1,092,108 -294,925 -21.26% x 1.27 context_new_and_query_with_functions_rc 3,247,986 2,864,962 -383,024 -11.79% x 1.13 context_new_and_query_with_functions_slice 1,423,648 1,085,133 -338,515 -23.78% x 1.31 context_new_parse_functions_rc 53,668,984 23,596,242 -30,072,742 -56.03% x 2.27 context_new_parse_functions_slice 40,013,306 18,342,944 -21,670,362 -54.16% x 2.18 context_new_parse_inlined_functions_rc 54,060,518 53,382,830 -677,688 -1.25% x 1.01 context_new_parse_inlined_functions_slice 40,016,318 40,217,181 200,863 0.50% x 1.00 context_new_parse_lines_rc 12,129,688 11,873,223 -256,465 -2.11% x 1.02 context_new_parse_lines_slice 7,494,676 7,534,801 40,125 0.54% x 0.99 context_new_rc 2,304,944 2,365,795 60,851 2.64% x 0.97 context_new_slice 696,658 681,423 -15,235 -2.19% x 1.02 context_query_location_rc 1,034,087 1,004,797 -29,290 -2.83% x 1.03 context_query_location_slice 997,554 986,323 -11,231 -1.13% x 1.01 context_query_with_functions_rc 3,167,282 3,233,303 66,021 2.08% x 0.98 context_query_with_functions_slice 2,865,103 2,839,624 -25,479 -0.89% x 1.01 Co-authored-by: Markus Stange <[email protected]>

Cleanup empty FrameIter handling

fe51674

We previously had no clean way of indicating that there were no frames, and this would have become harder to support for future changes. Co-authored-by: Markus Stange <[email protected]>

philipc force-pushed the pr-176-rework branch 2 times, most recently from d50cb93 to 8789de0 Compare June 30, 2020 06:47

philipc mentioned this pull request Jul 2, 2020

Many addresses are missing file + line information #174

Closed

mstange approved these changes Jul 2, 2020

View reviewed changes

philipc and others added 4 commits July 2, 2020 18:04

Factor out range attribute handling

609e941

Co-authored-by: Markus Stange <[email protected]>

Add benchmarks for inlined function parsing

135b691

Currently the same as function parsing.

philipc force-pushed the pr-176-rework branch from 8789de0 to 4464464 Compare July 2, 2020 08:05

philipc merged commit 27f1ebe into gimli-rs:master Jul 2, 2020

philipc deleted the pr-176-rework branch July 2, 2020 08:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve performance by parsing function contents lazily #178

Improve performance by parsing function contents lazily #178

philipc commented Jun 28, 2020

bjorn3 commented Jun 28, 2020

mstange commented Jul 2, 2020

mstange left a comment

mstange Jul 2, 2020

mstange Jul 2, 2020

Improve performance by parsing function contents lazily #178

Improve performance by parsing function contents lazily #178

Conversation

philipc commented Jun 28, 2020

bjorn3 commented Jun 28, 2020

mstange commented Jul 2, 2020

mstange left a comment

Choose a reason for hiding this comment

mstange Jul 2, 2020

Choose a reason for hiding this comment

mstange Jul 2, 2020

Choose a reason for hiding this comment