Add un-inline support using addr2line #14
Codecov Report
@@ Coverage Diff @@
## master #14 +/- ##
=========================================
+ Coverage 40.22% 44.6% +4.37%
=========================================
Files 7 7
Lines 532 630 +98
=========================================
+ Hits 214 281 +67
- Misses 318 349 +31
This is a cool change, and pretty much how I expected it to turn out. Left a bunch of inline comments. Should also be far faster than As for testing, I sadly don't have a good answer for that either. One way would be for us to include a binary in the repo that is a compiled C program that we also have profiles for. We'd just want to make sure the compiler does inline stuff in it, but then it should make for a decent, and consistent, test case. What do you think? |
Also, seeing all that inlining code really makes me wish that |
I'm not sure what you mean? From what I can see the example does only three notable things: mmap, string parsing and string printing. The mmap is 2 lines of code plus a crate, while the other two exist basically only to closely mimic binutils addr2line. It's unclear to me how this would help here. |
@main-- oh, sorry, I should have been a bit more specific. The
To me at least, that's very similar to the flow of |
Oh, @jasonrhansen, just making sure you saw my comment about testing. |
Also, to be clear, it's fine if you make multiple commits, and then we just squash them all at the end. Instead of force-pushing all the time :) |
@jonhoo, I did see your comment about testing and I agree with the general approach. Are you suggesting we write our own C program to use as a test case? Do you have any ideas of something fairly simple we could use? Also, sorry about the force-pushing. I'm about to push a large commit with all my changes. I probably should have broken it up into several commits. |
Yeah, I was thinking we'd write our own simple C program. Or a Rust program would probably also be totally fine. Something with no external dependencies like:
#[inline(always)]
fn count_to(mut n: usize) {
    while n != 0 {
        n -= 1;
    }
}
#[inline(always)]
fn double_inline() {
    count_to(1_000_000);
}
#[inline(never)]
fn not_inlined() {
    count_to(1_000_000);
}
fn main() {
    count_to(1_000_000);
    double_inline();
    not_inlined();
}
We'd do a |
As for force-pushing, don't worry about it. Going forward though, individual commits are easier to review, and then we can just squash it all at the end if we want! |
@jonhoo Oh, I see. What you're suggesting makes sense, but the reason I originally structured the API this way is to avoid pulling in heavy external dependencies that are not strictly needed. For instance Bringing a file into memory and then calling The frame iteration is basically a bog-standard for loop - except it has to be a The demangling is only implemented this way in the addr2line example because you can turn it off. In your example you may just use The crate already does a lot in terms of abstracting a complex task into simple parts. I agree that it could use some ergonomic improvements but consider that all of these papercuts at least have a reason. For instance the |
We've moved the discussion with @main-- to gimli-rs/addr2line#110 (comment) so we can keep this PR to the point. |
I finally started trying to test this, but I'm not sure how to proceed. I wrote the following simple program in C.
inline void __attribute__((always_inline)) count_to(unsigned int n) {
    for(int i = 0; i < n; i++);
}
inline void __attribute__((always_inline)) double_inline() {
    count_to(1000000000);
}
static void __attribute__ ((noinline)) not_inlined() {
    count_to(1000000000);
}
int main() {
    count_to(1000000000);
    double_inline();
    not_inlined();
}
Then did the following:
The problem I'm having is the --inline option doesn't seem to work with stackcollapse-perf.pl, or our implementation.
So then I decided to try addr2line directly and took the address from this line
Next I ran
Then I realized that addr2line is expecting the offset into the binary, so I ran
Much better! But now I don't know how to use addr2line with the addresses that are in the perf output since those aren't relative to the binary. Any ideas? |
So, the issue here is basically this one: ASLR + position-independent executables are the default, and that gives a random offset to all executables. I thought there was a way to work around this, but can't remember it off the top of my head. For testing, we could just compile with |
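To make the ASLR point concrete, here is a minimal sketch of the address arithmetic involved. It assumes you already know the base address at which the binary was mapped (for example from the process's memory map); `load_base` and the function name are hypothetical, and nothing here is taken from this PR's code. It also assumes a position-independent executable whose link-time addresses start at zero, so that the result is the address `addr2line` expects.

```rust
/// Translate a runtime address from a perf sample into an address
/// relative to the start of the mapped binary.
///
/// `load_base` is an assumed input: the address at which the binary was
/// mapped for this particular run. With ASLR it changes on every run.
///
/// Returns `None` if the sample lies below the mapping, which means the
/// address does not belong to this binary at all.
fn to_binary_relative(runtime_addr: u64, load_base: u64) -> Option<u64> {
    runtime_addr.checked_sub(load_base)
}
```

For a binary compiled without position independence, the load base is fixed and the addresses in the perf output can be used directly, which is why disabling it is convenient for a test fixture.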
FWIW, I filed #31 |
Thanks for the suggestion to compile with |
\o/ I think those two are the last two questions! |
\o/ Thank you so much @jasonrhansen! |
This patch adds full support for the `--inline` and `--context` flags of `stackcollapse-perf` by relying on the `addr2line` crate to resolve symbol addresses instead of the names reported by perf, which _can_ lead to more informative frame names.

Note that, on modern systems, address-space randomization means that we often do not have the "true" address of functions in the trace from `perf script`. This issue also plagues `stackcollapse-perf`. There are some possible workarounds (jonhoo#31), but nothing definite as of yet. For the time being we're aiming for parity.

This patch also adds another set of tests that are _not_ from the upstream FlameGraph repository. Specifically, it compares the output of `stackcollapse-perf` with `--inline` and `--context` with ours on a simple binary that we've compiled and profiled without randomization.

This patch _also_ adopts the `log` crate, and emits appropriate log messages whenever the input is malformed in some way. We explicitly do not return with an error, since we want to be liberal in what we accept (just like `stackcollapse-perf`).
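The "warn but don't fail" policy for malformed input could look roughly like the sketch below. This is an illustration only, not the code from this patch: the `Frame` type and both function names are hypothetical, the assumed line shape is `<hex address> <symbol> (<module path>)`, and `eprintln!` stands in for the `log` crate so the sketch has no dependencies.

```rust
/// One stack-frame line, assumed to look like
/// `        7f1e2 main+0x10 (/tmp/a.out)`.
#[derive(Debug, PartialEq)]
struct Frame<'a> {
    addr: u64,
    symbol: &'a str,
    module: &'a str,
}

/// Parse a single stack-frame line, returning `None` if it is malformed.
fn parse_frame_line(line: &str) -> Option<Frame<'_>> {
    let line = line.trim();
    // The address is the first whitespace-delimited token, in hex.
    let (addr, rest) = line.split_once(char::is_whitespace)?;
    let addr = u64::from_str_radix(addr, 16).ok()?;
    // The module is the trailing parenthesised path.
    let rest = rest.trim();
    let open = rest.rfind('(')?;
    let module = rest[open + 1..].strip_suffix(')')?;
    let symbol = rest[..open].trim();
    Some(Frame { addr, symbol, module })
}

/// Liberal wrapper: report a malformed line and skip it instead of
/// aborting the whole run.
fn parse_or_warn(line: &str) -> Option<Frame<'_>> {
    let frame = parse_frame_line(line);
    if frame.is_none() {
        // Stand-in for a `log` warning.
        eprintln!("warning: skipping malformed stack line: {:?}", line);
    }
    frame
}
```

Callers would simply drop the `None` results and keep folding the remaining frames, so a single bad line costs one frame rather than the whole profile.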
This commit adds support for #5, and adds command line flags --inline and --context.
The problem is finding a good way to test it. The Perl version has no tests for it. It has a comment that says "these are tricky since they use addr2line, whose output will vary based on the test system's binaries and symbol tables." I agree that this makes it difficult. Any ideas on how to approach this would be helpful.
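For readers unfamiliar with what `--inline` changes in the output, the un-inlining step can be sketched as follows. The sketch assumes the resolver returns, for one address, a list of (function, source location) pairs with the innermost inlined function first, which is the order `addr2line -i -f` prints them in. The function name, the input shape, and the `function:location` format used for `--context` are assumptions for illustration, not this PR's implementation.

```rust
/// Expand one sampled address into the folded-stack fragment that
/// replaces its single frame.
///
/// `frames` lists the (function, location) pairs for the address,
/// innermost inlined function first. Folded stacks are written
/// caller -> callee, so the list is reversed before joining.
fn expand_inlined(frames: &[(&str, &str)], with_context: bool) -> String {
    frames
        .iter()
        .rev()
        .map(|(func, loc)| {
            if with_context {
                // Assumed format: append the source location to the name.
                format!("{}:{}", func, loc)
            } else {
                func.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join(";")
}
```

With the test program from this thread, a sample inside the loop reached through `double_inline` would go from a single `main` frame to `main;double_inline;count_to`. A fixture compiled without address randomization makes that result stable enough to check.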