FAQ
magic-trace reconstructs stack frames from a history of your program's control flow. That job is straightforward if your program creates and closes stack frames with call and ret instructions. The actual constraint here is something along the lines of: all transfers of control flow between functions must be limited to calls, rets, and tail-position jmps. There's an impedance mismatch between that rule and constructs like C's longjmp or C++ exceptions.
We've done some work to make some popular exotic situations behave well, like most exceptions in OCaml. But, in general, magic-trace has limited support for custom control flow. If you do something fancy in a trace, your stack frames will look a little wonky. There's no general answer for this; we'll need to add explicit code to magic-trace for each language's and each runtime's custom control flow. We appreciate any incremental progress towards that goal from the community.
For what it's worth, the bottom-most stack frame does not have this problem. That's always trustworthy regardless of the control flow you used to get there.
You might be thinking: "perf doesn't have this problem, why does magic-trace?". The reason is that perf samples your program at regular intervals and walks the stack. magic-trace doesn't have that luxury. It doesn't actually walk the stack; it merely gets a copy of your program's control flow and reconstructs stack frames from that. perf in LBR mode has the same caveats.
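To make the constraint above concrete, here is a minimal Python sketch of rebuilding a stack from control-flow events. It is not magic-trace's actual implementation (which is written in OCaml and handles far more cases); the event names, the `StackReconstructor` class, and the `replay` helper are all hypothetical, chosen only to illustrate why a longjmp-style transfer confuses this kind of reconstruction.

```python
from dataclasses import dataclass, field


@dataclass
class StackReconstructor:
    """Rebuilds a call stack from a stream of control-flow events (illustrative only)."""

    stack: list = field(default_factory=list)

    def on_event(self, kind: str, target: str) -> None:
        if kind == "call":
            # A call opens a new frame for the callee.
            self.stack.append(target)
        elif kind == "ret":
            # A return closes the innermost frame, if we know of one.
            if self.stack:
                self.stack.pop()
        elif kind == "jmp":
            # This naive model treats every jump as a tail-position jump,
            # which replaces the current frame with the jump target.
            if self.stack:
                self.stack[-1] = target
            else:
                self.stack.append(target)
        else:
            raise ValueError(f"unknown event kind: {kind}")


def replay(events):
    """Replays (kind, target) pairs and returns the reconstructed stack."""
    reconstructor = StackReconstructor()
    for kind, target in events:
        reconstructor.on_event(kind, target)
    return reconstructor.stack


# Well-behaved control flow: frames open and close in matching pairs.
well_behaved = [("call", "main"), ("call", "parse"), ("ret", ""), ("call", "eval")]

# A longjmp-style transfer: control leaves `deep` and `middle` with a plain
# jump back into `main`, so the model never sees those frames being closed.
with_longjmp = [
    ("call", "main"),
    ("call", "middle"),
    ("call", "deep"),
    ("jmp", "main"),
]

print(replay(well_behaved))  # ['main', 'eval']
print(replay(with_longjmp))  # ['main', 'middle', 'main']
```

In the second replay the real stack would contain only `main`, but the sketch still believes `middle` is open. The innermost frame (the last element) is still correct, because it comes directly from the most recent control-flow event.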
magic-trace was unable to determine the function name. The root cause is usually that your program is missing debug symbols. Some common reasons that can happen are:
- Your program was compiled without debug symbols.
- You're trying to trace into a globally installed library which doesn't have debug symbols. You may need to install a -dbg variant of the library from your package manager.
- There's a bug in magic-trace. File an issue if you think so!
That's probably where you entered the kernel. If you want to peek behind that curtain, enable kernel tracing by passing -trace-include-kernel to magic-trace.
If you've enabled kernel tracing, the next most likely reason is that your process context switched out. You should be able to tell, because the nearest stack frames will mention scheduling-related terms in their names.
There are other, less likely reasons this can happen, too. We don't know all of them, but they tend to be relatively niche, like attempting to trace into a secure enclave.
Intel PT generated an "OVF packet". The Intel Software Developer's Manual says:
It's a little vague, but based on our reading of this, it happens when the application + Intel PT use more memory bandwidth than what's available. When that happens, the application takes priority and Intel PT drops packets. People have also noticed drops around C-state transitions, but I haven't been able to find any documentation from Intel to corroborate that.
When magic-trace sees an overflow, it clears all stack frames and the generated trace may look discontinuous around that point. Please file an issue if this looks severely broken. We do try hard to behave right in this scenario and it's something we've screwed up before.
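As a rough illustration of why the trace looks discontinuous, the Python sketch below drops every known frame when it sees an overflow marker. The event encoding and the `rebuild_with_overflow` function are assumptions made for this example, not magic-trace's real data structures.

```python
def rebuild_with_overflow(events):
    """Returns one stack snapshot per event (illustrative only).

    `events` is a list of (kind, target) pairs, where kind is "call",
    "ret", or "overflow". The "overflow" kind is a hypothetical stand-in
    for the point where Intel PT reported dropped packets.
    """
    stack = []
    snapshots = []
    for kind, target in events:
        if kind == "call":
            stack.append(target)
        elif kind == "ret":
            if stack:
                stack.pop()
        elif kind == "overflow":
            # Packets were lost, so any calls or returns in the gap are
            # unknown. The safe choice is to forget every frame.
            stack.clear()
        snapshots.append(list(stack))
    return snapshots


events = [
    ("call", "main"),
    ("call", "work"),
    ("overflow", ""),
    ("call", "helper"),
    ("ret", ""),
]

for snapshot in rebuild_with_overflow(events):
    print(snapshot)
```

After the overflow, `helper` appears as a top-level frame even though `main` and `work` may still be running, which is the kind of discontinuity described above.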
They're events that took "zero" time. Of course, nothing takes zero time; this is an artifact of how Intel PT works. Intel PT only provides timing updates every few (5-ish) events. So if you trace a function call and return before any timing updates are sent, it looks to magic-trace like the event took no time at all.
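The effect can be sketched in a few lines of Python. This is a simplified model with made-up event tuples and timestamps (arbitrary units), not the actual Intel PT packet format: events without a timing update inherit the most recent known time, so a short call can start and end on the same timestamp.

```python
def span_durations(events):
    """Computes (function, duration) pairs from call/ret events (illustrative only).

    `events` is a list of (kind, name, timestamp) tuples. A timestamp of
    None models an event that arrived without a timing update.
    """
    last_time = 0  # Assumed starting time if the first event carries no timestamp.
    open_calls = []  # Stack of (function name, start time).
    spans = []
    for kind, name, timestamp in events:
        if timestamp is not None:
            last_time = timestamp
        if kind == "call":
            open_calls.append((name, last_time))
        elif kind == "ret" and open_calls:
            callee, start = open_calls.pop()
            spans.append((callee, last_time - start))
    return spans


events = [
    ("call", "outer", 100),
    ("call", "tiny", None),  # No timing update between this call...
    ("ret", "tiny", None),   # ...and its return.
    ("ret", "outer", 140),
]

print(span_durations(events))  # [('tiny', 0), ('outer', 40)]
```

Here `tiny` certainly took some time to run, but no timing update arrived while it was on the stack, so its computed duration is zero.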
We send a signal to a perf process, scheduling, etc. We could filter it out, but figured that most users would rather have the extra data than not have it. There's a search bar at the top of Perfetto if you're having trouble finding your stop symbol.
The time is sampled in two different locations. The stack timeline view is reconstructed from timestamps produced by Intel PT, while the snapshot markers come from the timestamp associated with the breakpoint hit event magic-trace receives from perf. In our testing, the skid between these two appears to be ~3-4us.
Maybe that function was inlined? Tell your compiler not to inline it. If that still doesn't work, try to create a minimal reproducible example of the problem and file an issue.
Two reasons:
- We want to work with people that care about software and its performance. If you're reading this, you should at least consider applying to work at Jane Street. We are producers and consumers of tools like this, and we'd love for you to join us.
- We need help from the community to really make magic-trace shine. Intel PT is a relatively niche feature; it suffers from rough edges, lack of documentation, and general disuse by the software development community. We hope that if more people see how useful it is, more people will work to improve it.