Type resolution from DWARF #2034

Merged: 5 commits, Apr 13, 2022

Contributor

@viktormalik viktormalik commented Oct 12, 2021

When tracing userspace programs, this extension automatically resolves types from DWARF (if it is available). This has two main advantages:

  • The bpftrace program doesn't need to define the types (or include the headers) used in the traced binary
  • uprobe arguments of struct type can be accessed by name

Version 2:
Structure types are parsed from DWARF in FieldAnalyser, which fills the structs in StructManager directly. ClangParser may override these types if the user defined custom types or included headers.

Version 1 (abandoned):
Since we rely on the Clang parser to do the actual type resolution, we need to dump types from DWARF into compilable C. Although libraries that can do this probably exist (e.g. libdwarves), I decided to implement it by hand to have more control over the supported types and to avoid yet another dependency.
We now have two sources of type information: BTF and DWARF. To avoid potential collisions, for now we use DWARF for userspace tracing and BTF for kernel tracing.
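
For illustration only, the "dump DWARF types into compilable C" step of version 1 could look roughly like the Python sketch below. This is not the code from this PR (bpftrace is written in C++); `Field` and `dump_struct_as_c` are hypothetical names, and the sketch assumes member names and C type names have already been read from DWARF.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Field:
    """One struct member, as it might be read from a DWARF member entry (hypothetical model)."""
    name: str
    c_type: str                      # element type rendered as C, e.g. "int" or "char *"
    array_len: Optional[int] = None  # set only for fixed-size arrays


def dump_struct_as_c(struct_name, fields):
    """Render a struct definition as C source text that a C parser could consume."""
    lines = [f"struct {struct_name} {{"]
    for f in fields:
        suffix = f"[{f.array_len}]" if f.array_len is not None else ""
        lines.append(f"  {f.c_type} {f.name}{suffix};")
    lines.append("};")
    return "\n".join(lines)


if __name__ == "__main__":
    print(dump_struct_as_c("Foo", [Field("a", "int"), Field("b", "char", 16)]))
```

The output would then be handed to the C parser together with any user-provided definitions, which is the extra round trip questioned in the review below.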

I also believe that this resolves #1742.

Checklist
  • Language changes are updated in man/adoc/bpftrace.adoc and if needed in docs/reference_guide.md
  • User-visible and non-trivial changes updated in CHANGELOG.md
  • The new behaviour is covered by tests

@viktormalik viktormalik force-pushed the dwarf-type-resolution branch from 7f29dbf to 088e696 Compare October 12, 2021 09:43
Member

@danobi danobi left a comment

What's the rationale for converting DWARF to C and then parsing C to bpftrace-internal representation? I would have thought that going from DWARF -> bpftrace-internal representation ought to be faster (at runtime) and easier.

FWIW, the reason we go BTF -> C -> bpftrace-internal representation is b/c libbpf already had the C dumping API and it was easier to code from bpftrace's side.

@viktormalik
Contributor Author

> What's the rationale for converting DWARF to C and then parsing C to bpftrace-internal representation? I would have thought that going from DWARF -> bpftrace-internal representation ought to be faster (at runtime) and easier.

The reason is that the final type resolution is done by the Clang parser. So even if we resolved some of the types from DWARF, we would still need to dump them to C b/c their definitions may be needed by the parser (e.g. if the user used some of them in their custom types).

@danobi
Member

danobi commented Oct 20, 2021

> (e.g. if the user used some of them in his custom types).

Hmm, is this a likely scenario? What types could the user be defining that are not already available in DWARF? I'm not necessarily against going to C then to clang parser, but I think it should be a conscious tradeoff. If it's significantly less complex to skip C, then we ought to consider that. If it's similar complexity, then going to C has theoretical advantages. What do you think?

@viktormalik
Copy link
Contributor Author

> > (e.g. if the user used some of them in his custom types).
>
> Hmm, is this a likely scenario? What types could the user be defining that are not already available in DWARF? I'm not necessarily against going to C then to clang parser, but I think it should be a conscious tradeoff. If it's significantly less complex to skip C, then we ought to consider that. If it's similar complexity, then going to C has theoretical advantages. What do you think?

I agree that this seems very unlikely (in fact, I haven't come up with a reasonable example so far). What I'd suggest is:

  • if there are no user-defined types, skip ClangParser and resolve everything from DWARF (and we can eventually do the same for BTF, too)
  • if there are user-defined types or includes, use ClangParser with DWARF->C or BTF->C dumping

Does it make sense?
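
As a rough sketch of the two cases proposed above (a Python model, not bpftrace code): `plan_type_resolution` and its parameters are hypothetical names, and the direct path for BTF is something the comment only describes as a possible later step.

```python
def plan_type_resolution(user_defines_types, user_includes_headers, is_userspace):
    """Return the ordered steps for resolving types, following the proposal above."""
    # Per the PR description: DWARF for userspace tracing, BTF for kernel tracing.
    source = "DWARF" if is_userspace else "BTF"
    if not (user_defines_types or user_includes_headers):
        # Nothing for the C parser to do, so skip it entirely.
        return [f"resolve types directly from {source}"]
    # User code may reference the traced program's types, so they must exist as C.
    return [
        f"dump {source} types to C",
        "run ClangParser on the dumped types plus user definitions",
    ]


if __name__ == "__main__":
    print(plan_type_resolution(False, False, is_userspace=True))
    print(plan_type_resolution(True, False, is_userspace=True))
```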

@danobi
Member

danobi commented Oct 29, 2021

Sure, that sounds fine to me

@viktormalik viktormalik added the do-not-merge Changes are not ready to be merged into master yet label Dec 14, 2021
@viktormalik viktormalik force-pushed the dwarf-type-resolution branch 2 times, most recently from f0f6ce1 to f3e40b9 Compare March 29, 2022 11:43
@viktormalik
Contributor Author

After some time, here is version 2.

The main change is that structure types are now parsed from DWARF in FieldAnalyser, which fills the structs in StructManager directly. ClangParser is not changed by this PR, so it still runs if there is something to parse (i.e. the user provided type definitions or included headers). Since ClangParser runs after FieldAnalyser, the user can override DWARF-parsed types if they wish.
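
The ordering described above can be modelled with a small Python sketch. It is a simplification, not the C++ from this PR: the `allow_override` flag is named in the commit messages of this PR, but the method names `add_from_dwarf` and `add_from_clang`, and the choice to raise an error on a non-overridable redefinition, are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Struct:
    fields: dict = field(default_factory=dict)  # field name -> type name
    allow_override: bool = False


class StructManager:
    """Toy registry of struct definitions, keyed by struct name."""

    def __init__(self):
        self.structs = {}

    def add_from_dwarf(self, name, fields):
        # First pass: DWARF-derived definitions are marked as overridable.
        self.structs[name] = Struct(dict(fields), allow_override=True)

    def add_from_clang(self, name, fields):
        # Second pass: a user-provided definition may replace a DWARF one.
        existing = self.structs.get(name)
        if existing is not None and not existing.allow_override:
            raise ValueError(f"redefinition of struct {name}")  # assumed behaviour
        self.structs[name] = Struct(dict(fields), allow_override=False)


if __name__ == "__main__":
    manager = StructManager()
    manager.add_from_dwarf("Foo", {"a": "int"})
    manager.add_from_clang("Foo", {"a": "long"})  # the user definition wins
    print(manager.structs["Foo"])
```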

@viktormalik viktormalik removed the do-not-merge Changes are not ready to be merged into master yet label Mar 29, 2022
@viktormalik viktormalik requested review from danobi and fbs March 29, 2022 12:02
Member

@danobi danobi left a comment

Not very familiar w/ DWARF but nothing stands out to me. Great work!

@viktormalik viktormalik force-pushed the dwarf-type-resolution branch from f3e40b9 to 604a589 Compare April 12, 2022 06:16
Several refactorings and minor changes to DwarfParser:
- refactor iteration over the child nodes of a DWARF DIE
- pass a pointer to BPFtrace to DwarfParser (will be needed for filling
  struct fields)
- factor out and introduce some useful DWARF parsing methods

Support for parsing array type information from DWARF and creating the
appropriate SizedType.

Struct types (stored in StructManager) are filled directly from
FieldAnalyser. We also allow ClangParser to rewrite a type parsed from
DWARF (this is controlled by the Struct::allow_override field).

For now, bitfields are not supported and are left to ClangParser to resolve.

Introduces both unit and runtime tests.
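
For readers unfamiliar with how DWARF encodes arrays: an array type entry has a subrange child that carries either `DW_AT_count` or `DW_AT_upper_bound`, and in C the lower bound defaults to 0. The Python sketch below models those attributes as a plain dict to show how an element count and byte size can be derived. The helper names are hypothetical, and how DwarfParser actually reads these attributes is not shown in this conversation.

```python
def array_length(subrange_attrs):
    """Element count from the attributes of a DW_TAG_subrange_type, or None if unknown."""
    if "DW_AT_count" in subrange_attrs:
        return subrange_attrs["DW_AT_count"]
    if "DW_AT_upper_bound" in subrange_attrs:
        lower = subrange_attrs.get("DW_AT_lower_bound", 0)  # C arrays start at 0
        return subrange_attrs["DW_AT_upper_bound"] - lower + 1
    return None  # e.g. a flexible array member with no bound recorded


def array_byte_size(elem_size, subrange_attrs):
    """Total size in bytes of the array, or None if the length is unknown."""
    length = array_length(subrange_attrs)
    return None if length is None else length * elem_size


if __name__ == "__main__":
    # `char buf[16]` is typically described with an upper bound of 15.
    print(array_length({"DW_AT_upper_bound": 15}))
    print(array_byte_size(4, {"DW_AT_count": 8}))
```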
@viktormalik viktormalik force-pushed the dwarf-type-resolution branch from 604a589 to 3c1e069 Compare April 12, 2022 06:27
@fbs fbs merged commit cd9c89e into bpftrace:master Apr 13, 2022
@viktormalik viktormalik deleted the dwarf-type-resolution branch May 3, 2022 04:43
@viktormalik viktormalik mentioned this pull request Jan 30, 2025
Successfully merging this pull request may close these issues.

Support argument access by name, show arguments in probe list
3 participants