[DWARF] Crate Debug Info incomplete? #88521
Comments
@ayermolo I'm not familiar with debuginfo generation - you'll probably get a more prompt response if you ask in T-compiler on Zulip: https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler
I'm curious though why this has never come up before? It sounds like a lot of tools depend on this.
Ok, I'll ask there.
Note that modules, crates, and codegen units are all different. "Modules" in Rust only affect privacy; they don't affect codegen (it sounds like you were treating LLVM modules the same as Rust modules above, and they're not the same). "Crates" are "codegen units for the rust compiler": they're all compiled at once, and there are various things you can do within a crate that you can't do outside it (e.g. implement traits for a type). But the rust compiler will often split up crates into multiple LLVM codegen units before passing them to LLVM, so that it can cache them more often: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/monomorphize/collector/index.html
Can you give info on how to reproduce the problem? It sounds like you are using As a wild guess, maybe
That doesn't sound like something that DWARF consumers should ever need to know.
Here's a reproduction using the xtask crate within rust-analyzer (6b77e32 with
The executable has 4 skeleton units pointing to the same .dwo:
Note that none of these contain any definitions with addresses. And Here's one of them:
I'm guessing those 4 units aren't really meant to be identical, so the duplicate DWO id is causing some to be lost. A reason for that is I think the full units are meant to have entries that correspond to the Disabling LTO or using thin LTO works okay in this instance. |
Sorry, I didn't get a chance to get to this before PTO. I asked someone to look at it internally; if that doesn't happen, I'll prioritize it once I get back.
…davidtwco rustc_codegen_llvm: Give each codegen unit a unique DWARF name on all platforms, not just Apple ones. To avoid breaking split DWARF, we need to ensure that each codegen unit has a unique `DW_AT_name`. This is because there's a remote chance that different codegen units for the same module will have entirely identical DWARF entries for the purpose of the DWO ID, which would violate Appendix F ("Split Dwarf Object Files") of the DWARF 5 specification. LLVM uses the algorithm specified in section 7.32 "Type Signature Computation" to compute the DWO ID, which does not include any fields that would distinguish compilation units. So we must embed the codegen unit name into the `DW_AT_name`. Closes rust-lang#88521.
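As a rough illustration of the failure mode described in the commit message above, here is a minimal sketch. Everything in it is hypothetical: `UnitDebugInfo`, `make_unit`, and the `/@/` separator are invented for this example, and `DefaultHasher` is only a stand-in for the signature computation LLVM actually uses. The one property it is meant to show is that a content hash cannot tell two units apart unless something unit-specific, such as the codegen unit name, is part of the hashed content.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, heavily simplified model of one compilation unit's DWARF.
#[derive(Hash)]
struct UnitDebugInfo {
    /// Stand-in for the unit's DW_AT_name.
    name: String,
    /// Stand-in for the unit's debug entries (tag names only).
    entries: Vec<String>,
}

/// Stand-in for the DWO ID: a hash over the unit's contents.
/// ASSUMPTION: the real computation uses a different hash function; only the
/// "identical input gives identical ID" property matters for this sketch.
fn dwo_id(unit: &UnitDebugInfo) -> u64 {
    let mut hasher = DefaultHasher::new();
    unit.hash(&mut hasher);
    hasher.finish()
}

/// Build a unit. When `cgu_name` is given, embed it in the unit name so that
/// otherwise identical units no longer hash to the same value.
/// The "/@/" separator is an arbitrary choice made for this sketch.
fn make_unit(crate_name: &str, cgu_name: Option<&str>, entries: &[&str]) -> UnitDebugInfo {
    let name = match cgu_name {
        Some(cgu) => format!("{}/@/{}", crate_name, cgu),
        None => crate_name.to_string(),
    };
    UnitDebugInfo {
        name,
        entries: entries.iter().map(|e| e.to_string()).collect(),
    }
}
```

Two units built with the same crate name, no codegen unit name, and the same entries produce equal `dwo_id` values in this model. Passing distinct `cgu_name` values makes the hashed content differ, which is the idea behind the fix.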
Hello.
To preface this: I know nothing about Rust or its compiler. I am coming to this issue from the toolchain side (BOLT/DWP), dealing with the debug information that is produced.
High level summary:
I am seeing multiple skeleton CUs that have the exact same DWO ID. For example, with ThinLTO and split DWARF enabled, the main binary contains multiple CUs that point to different .dwo files but have the same DWO ID.
Looking at the .debug_info.dwo section of those files, they are either exactly the same or differ only by DW_AT_linkage_name.
Although LLVM could be changed to include DW_AT_linkage_name in the hashing algorithm, that won't fully solve the issue.
Tools like BOLT (https://github.com/facebookincubator/BOLT) or llvm-dwp rely on the DWO ID being unique.
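To make the uniqueness requirement concrete, here is a small sketch of the kind of check a consumer could run over the skeleton units of a binary. The `SkeletonUnit` struct and `duplicate_dwo_ids` function are hypothetical names made up for this example; it assumes the .dwo name and DWO ID of each skeleton unit have already been extracted by some other means, and it does not parse DWARF itself.

```rust
use std::collections::BTreeMap;

/// Hypothetical summary of one skeleton unit: the .dwo file it points to
/// and the DWO ID recorded for it. Extraction from the binary is assumed
/// to have happened elsewhere.
#[derive(Debug, Clone, PartialEq)]
struct SkeletonUnit {
    dwo_name: String,
    dwo_id: u64,
}

/// Group skeleton units by DWO ID and keep only the IDs that are shared by
/// more than one unit. Each returned entry maps an ID to the .dwo names
/// that claim it, in input order.
fn duplicate_dwo_ids(units: &[SkeletonUnit]) -> BTreeMap<u64, Vec<String>> {
    let mut by_id: BTreeMap<u64, Vec<String>> = BTreeMap::new();
    for unit in units {
        by_id
            .entry(unit.dwo_id)
            .or_default()
            .push(unit.dwo_name.clone());
    }
    // Drop IDs that are used by exactly one unit; those are fine.
    by_id.retain(|_, names| names.len() > 1);
    by_id
}
```

An empty result means every skeleton unit has a distinct DWO ID. Any non-empty entry is a case where a tool that indexes units by DWO ID could silently drop or mix up units.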
From the DWARF 5 spec (where this was all standardized), though I think it also applies to DWARF 4:
Digging a bit into it:
Looking into one of the cases where the only difference was DW_AT_linkage_name.
There are multiple modules (crates, in Rust land?) that are created. They define the same templated functions and import the same functions.
For example looking at LLVM IR of one of the modules and one of the functions:
We have a function:
alloc_in, which is part of the templated raw_vec
https://doc.rust-lang.org/src/alloc/raw_vec.rs.html#188
https://doc.rust-lang.org/src/alloc/raw_vec.rs.html#115
Judging from the LinkageName, is this relevant?
pub mod bar { pub use foo_bar::*; }
In a different module it will have:
and on the Rust side:
pub mod bar2 { pub use foo_bar2::*; }
I haven't looked into the cases where the debug information is identical, beyond noting that they didn't have any functions, just type information {DW_TAG_namespace, DW_TAG_enumeration_type, DW_TAG_enumerator}.
To me it seems that some debug information is missing, at some level, that would indicate which crate this CU belongs to.
Any thoughts?