New Rust attribute to support embedding debugger visualizers #3191

Merged on Apr 11, 2022 (26 commits)

ridwanabdillahi
Contributor

@ridwanabdillahi ridwanabdillahi commented Nov 3, 2021

This RFC adds support for a new Rust attribute that will embed a debugger visualizer into a PDB/ELF.

Internals thread

Rendered

@gilescope

I can totally see people wanting to derive natvis files using (proc) macros.

@ehuss ehuss added T-cargo Relevant to the Cargo team, which will review and decide on the RFC. T-compiler Relevant to the compiler team, which will review and decide on the RFC. labels Nov 4, 2021
@Diggsey
Contributor

Diggsey commented Nov 6, 2021

I'm a bit wary of adding super platform-specific stuff to Cargo.toml (and this is coming from a windows user...)

I think it would be useful to look at prior art on other platforms, to see if there's some common denominator there.

It looks like natvis is also supported with GDB in vscode, although I don't know if there's any way to embed natvis files into the debug info in that case.

@gilescope

gilescope commented Nov 6, 2021 via email

@ridwanabdillahi
Contributor Author

ridwanabdillahi commented Nov 8, 2021

Idk, quite often it’s one way for windows and one way for unix systems. What matters is being able to have a better debug experience. If that means we have to do two things to cover all platforms that’s not too arduous.

I agree with this as well. Other platforms do not have a way to embed this kind of debug information automatically. VSCode uses MIEngine, which has only a subset of the Natvis framework ported to it. This applies natvis to types within the debugger, but the natvis is not embedded into the debug info. This means having to find the natvis for a specific version of a crate while debugging to get the correct behavior, which would put the onus on the Rust developer consuming said crate to find the natvis that matches the crate version they are using. I believe having this solution for Windows is a great first step, since these platforms are extremely different in how they support debugger visualizations.

@nrc
Member

nrc commented Nov 15, 2021

So, I think it is a great idea to support natvis in an ergonomic way. Even though natvis is specific to certain platforms/debuggers and so is intrinsically not cross-platform, I think it is important for this RFC to consider other platforms so that we are sure that it is possible for crates to support multiple formats for different platforms, ideally using a single mechanism or very similar mechanisms (not the same visualisation format, but the same mechanism for Cargo to discover the visualisations and notify them to rustc, maybe).

On a lower level, how much does rustc need to do? Is it literally just passing filenames to the linker? Or does it need to do some symbol (de-)mangling? Integration with debuginfo support? How exactly will natvis files be saved in the metadata?

Finally, I much prefer the auto-discovery alternative. It seems much more in line with Cargo's 'convention over configuration' philosophy - we do not list source files, for example.

@sivadeilra
Contributor

I think it is important for this RFC to consider other platforms so that we are sure that it is possible for crates to support multiple formats for different platforms

Hi, @nrc. I'm one of the co-authors of this RFC. Can you be more specific about this request? The other visualization formats supported by other debuggers are quite different in how they are implemented, packaged, and used. This RFC doesn't prevent implementing support for those debugger visualizations. In fact, it proposes a filesystem schema specifically for avoiding collisions, by using .natvis files whose names are based on the root module filename for a given crate. This works for the primary crate, unit test binary, integration test binaries, etc.

If there is something more to do here, we are certainly open to that. To me, it appears that the main requirement is not to interfere with implementing support for other debug formats, and I think this proposal meets that requirement.

Because different debugger visualizers are so different in how they work, I don't see much of an opportunity to "use a single mechanism or very similar mechanism". Again, I'm open to that, but I just don't see any reusability here.

Is it literally just passing filenames to the linker?

Yes, that's pretty much all there is to it. The only complexity here lies in the requirement to propagate these filenames across the dependency graph. That is, when you invoke the linker, you need to pass the linker options for all crates in the dependency graph, not just the current crate.

In my original prototype implementation, I did this purely in Cargo, because Cargo knows the entire dependency graph and so it could walk it and find all of the NatVis files. However, when I discussed this design with some folks on the Cargo channel on Zulip, they felt that this was not a good approach because it would not work correctly for build systems that drive rustc directly, rather than using Cargo. They suggested the approach of bundling the NatVis files into metadata fields, so that the final rustc invocation could just inspect metadata, pull out the NatVis files, and pass them to the linker. That's what we propose to do in this RFC.
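The propagation step described above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: `CrateInfo` and `natvis_linker_args` are hypothetical names, the graph is a plain map rather than rustc's real crate metadata, and the only external fact relied on is that the MSVC linker accepts `/NATVIS:<file>`.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for the per-crate metadata discussed above.
/// It does not model rustc's real `CrateMetadata` type.
pub struct CrateInfo {
    /// Names of the crates this crate depends on directly.
    pub deps: Vec<String>,
    /// NatVis files bundled with this crate.
    pub natvis_files: Vec<String>,
}

/// Walks the dependency graph from `root` and returns one `/NATVIS:` linker
/// argument per NatVis file found, sorted and de-duplicated.
pub fn natvis_linker_args(graph: &HashMap<String, CrateInfo>, root: &str) -> Vec<String> {
    let mut visited = BTreeSet::new();
    let mut files = BTreeSet::new();
    let mut stack = vec![root.to_string()];

    while let Some(name) = stack.pop() {
        // Skip crates that were already processed (diamond dependencies).
        if !visited.insert(name.clone()) {
            continue;
        }
        if let Some(info) = graph.get(&name) {
            files.extend(info.natvis_files.iter().cloned());
            stack.extend(info.deps.iter().cloned());
        }
    }

    files
        .into_iter()
        .map(|file| format!("/NATVIS:{}", file))
        .collect()
}
```

Whether this walk happens in Cargo or in the final rustc invocation is exactly the design question raised in the thread; the sketch only shows that the linker needs files from every crate in the graph, not just the one being compiled.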

We also favor auto-discovery. However, like many options in Cargo, I think the developer should still be able to override auto-discovery.

@nrc
Member

nrc commented Nov 16, 2021

Hi, @nrc. I'm one of the co-authors of this RFC. Can you be more specific about this request? The other visualization formats supported by other debuggers are quite different in how they are implemented, packaged, and used. This RFC doesn't prevent implementing support for those debugger visualizations. In fact, it proposes a filesystem schema specifically for avoiding collisions, by using .natvis files whose names are based on the root module filename for a given crate. This works for the primary crate, unit test binary, integration test binaries, etc.

Hi, so I'm personally not familiar with how any debuggers support visualisations, so it would be useful for me to have a short survey of how that works so that I can evaluate the similarities/differences myself. I assume that is also the case for others who want to evaluate this RFC too. If they are so different, then that would be enough.

Since I don't know the other mechanisms, I'm making a complete strawman here, but suppose that GDB takes a python file as a visualisation input, then that suggests that the Cargo manifest syntax should be something like

[debug-visualisations]
natvis = [...]
gdb-py = [...]

(again, totally making up the keywords as I go along). In that case, I would suggest this RFC propose this syntax (without the gdb case, leaving that for future work), rather than having natvis as a top-level key in the TOML file.

If there is something more to do here, we are certainly open to that. To me, it appears that the main requirement is not to interfere with implementing support for other debug formats, and I think this proposal meets that requirement.

I would say that not only must we not interfere, we should leave scope for supporting other formats in a clean way.

Because different debugger visualizers are so different in how they work, I don't see much of an opportunity to "use a single mechanism or very similar mechanism". Again, I'm open to that, but I just don't see any reusability here.

I'm not sure if it makes sense, but I hope the hypothetical example above makes clear the extent of reusability I'm imagining, it's not much of an iteration over the current state of the RFC.

Yes, that's pretty much all there is to it. The only complexity here lies in the requirement to propagate these filenames across the dependency graph. That is, when you invoke the linker, you need to pass the linker options for all crates in the dependency graph, not just the current crate.

That sounds fine. I think it would be nice to specify this in the RFC in a bit of detail so that we can evaluate the scope of changes being proposed.

We also favor auto-discovery. However, like many options in Cargo, I think the developer should still be able to override auto-discovery.

Do you have an example of something you are following with this design? My intuition is that natvis files should be treated like source files, for which there is no mechanism to list them (not sure about overriding auto-discovery for source files, there is a mechanism in Rust source, but I'm not aware of a mechanism in Cargo).

Are there typically many natvis files in a project, or usually just one (or one per target)? My preference would be not to keep metadata in the src directory and thus have a separate directory for such files, but I guess I would feel less strongly if it were just one file. I'm not sure of any precedent for keeping metadata in src vs in a separate directory, but it would be good to know about any if there is any.

@ridwanabdillahi
Contributor Author

In that case, I would suggest this RFC propose this syntax (without the gdb case, leaving that for future work), rather than having natvis as a top-level key in the TOML file.

This RFC initially suggested having the natvis key under the [package] section of the TOML file, which was one of the unresolved questions I had. Creating a new section for debugger visualizations sounds fair to me and would probably be more inclusive of future work to support other formats.

That sounds fine. I think it would be nice to specify this in the RFC in a bit of detail so that we can evaluate the scope of changes being proposed

I've explained at a fairly high level in the reference level explanation the scope of changes this RFC introduces in terms of new toml syntax and rustc flags as well as needing to make changes to the crate metadata. I will go into more detail in this section to show what specific changes need to be made and what fields need to be added to a CrateMetadata object.

Are there typically many natvis files in a project, or usually just one (or one per target)?

There is typically one Natvis file per library, i.e. one file for a static lib and a separate one for a shared DLL. This helps to break down where Natvis visualizations are defined for types. In the case of a Rust crate, you could imagine each crate having a single Natvis file that defines the debugger visualizations for types within that crate.

I'm not sure of any precedent for keeping metadata in src vs in a separate directory, but it would be good to know about any if there is any.

There is no reason why this would need to be kept in the src directory. Natvis files can be stored anywhere, but they must be published as part of the crate. This way other crates that consume said crate would be able to pass the .natvis files to the linker.

@sivadeilra
Contributor

@nrc I like your suggested TOML structure, of:

[debug-visualisations]
natvis = [...]
gdb-py = [...]

About the question of where to put the files, and how many files there are. For typical C++ libraries, there is often just a single NatVis file, but it certainly does happen that a library is large enough to warrant breaking NatVis into more than one file, for the benefit of developers organizing things. So my expectation is that your typical crate will have just one, or a small number, of NatVis files.

When I wrote the first draft of this spec, I thought just having a ${crate_root}/dbg/natvis/*.natvis rule would be sufficient. However, if we want to allow a crate to have different sets of NatVis files for each unit that it produces (primary library, binaries, integration test binaries, etc.), then I think placing the NatVis files into the src tree makes sense. It also makes sense just for keeping the NatVis visualizations close to the module trees that they describe.

Or, we could simply say that the scenario of having different NatVis for different units is overkill, and that the same set of NatVis files should be used for all units in a given crate. That way, we just put them into ${crate_root}/dbg/natvis/*.natvis and be done with it. I like that approach, because it is simple and easy to remember and easy to implement, and because I can't actually think of a scenario where I would want different NatVis in different units.
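A minimal sketch of that single-directory rule, assuming auto-discovery simply lists `*.natvis` files directly under `<crate_root>/dbg/natvis`. The function name `discover_natvis` is hypothetical, and nothing here reflects how Cargo actually implements discovery.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns every `*.natvis` file directly inside `<crate_root>/dbg/natvis`,
/// sorted by path. A missing directory yields an empty list.
pub fn discover_natvis(crate_root: &Path) -> io::Result<Vec<PathBuf>> {
    let dir = crate_root.join("dbg").join("natvis");
    let entries = match fs::read_dir(&dir) {
        Ok(entries) => entries,
        // No directory simply means the crate ships no visualizers.
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(Vec::new()),
        Err(e) => return Err(e),
    };

    let mut found = Vec::new();
    for entry in entries {
        let path = entry?.path();
        let is_natvis = path.extension().map_or(false, |ext| ext == "natvis");
        if is_natvis && path.is_file() {
            found.push(path);
        }
    }
    // Sort so the result does not depend on directory iteration order.
    found.sort();
    Ok(found)
}
```

Because the rule is one fixed directory per crate, the same list would apply to every unit the crate produces, which is the simplification argued for above.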

Hi, so I'm personally not familiar with how any debuggers support visualisations, so it would be useful for me to have a short survey of how that works so that I can evaluate the similarities/differences myself. I assume that is also the case for others who want to evaluate this RFC too. If they are so different, then that would be enough.

I think that's a reasonable request, mainly because it will illustrate just how different they are. For GDB, it's a set of Python scripts (I think? might be wrong). For others (WinDbg), it's a native, compiled DLL that gets loaded into a debugger. WinDbg also supports JavaScript-based extensions. Some of these can be linked into target binaries, or into their debug symbols (PDBs), but others have no convenient way to package the debugger extensions in a way that is discoverable by the debugger -- you just have to know how to find it.

@nrc
Member

nrc commented Nov 18, 2021

However, if we want to allow a crate to have different sets of NatVis files for each unit that it produces (primary library, binaries, integration test binaries, etc.), ... and because I can't actually think of a scenario where I would want different NatVis in different units.

It sounds like it is worth digging in to this requirement. Is there an equivalent thing for C++/VS projects?

I know some projects have extensive support for testing infrastructure, and that might warrant separate debug metadata. However, one could also use a separate crate in the workspace for this. I can't imagine wanting separate visualisations for most tests?

I can imagine a crate has both a library and binary targets and the latter has some data types (with visualisations) which are not in the library. Could this be handled with a single natvis file? Relatedly, if a crate has datatypes which are present or not depending on cfgs (e.g., Tokio), how would this be handled with natvis and this proposal?

@sivadeilra
Contributor

Is there an equivalent thing for C++/VS projects?

No, because in VS a single project produces a single output (EXE, DLL, LIB). There's no concept of "units".

On balance, I think we should start with the simpler model: For a given crate, the user defines a set of NatVis files. Those NatVis files are compiled into the primary crate unit (exe or rlib), and into the unit test crate unit. For benchmarks, integration tests, and bins, we don't include the NatVis files, because those compilations already reference the primary crate, which already contains the NatVis files.

I think this is best because:

  • It's simple for the user to reason about.
  • It still allows sophisticated developers to include definitions that are specific to their testing code. Yes, those definitions will be placed in the PDB file for the primary crate, which does mean that there's some test-only information in a non-test PDB. But I think that's OK -- it doesn't affect the primary crate itself, only its debugging info. It will be harmless, and will be minuscule in size, compared to the PDB itself.
  • It lets us use the simpler filesystem layout, which will be easy for developers to learn and use.

Relatedly, if a crate has datatypes which are present or not depending on cfgs (e.g., Tokio), how would this be handled with natvis and this proposal?

I think, for this V1, the NatVis file would contain definitions for the union of all features. As long as features are additive (and they are supposed to be), that should work fine. It means that some definitions may be ignored or useless when their corresponding features are not activated. I think that's OK.

When I originally drafted this RFC, I included two means to package NatVis into crates. The first is the simple "whole file" method, which is what is described here. The second was to allow users to include NatVis fragments directly into Rust source code, such as:

natvis!(r#"
    <Type Name="my_crate::MyFancyType">
      <DisplayString> ... </DisplayString>
      <Expand> ... </Expand>
      ...
    </Type>
"#);

Or, to provide it as an attribute on a type, so that the compiler could generate exactly the right type string, for matching descriptions to mangled types. This would free the developer from needing to write the <Type Name="..."> element. (Also, NatVis supports generic types, to an extent. Writing the <Type> parameters for generics is a little more tedious, so handling that for the developer would be a win.) Example:

#[natvis(xml = r#"
  <DisplayString> ... </DisplayString>
  <Expand> ... </Expand>
"#)]
pub struct MyFancyType { ... }

Rustc would then find all of these and assemble them into a single XML document, and bundle that into the PDB.
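The assembly step can be pictured with a small sketch. It assumes each collected fragment is a type name plus the developer-written body, and that type names are already XML-safe (generic names containing `<` would need escaping). `Fragment` and `assemble_natvis` are hypothetical names, not rustc internals.

```rust
/// One NatVis fragment collected from source: the type name the compiler
/// would supply, plus the body the developer wrote.
pub struct Fragment {
    pub type_name: String,
    pub body: String,
}

/// Wraps each fragment in a `<Type>` element and all of them in a single
/// `<AutoVisualizer>` document. Type names are assumed to be XML-safe.
pub fn assemble_natvis(fragments: &[Fragment]) -> String {
    let mut doc = String::new();
    doc.push_str("<?xml version=\"1.0\" encoding=\"utf-8\"?>\n");
    doc.push_str(
        "<AutoVisualizer xmlns=\"http://schemas.microsoft.com/vstudio/debugger/natvis/2010\">\n",
    );
    for frag in fragments {
        doc.push_str(&format!("  <Type Name=\"{}\">\n", frag.type_name));
        // Re-indent the body so it nests under its <Type> element.
        for line in frag.body.trim().lines() {
            doc.push_str(&format!("    {}\n", line.trim()));
        }
        doc.push_str("  </Type>\n");
    }
    doc.push_str("</AutoVisualizer>\n");
    doc
}
```

With the attribute form, the compiler would fill in `type_name` itself, which is the part the comment above identifies as tedious to write by hand.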

If we allow users to embed NatVis fragments directly into source code, then we can allow #[cfg] to control them, or to construct them using proc macros or macro_rules. This has lots of advantages, but we wanted to start with the simplest V1 that we could, because we felt that it would take a much longer time to come to consensus on what the V2 would look like.

So our plan is, stabilize the V1 support, and then to submit PRs to various crates, so that we can improve the debug experience of those crates. Then, when the value of this becomes apparent, we (with the community) can concurrently be working on spec'ing V2, with direct source embedding.

@joshtriplett
Member

I posted one comment, but otherwise this looks good to me.

I think it's completely reasonable to do this in two phases, with the first requiring separate files and the second allowing inline attributes that get extracted.

@ridwanabdillahi
Contributor Author

Hi @joshtriplett, thanks for taking the time to review this RFC. The goal of this RFC is to see stabilization of this feature at some point in time. As such, I was wondering if the flags I have suggested in the RFC are appropriate. There are a couple of ways this could be achieved.

  1. As described in the RFC, keep the -Z natvis={comma-separated list} flag, which would eventually be turned into a -C flag for stabilization.
  2. Create a -C natvis={comma-separated list} flag with a -Z enable-natvis flag as well which would guard against using the -C natvis flag in stable Rust.

Thoughts?

@joshtriplett
Member

@ridwanabdillahi I personally would propose starting with -C natvis, and gating it with the existing -Z unstable-options; that would make a future stabilization easier. But there may be a reason I'm not aware of for why that isn't a good idea.
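The gating idea can be shown with a toy argument check. This is not how rustc parses its command line; `parse_natvis_flag` is a hypothetical helper, and `-C natvis` is the flag proposed in this thread rather than an existing stable option. The sketch only shows the rule "accept the new `-C` option only when `-Z unstable-options` is also present".

```rust
/// Toy model of the proposed gate: `-C natvis=<a,b,...>` is accepted only
/// when `-Z unstable-options` is also on the command line.
/// Returns the listed files, or an error message when the gate is missing.
pub fn parse_natvis_flag(args: &[&str]) -> Result<Vec<String>, String> {
    // Accept both the split form (`-Z unstable-options`) and the joined form.
    let unstable = args
        .windows(2)
        .any(|w| w[0] == "-Z" && w[1] == "unstable-options")
        || args.iter().any(|a| *a == "-Zunstable-options");

    let mut files = Vec::new();
    for w in args.windows(2) {
        if w[0] == "-C" {
            if let Some(list) = w[1].strip_prefix("natvis=") {
                files.extend(list.split(',').filter(|s| !s.is_empty()).map(String::from));
            }
        }
    }

    if !files.is_empty() && !unstable {
        return Err("`-C natvis` requires `-Z unstable-options`".to_string());
    }
    Ok(files)
}
```

Under this scheme, stabilizing the feature later would amount to dropping the gate while the spelling of the flag stays the same, which is the advantage mentioned above.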

@ridwanabdillahi
Contributor Author

I personally would propose starting with -C natvis, and gating it with the existing -Z unstable-options;

@joshtriplett that sounds good to me. I'll update the RFC to reflect that.

@ridwanabdillahi
Contributor Author

I believe I've responded to all of the comments, is there anything else that needs to be addressed in this RFC?

Member

@nrc nrc left a comment

Looks good! Thanks for addressing all my earlier comments. I have a few more inline, but nothing huge.

The one 'big' thing is that I still think the "Auto-discover Natvis XML files" option is more idiomatic for Cargo than explicitly listing all the natvis files, but we can leave that up to Cargo team review.

@nrc
Member

nrc commented Dec 9, 2021

In response to #3191 (comment) (and just general pondering). I wonder if it is worth thinking a bit about integration with Cargo features at least (and perhaps other sources of cfg input such as the target platform). I don't want to get too much into discussion of what v2 would look like, but I have a strong preference for keeping the natvis info in separate files rather than inlining it. Given this possibility, I want to make sure there is scope in the proposal to extend it to include different natvis depending on features, etc. We don't have to go into too much detail, obviously.

Could we imagine extending the Cargo manifest entry to include features, etc. in the same way as dependencies? I think it should be easy enough somehow, and perhaps this is a good justification for keeping the explicit filename list rather than auto-discovery?

@nrc
Member

nrc commented Dec 9, 2021

One more thought, on toolchains which don't support adding natvis into the PDB during linking, should Cargo or rustc emit all the specified natvis files into a location in the target directory so that they can be easily accessed by the user?

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request May 28, 2022
…ichaelwoerister

Add support for embedding pretty printers via `#[debugger_visualizer]` attribute

Initial support for [RFC 3191](rust-lang/rfcs#3191) in PR rust-lang#91779 was scoped to supporting embedding NatVis files using a new attribute. This PR implements the pretty printer support as stated in the RFC mentioned above.

This change includes embedding pretty printers in the `.debug_gdb_scripts` just as the pretty printers for rustc are embedded today. Also added additional tests for embedded pretty printers. Additionally cleaned up error checking so all error checking is done up front regardless of the current target.

RFC: rust-lang/rfcs#3191
@BurntSushi
Member

I've just become aware of this RFC and skimmed at least all of the comments here. One thing that I don't see discussed in the thread (although it is mentioned in the RFC text itself) is why, specifically, we can't use Debug impls for this. The RFC says this:

Alternative 4: miri executes the MIR of a Debug impl within a debugger

Supporting this option would mean that changes to cargo and rustc are not necessary. This would have the added benefit of taking full advantage of existing implementations of the Debug trait. Many Rust developers already implement the Debug trait which is used to format how types should be viewed, this would only ease the debugging quality of Rust when viewed under any debugger. This option also has the added benefit of not requiring any changes to a crate from a Rust developer by leveraging existing Debug impls.

The drawbacks for this option is that this has not been fully investigated to determine its viability. This could be a great potential feature to ease debugging Rust but without concrete data to push this towards a potential RFC, I would assume supporting debugging in the systems that are already heavily used by the Rust community to be a higher priority. If/when this option becomes a bit more viable, there would be nothing stopping it from becoming a true feature.

The idea of just using Debug impls seems like a massive win, right? So I must be missing something. Instead of having to explicitly maintain separate debug files, you instead get nice pretty printed displays automatically for virtually every type. The motivating example in the RFC is particularly relevant here. Consider this program:

use regex::Regex;

fn find_c_defines(input: &str) {
    let rx = Regex::new(r#"^#define\s+(\w+)\s+([0-9]+)\s*(//(.*))?"#).unwrap();
    for captures in rx.captures_iter(input) {
        println!("{:?}", captures);
    }
}

fn main() {
    find_c_defines("#define SOME_CONSTANT 42");
}

It outputs:

Captures({0: Some("#define SOME_CONSTANT 42"), 1: Some("SOME_CONSTANT"), 2: Some("42"), 3: None, 4: None})

And this works because the derived Debug impl for Captures is really not helpful at all, as explained in the RFC. So it has its own custom Debug impl to make the output much more useful. Having to redo these sorts of hand-tuned Debug impls seems pretty unfortunate and a hard sell for me personally.
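To make the contrast concrete, here is a small sketch of a derived versus a hand-tuned Debug impl. `RawCaps` and `Caps` are hypothetical stand-ins that store byte offsets; they are not the `regex` crate's actual types or implementation.

```rust
use std::collections::BTreeMap;
use std::fmt;

/// Hypothetical captures-like type: the haystack plus byte offsets for each
/// group (`None` when a group did not participate in the match).
#[derive(Debug)]
pub struct RawCaps<'t> {
    pub text: &'t str,
    pub spans: Vec<Option<(usize, usize)>>,
}

/// Wrapper with a hand-written Debug impl that resolves offsets to text.
pub struct Caps<'t>(pub RawCaps<'t>);

impl<'t> fmt::Debug for Caps<'t> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Show group index -> matched substring instead of raw offsets.
        let groups: BTreeMap<usize, Option<&str>> = self
            .0
            .spans
            .iter()
            .enumerate()
            .map(|(i, span)| (i, span.map(|(start, end)| &self.0.text[start..end])))
            .collect();
        f.debug_tuple("Caps").field(&groups).finish()
    }
}
```

The derived impl prints the offsets as stored, while the custom one prints what a reader actually wants to see. A debugger visualizer has to reproduce that second view without running this code, which is the duplication being questioned here.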

Looking at the natvis example in the RFC:

<?xml version="1.0" encoding="utf-8"?>
<AutoVisualizer xmlns="http://schemas.microsoft.com/vstudio/debugger/natvis/2010">
    <Type Name="foo::FancyRect">
      <DisplayString>({x},{y}) + ({dx}, {dy})</DisplayString>
      <Expand>
        <Synthetic Name="LowerLeft">
          <DisplayString>({x}, {y})</DisplayString>
        </Synthetic>
        <Synthetic Name="UpperLeft">
          <DisplayString>({x}, {y + dy})</DisplayString>
        </Synthetic>
        <Synthetic Name="UpperRight">
          <DisplayString>({x + dx}, {y + dy})</DisplayString>
        </Synthetic>
        <Synthetic Name="LowerRight">
          <DisplayString>({x + dx}, {y})</DisplayString>
        </Synthetic>
      </Expand>
    </Type>
</AutoVisualizer>

This looks like some kind of interpreter is needed for this anyway? How does this work? And why can't Debug be called instead? Moreover, can someone show me the natvis file that will make the debugger visualization nicer for the Captures type?
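For comparison, the same information written as Rust might look like the sketch below. The shape of `FancyRect` is an assumption: only the field names `x`, `y`, `dx`, `dy` appear in the NatVis expressions, and the `i32` field type and the `corners` helper are made up for illustration.

```rust
use std::fmt;

/// Assumed shape of `foo::FancyRect`; only the field names come from the
/// NatVis expressions, the integer type is a guess.
pub struct FancyRect {
    pub x: i32,
    pub y: i32,
    pub dx: i32,
    pub dy: i32,
}

impl FancyRect {
    /// Corner points in the order the NatVis lists them:
    /// LowerLeft, UpperLeft, UpperRight, LowerRight.
    pub fn corners(&self) -> [(i32, i32); 4] {
        [
            (self.x, self.y),
            (self.x, self.y + self.dy),
            (self.x + self.dx, self.y + self.dy),
            (self.x + self.dx, self.y),
        ]
    }
}

impl fmt::Debug for FancyRect {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Same text as the top-level <DisplayString> in the NatVis file.
        write!(f, "({},{}) + ({}, {})", self.x, self.y, self.dx, self.dy)
    }
}
```

The NatVis expressions such as `{x + dx}` are evaluated by the debugger against the fields recorded in debug info, so nothing like `corners` needs to exist in the compiled program.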

So I guess my question boils down to: is supporting Debug impls here "just" a matter of implementation work? Or is there a semantic mismatch behind what you want to see in a debugger versus what a Debug impl gives you?

@sivadeilra
Contributor

We've actually prototyped the MIR interpreter approach, all the way through calling Debug impls. It's compelling, it's a great experience. But it's several orders of magnitude more complex to implement and deploy. There are some nontrivial things to figure out, such as how you store the MIR after compilation, how you interpret it against the state of a running program or even against a crash dump, how you manage the problem of multiple compiler versions, etc.

We'd like to do that work, and do it the right way (open collaboration, RFC, etc.). But meanwhile, in parallel, we felt that NatVis could provide a lot of value, relatively quickly, and with very low complexity cost and low resource cost.

We can view them as complementary, overlapping approaches. We can get immediate benefits from NatVis, while working toward the interpreter approach.

The interpreter approach also enables some really wild capabilities, such as writing debug extensions directly in source code. Not just Debug impls, but arbitrary debugger commands / extensions, using a dyn trait obj that represents the hosting debugger environment.

@wesleywiser
Member

wesleywiser commented Jun 7, 2022

The RFC lays out a few different approaches to the "debugger visualizer" problem but I think they can be mostly summed up in three categories:

  1. Use the existing functionality of the platform's debugger

    • Pros:
      • Using the native debugger functionality will probably offer a better UX for the user because it's tied into the debugger itself. As an example, using natvis in WinDbg means nested items are represented in the GUI as expandable/collapsible elements.
      • One of the selling points of Rust IMO is that we try hard to integrate with the underlying system and not do something opaque on top of it. A lot of higher-level languages aren't really debuggable using WinDbg/gdb/lldb and you have to use whatever debugger ships with your language, which is often missing features in comparison to the platform's debugger.
      • We don't have to implement any kind of debugger plugin to make this work. To answer your last question here: the NatVis interpreter is embedded into Windows debuggers. The equivalent functionality is embedded into gdb/lldb.
    • Cons:
      • You have to implement debug formatting multiple times and keep them in sync. This is a significant downside but there are ways to mitigate many of the problems here. These aren't currently implemented but there are multiple, reasonable approaches we can take here which are backwards compatible with this RFC.
  2. Call the <MyStruct as Debug>::fmt method directly

    • Pros:
      • Debug formatting only has to be implemented once!
      • You can write normal Rust code instead of needing to write NatVis/Python expressions to format your types.
    • Cons:
      • As is currently implemented, the debug function may not actually exist in the binary. If it's never called, the compiler can elide generating code for it and in the case of optimized programs, LLVM or the linker may remove it if it's dead code. There are ways around this but in any case, we still have the problem of debug infrastructure being stored in the binary instead of in debuginfo which hurts compile times and binary sizes.
      • The method uses the unstable Rust calling convention, which the debugger doesn't necessarily understand, so it might not be able to invoke it correctly. This could potentially be resolved by teaching the debugger the calling convention somehow or by using a "trampoline" function with a well-specified ABI.
  3. MIR interpreter in the debugger

    • Pros:
      • It's very cool!
      • No impact to binary size since the MIR could be stored somewhere other than the binary.
      • Debug formatting only has to be implemented once!
      • You write normal Rust code.
      • Enables other, more advanced scenarios.
    • Cons:
      • Requires writing a debugger plugin for every platform's debugger we want to support. This is a non-trivial amount of work.
      • MIR is currently unstable (but there are efforts to create a stable-MIR) which means the debugger plugin must link to the same version of the compiler that built the code.

Having said all of that, this RFC being accepted does not mean we close the door to options 2 or 3. For Rust to integrate well with the underlying platform, we need to have a solid implementation of option 1, which this RFC provides. It would be great to go above and beyond that as well!
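The "trampoline" mentioned under option 2 could look roughly like the sketch below. It is an assumption-laden illustration: `MyStruct`, `SliceWriter`, and `my_struct_debug` are hypothetical, and a real trampoline would also need an unmangled symbol name and protection from dead-code elimination, both omitted here.

```rust
use std::fmt::{self, Write};

#[derive(Debug)]
pub struct MyStruct {
    pub id: u32,
    pub name: &'static str,
}

/// Writes into a fixed byte buffer, truncating instead of allocating.
struct SliceWriter<'a> {
    buf: &'a mut [u8],
    len: usize,
}

impl<'a> Write for SliceWriter<'a> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let room = self.buf.len() - self.len;
        let n = s.len().min(room);
        // Byte-wise copy; truncation may split a multi-byte character.
        self.buf[self.len..self.len + n].copy_from_slice(&s.as_bytes()[..n]);
        self.len += n;
        if n < s.len() { Err(fmt::Error) } else { Ok(()) }
    }
}

/// C-ABI entry point a debugger could call with a plain address and buffer.
/// Returns the number of bytes written.
///
/// # Safety
/// `value` must point to a valid `MyStruct` and `buf` to `cap` writable bytes.
// The pointer is opaque to the caller, so the pointee's layout is irrelevant.
#[allow(improper_ctypes_definitions)]
pub unsafe extern "C" fn my_struct_debug(
    value: *const MyStruct,
    buf: *mut u8,
    cap: usize,
) -> usize {
    let out = unsafe { std::slice::from_raw_parts_mut(buf, cap) };
    let mut writer = SliceWriter { buf: out, len: 0 };
    // Ignore the truncation error: a partial string is still useful.
    let _ = write!(writer, "{:?}", unsafe { &*value });
    writer.len
}
```

This addresses only the calling-convention concern. The function still lives in the binary rather than in debuginfo, so the size and compile-time costs listed under option 2 remain.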

@BurntSushi
Member

I see, okay. I think that makes sense. For context, I didn't even know (2) and (3) were distinct choices. So there is a lot that I don't know here.

For (1), it sounds like what you're saying is that the natvis file works because the debugger has its own interpreter. To what extent does that work exactly? What would a natvis file for a regex::Captures look like? Namely, there are no public Captures fields and its representation is not particularly straight-forward. So you really want to be able to call methods on the Captures type to get a nice debug display. Is that possible?

I think I'm just having a hard time seeing how option (1) materially improves things if folks don't put in the work to maintain these debugging files alongside their crates. Have library authors been consulted about the overall user experience here? (I'm still pretty unclear on it myself. It's hard to even know what's possible exactly.)

To be clear, I see what y'all are saying. And the progression here makes sense. And I also agree that having a way to expose the underlying debugger's functionality is also a good idea. And I love the idea of making the debugging UX better. Maybe I'm just jumping in too early here before the docs for how to effectively use this feature are written. :)

@wesleywiser
Member

wesleywiser commented Jun 7, 2022

Those are all fair questions! 🙂

To what extent does that work exactly?

I'm not quite sure what you're asking here but from what I've seen, most debuggers offer some kind of "object model" based on the debug info present for a specific type. Since this is based on debuginfo, field privacy doesn't matter since the evaluator/interpreter goes through the debugger itself rather than the Rust compiler/type system.

What would a natvis file for a regex::Captures look like?

I thought I'd seen one somewhere in the RFC but I can't find it now. I don't know if @ridwanabdillahi has a suggested implementation of this available or not.

So you really want to be able to call methods on the Captures type to get a nice debug display. Is that possible?

In theory yes, but it has the problems I mentioned above. In optimized programs, the symbol might not exist because it was optimized away (even if used, it might be inlined into all the callsites and then the specific symbol is removed). There's also the problem of ABI but that is solvable.

I think I'm just having a hard time seeing how option (1) materially improves things if folks don't put in the work to maintain these debugging files alongside their crates. Have library authors been consulted about the overall user experience here?

I think there are reasonable paths to improve things in this regard. Using the features described in this RFC, we could pretty easily imagine some combination of:

  • a library crate is written which allows you to write some kind of basic debugging expressions which are then turned into NatVis/gdb pretty printers for you. Potentially this also generates your Debug impl so they all stay in sync with each other.
  • a testing framework is built that makes it easy to run visualizer tests in various debuggers. There is a primitive version of this used in rustc for our debuginfo tests but @michaelwoerister has been making progress on an improved version for use in the async crash-dump debugging effort.
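
To make the first idea a little more concrete, here is a minimal sketch of a helper that turns a small Rust description of a type into a NatVis file. The names TypeSummary and render_natvis are hypothetical, no such crate is referenced in this thread, and the sketch assumes its inputs need no XML escaping. It only emits the basic AutoVisualizer, Type, DisplayString, Expand and Item elements.

```rust
/// Hypothetical description of how one type should be shown in the debugger.
pub struct TypeSummary<'a> {
    /// Fully qualified type name as it appears in debuginfo.
    pub type_name: &'a str,
    /// NatVis display string, e.g. "({x}, {y})".
    pub display: &'a str,
    /// (label, expression) pairs listed when the value is expanded.
    pub items: &'a [(&'a str, &'a str)],
}

/// Renders the summaries as a NatVis document.
/// Assumption: the inputs contain no characters that need XML escaping.
pub fn render_natvis(types: &[TypeSummary<'_>]) -> String {
    let mut out = String::new();
    out.push_str("<?xml version=\"1.0\" encoding=\"utf-8\"?>\n");
    out.push_str(
        "<AutoVisualizer xmlns=\"http://schemas.microsoft.com/vstudio/debugger/natvis/2010\">\n",
    );
    for t in types {
        out.push_str(&format!("  <Type Name=\"{}\">\n", t.type_name));
        out.push_str(&format!(
            "    <DisplayString>{}</DisplayString>\n",
            t.display
        ));
        if !t.items.is_empty() {
            out.push_str("    <Expand>\n");
            for (label, expr) in t.items {
                out.push_str(&format!(
                    "      <Item Name=\"{}\">{}</Item>\n",
                    label, expr
                ));
            }
            out.push_str("    </Expand>\n");
        }
        out.push_str("  </Type>\n");
    }
    out.push_str("</AutoVisualizer>\n");
    out
}
```

The generated text could be written to a file by a build script or a macro and then embedded with the attribute this RFC proposes. Generating a matching Debug impl from the same description would be the next step toward keeping the two in sync.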

Maybe I'm just jumping in too early here before the docs for how to effectively use this feature are written. :)

It's good feedback and we should make sure we incorporate this into the documentation for the feature!

@nalply

nalply commented Nov 12, 2022

@wesleywiser wrote in #3191 (comment):

The RFC lays out a few different approaches to the "debugger visualizer" problem but I think they can be mostly summed up in three categories: [...]

Perhaps there is a fourth way? I was experimenting with code similar to this:

fn show_debug(dyn_debug: &dyn std::fmt::Debug) {
    println!("{dyn_debug:?}\n");
}

then doing p show_debug(your_value_you_want_to_look_at) in gdb. However this doesn't work with gdb, because gdb seems not to be able to cast to trait objects. While Rust code can do this:

let trait_object: &dyn std::fmt::Debug = &"example value";

you can't do this in gdb: set trait_object = &"example value" fails with Invalid cast..

I tried to work around this limitation of gdb by writing a helper function to do the cast for me, but I was not successful. I can't use a generic function because it might not be monomorphised for the type of the value I want to look at. I experimented with std::ptr::metadata() (RFC 2580), but I couldn't find a way to bypass the need to cast to a trait object from inside gdb. RFC 2580 doesn't offer a way to create a vtable directly from the trait. It was like kicking the can down the road: somewhere the value must be converted to a trait object, and the result is Invalid cast. or even a segmentation violation.

I think gdb should be able to cast to a trait object. Then we might have a simple solution for debugger visualisations with Debug!

I propose to go upstream and ask for the feature of casting to trait objects.

I didn't look at Windows debuggers, but I can imagine that they already have this covered.
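
One way to picture the limitation described above: in Rust, the coercion from &T to &dyn Debug is performed by the compiler, which attaches the vtable pointer at the call site. The minimal sketch below, using a hypothetical Point type and hypothetical helper names (format_debug, show_point), moves that coercion into a non-generic, per-type wrapper so a caller only has to pass a plain reference. Whether gdb can actually call such a wrapper has not been verified here, and the approach still needs one wrapper per type.

```rust
use std::fmt::Debug;

// Hypothetical example type.
#[allow(dead_code)] // the fields are only read through the derived Debug impl
#[derive(Debug)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

/// Non-generic formatter: exactly one copy of this function exists in the
/// binary, but every caller has to supply a trait object (data pointer plus
/// vtable pointer).
pub fn format_debug(value: &dyn Debug) -> String {
    format!("{value:?}")
}

/// Per-type wrapper: the `&Point` to `&dyn Debug` coercion happens here, at
/// compile time, so the caller only passes a thin pointer.
pub fn show_point(p: &Point) -> String {
    format_debug(p)
}
```

From Rust code, show_point(&Point { x: 1, y: 2 }) yields the same text as the {:?} formatter. The open question from the comment above remains how to avoid writing such a wrapper for every type.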

@DrRuhe

DrRuhe commented Jan 20, 2023

An impl Debug serves different semantics than special visualizations like NatVis files. Using <MyStruct as Debug>::fmt is therefore ideal for the main visualisation of the types. NatVis files are also concerned with the visualization of the fields.

What follows are my ideas/dreams for the far future of Rust debugging. Notably, they have yet to be implemented so it's unclear to me what exactly is required from a technical perspective. They also build upon scenarios 2/3 from #3191 (comment)

Using <MyStruct as Debug>::fmt is great to summarize the information on a given type but gives no instructions as to what fields the debugger should display.

I'd propose adding a new trait that can specify exactly this behavior. This would allow users to write Rust code once that could then be provided to debuggers. When the trait is not implemented, the debugger simply uses the default behavior, similar to how Debug impls should be handled then.

This trait would work similarly to Debug. There would be one method to implement, and a mutable formatter &mut std::fmt::DebuggerVisualizer is passed to the method. The formatter would expose a builder pattern to define the visualization shown in the debugger. So methods like these would probably fit nicely there:

  • DebuggerVisualizer::nativeFields(&mut self) -> &mut Self - adds an entry [native Fields] under which the underlying fields of the type are shown.
  • DebuggerVisualizer::section(&mut self, name: String, description: String, f: F) -> &mut Self - adds a section with the given name and description and populates it with the result of the closure, which receives a &mut DebuggerVisualizer.
  • DebuggerVisualizer::field<T>(&mut self, name: String, t: T) -> &mut Self - adds a field with the value that should populate it. Notably this type has no Debug or DebuggerVisualizer bounds since the debugger can just display it normally (but use the impls when present).

Some more thoughts about this:

  • possibly add some way to let users know which fields are actually present? Is a [native Fields] section enough?
  • rustc should probably only generate one such function that can handle generics by using trampolines internally. We don't want a separate implementation for every generic instantiation to bloat the binary size.
  • These implementations must be compiled in to be used. Should they be compiled into a separate binary so as not to inflate the debug build binary?
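
A minimal sketch of how such a trait and builder might look follows. None of this exists in std: the trait name Visualize, the Entry enum, the snake_case method names, and the Connection example type are all hypothetical. As a stand-in for "the debugger displays the value", the sketch records values via Debug, which the proposal above explicitly would not require.

```rust
use std::fmt::Debug;

/// Hypothetical record of what the debugger should display.
#[derive(Debug, PartialEq)]
pub enum Entry {
    NativeFields,
    Field { name: String, value: String },
    Section { name: String, description: String, entries: Vec<Entry> },
}

/// Hypothetical builder, loosely following the methods proposed above.
#[derive(Default)]
pub struct DebuggerVisualizer {
    pub entries: Vec<Entry>,
}

impl DebuggerVisualizer {
    /// Adds the "[native Fields]" entry.
    pub fn native_fields(&mut self) -> &mut Self {
        self.entries.push(Entry::NativeFields);
        self
    }

    /// Adds a named field. Assumption: the value is captured via Debug here,
    /// whereas a real debugger would read it from debuginfo.
    pub fn field<T: Debug>(&mut self, name: &str, value: &T) -> &mut Self {
        self.entries.push(Entry::Field {
            name: name.to_string(),
            value: format!("{value:?}"),
        });
        self
    }

    /// Adds a section populated by the closure.
    pub fn section<F>(&mut self, name: &str, description: &str, f: F) -> &mut Self
    where
        F: FnOnce(&mut DebuggerVisualizer),
    {
        let mut inner = DebuggerVisualizer::default();
        f(&mut inner);
        self.entries.push(Entry::Section {
            name: name.to_string(),
            description: description.to_string(),
            entries: inner.entries,
        });
        self
    }
}

/// Hypothetical trait, analogous to Debug.
pub trait Visualize {
    fn visualize(&self, v: &mut DebuggerVisualizer);
}

// Hypothetical example type.
pub struct Connection {
    pub host: String,
    pub port: u16,
    pub retries: u32,
}

impl Visualize for Connection {
    fn visualize(&self, v: &mut DebuggerVisualizer) {
        v.section("endpoint", "where this connection points", |s| {
            s.field("host", &self.host).field("port", &self.port);
        })
        .field("retries", &self.retries)
        .native_fields();
    }
}
```

Calling visualize on a Connection fills the builder with one section, one top-level field, and the native-fields entry, in that order. How a debugger would invoke this code is exactly the open problem of scenarios 2 and 3.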

@aganea

aganea commented Feb 2, 2024

Hello! FWIW, if that could simplify things (for *-pc-windows-msvc targets at least), we could easily add parsing of /NATVIS flags to LLD, reading them from the .drectve section embedded in the .objs/.rlib for that crate. See here: https://github.com/llvm/llvm-project/blob/main/lld/COFF/Driver.cpp#L414. That acts essentially as if the flags were passed on the lld-link ... command line, but instead they are supplied internally by the .obj file.

Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. T-compiler Relevant to the compiler team, which will review and decide on the RFC. T-lang Relevant to the language team, which will review and decide on the RFC. to-announce