-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Discussion: Change default DebugType to embedded for .NET Core 3 #2679
Comments
I can confirm the slow symbol server load is super painful, I have a related dev community issue open on it. If someone is on latest previews which means fewer cache hits, it's even worse. I'm not saying optimize for this, only that the case exists and we hit it often. |
IMHO, we should not change the default. It's basically breaking to add 20-30% size. This is not just on disk, but can also impact (virtual) memory usage. The "very least" option is something we can consider. But I'm not sure about how it would pull in source link. How would we know which provider to use? And even if we did, how would we determine the package version? |
For sourcelink, you can add all of them and it'll use the right one. In fact it's the recommended approach for dealing with multiple hosts. Sourcelink could be bundled in the SDK like NuGet, and the others. I think the idea of an opt-in |
I want to add another angle here: I maintain an exceptions logging library, we deal with exceptions, and we get issues reported from exceptions. Stack traces are golden. They're incredibly useful. I strongly believe default experience for any app should have stack traces include the file paths and line numbers. For this to happen the deployment platform must be able to read the symbols, and they have to be there. With SourceLink, it even means the exceptions have the git hash in the URL path, pointing directly to the exact code throwing. That's awesome. But only if we get it. Without users having symbols in the first place it's all wasted effort. That rich experience should be the default. If size is a concern, I propose we focus on stripping what users don't want in a publish phase rather than having options to the detriment of many up front. If someone is concerned about mobile or container sizes, what are the chance we could allow a slimming phase to happen as Xamarin can today? I don't see how we can make deployment choices for everyone up at the package level. That's a deployment concern and a deployment preference. Why is someone's small footprint deployment priority governing by debugging and error logging experience? The only way to work around this as a package author (as far as I'm aware) is to force the PDB files in the package or to switch to embedded symbols. But both only fulfill me using my own libraries and take the control out of everyone who's trying to do that small deployment and is okay with sacrificing that functionality. For this reason, I really suggest focus be on splitting deployment concerns into another phase than "how packages are hosted". There's a related issue up there where embedded seems like the best long-term approach, but last we tested .NET Full Framework wouldn't read the symbols and render the data in stack traces. That's something that needs fixing if not already done in the .NET 4.x line. |
How much overhead does that add to build for the ones I'm not using.
Where is this recommended? I can see discussion of supporting mixed submodules, but nothing that says just ref them all. |
I would vote for external PDB-Files by default with sourcelink as an optional feature. NuPKGs should contain PDB-Files that can be filtered with the known asset metadata in packagereferences. <ItemGroup>
<PackageReference Include="Contoso.Utility.UsefulStuff" Version="3.6.0">
<ExcludeAssets>Symbols</ExcludeAssets>
</PackageReference>
</ItemGroup> Maybe it'd be possible to make it Debug/Release Config dependent whether or not symbols are included. |
Not much really. Sourcelink tries to retrieve git remote origin and puts that url through a parser and sees what fits. Getting the current head is also not much effort. |
In addition to the concern around size increase being essentially breaking, anything that discloses new source information in binary is absolutely breaking. |
I mean breaking if defaults change. |
Opt-in via a single property then, solves the defaults issue. |
This is status quo for those, yes? |
Please don't call it "ConfigureOpenSourceDefaults", though. Employers of the world will lose their collective s****, if i start "enabling open source" just to link PDB with private repositories. |
Yes, and that's what I'm trying to change, at least with a single property to opt-in to a better stance for OSS libraries. |
I'm sure we can come up with a name. I think the key point here is that there are different security needs for closed source vs open source projects. The SDK can have different default guidance and behavior for each based on a single configuration property. |
So are we talking just the fact that the PDB-File isn't inside the assembly file instead in a seperate file in the same directory is enough for it to significantly slow down debug startup? I feel like this is vital to the discussion and the original post wasn't clear enough about this for me to understand. |
My hot takes: Source Link on by default: No. We are looking at this from the perspective of developers who build open source code. Meanwhile the majority of .NET apps are close source. PDBs embedded in DLL: I like embedding the PDB in the NuGet package alongside the DLL more. It gives the developer the option to not deploy PDBs when publishing then bin directory. I think the defaults today are fine. There are improvements I'd like to see:
|
@JamesNK what would be the benefits of having sourcelink in the SDK, if it isn't on by default?
Why put stuff inside Microsoft.Net.Sdk, if this forces you to put Sourcelink under .NET Sdk release cycle, when you gain nothing from it usability-wise? EDIT: Even worse: You put a formerly nuget restored msbuild sdk, that can referenced at will into an Sdk, which has to be installed manually... Scenario bound to happen: "I want to use sourcelink with fancy new hosted git provider, but it's bundled into the Microsoft.Net.Sdk now, so i now have to wait until Appveyor has .NET Sdk 3.3.700 images..." |
I mean that configuration to enable Source Link easily with a property would be in the SDK. Disclaimer: I'm not an SDK expert. |
The process of retrieving PDB files from symbol server is very slow today. I'm not aware of any noticeable debugger related perf issues between embedded pdbs and pdbs next to the file. |
I guess I'm aiming to introduce a notion of "actionable best practices" that are easy to enable. Different audiences will have different profiles, but I could imagine that there's one set of settings that are oriented around OSS and another that's for internal. |
Well there's two main ways of distributing Sdk packages currently:
For sourcelink to work you need the sourcelink msbuild task binaries, which do the job of resolving current head, remote urls and the matching raw source file endpoints. (Correct me if i'm wrong about stuff, Sdk overloads 😄 ) |
@MeikTranel depending on the mechanism used, it could be possible to override the SDK's version of the tools with an updated version. Also, the SDK does seem to ship updates on a regular cadence, so it'd get picked up that way too. |
@onovotny yeah global.json is your friend here, but pinning versions for Microsoft.Net.Sdk doesn't magically install it. for anyone that doesn't use hosted build agents this means that any requirement to the sdk has to revolve around asking Ops teams to update build images everytime a useful tweak to the sdk comes along. |
It would be ironic if just when we can provide 2rd party nuget source server support we decide to turn it off by default. However changing the default so that you use snupkg (rather than symbols.nupkg (you seem to need to set the variable today).
We should at least document the experience as it is today, as well as how we want the default to be, so that they can be compared properly before making a decision. Finally I note that the issue with delays in visual Studio (or any other debugger), being slow looking up symbols from symbol servers is really a bug in the debugger. Visual studio definally caches the result and is fast (does not hit the network, but can be slow if there are DLL that it COULDN'T find. It really should cache this negative result return quickly in that case as well. Anyway, we should not let bugs in other components drive the design/defaults of roughly unrelated things. We shoudl instead log an issue to get things fixed properly. |
cc @rrelyea |
Embedded pdb doesn't show full callstack info in net framework < 4.7.2, it's a only a problem. For netcore I think we can use embedded as default. It's really nice proposal. |
IMO new templates on v3 should include the msbuild property to enable all the ins-and-outs of this feature. v2 projects could add the property to opt-in to the behavior without an upgrade. |
The DLL size increase is not large, platforms where this package would be deployed can support it, and it improves debuggability by default. See discussion at dotnet/docs#9110 (comment) and dotnet/sdk#2679 (comment).
I think we should create a separate issue for #2679 (comment) and close this one. |
given netcore3 has sailed. should this be renamed? or closed? |
@nguerrera can u point tot the doco that explains how embedded symbols impact impact virtual memory usage? my understanding was the runtime was smart enough to ignore that part of the assembly until it was needed? |
….1 (#2679) - Microsoft.DotNet.Cli.Runtime - 5.0.100-alpha1.19457.1
* Re-added SourceLink support Getting this to work was quite a challenge. I started out with the minimal example described at https://blog.jetbrains.com/dotnet/2020/02/03/sourcelink-consuming-apis-nuget-dependent-code/ and the guidance at https://github.com/dotnet/sourcelink/. This creates a .snupkg (containing symbols) next to the .nupkg, both of which need to be uploaded to nuget.org. At the moment, nuget.org is the only package registry that supports this. It does not work for local feeds, AppVeyor or any other registry (https://stackoverflow.com/questions/54211644/publish-snupkg-symbol-package-to-private-feed-in-vsts). And requires an internet connection during debug, as there is currently no command to fetch all symbols upfront. After uploading to NuGet.org, stepping into sources did not work on my machine (VS Debug settings: Enable Source Link support checked and Enable Just My Code unchecked). Then I enabled the NuGet.org Symbol Server, which started downloading symbols for all .NET Core packages during my debug session, which took forever so I aborted. It turned out that in the past, SourceLink worked by including the PDB in .nupkg, but [guidance was changed](dotnet/sourcelink#255) to use the nuget.org symbol server instead. I installed the sourcelink global tool (https://www.meziantou.net/how-to-debug-nuget-packages-using-sourcelink.htm) and ran it with the `test` switch, which failed and reported that AssemblyInfo.cs could not be found. This sounds reasonable, because the file is auto-generated by the .NET Core build chain. As discussed [here](dotnet/sdk#2679), symbol server support is not working very well for numerous reasons. So I reverted to [including the PDB](https://www.hanselman.com/blog/ExploringNETCoresSourceLinkSteppingIntoTheSourceCodeOfNuGetPackagesYouDontOwn.aspx) instead of using symbol server (using `$(AllowedOutputExtensionsInPackageBuildOutputFolder);.pdb</AllowedOutputExtensionsInPackageBuildOutputFolder>`) and setup a local nuget feed. This worked, but only because the DLL and PDB contained hardcoded paths to my sources on disk. There was no prompt to download sources from GitHub. Renaming the JADNC directory or adding `<DeterministicSourcePaths>true</DeterministicSourcePaths>` (which changes paths like `c:\source\repos\JsonApiDotNetCore\...` to `\_\JsonApiDotNetCore\...`) broke the experience. I had to manually browse to the PDB file on disk (via Modules, Load Symbols) to be able to step into sources again. The only way I got this working without intervention was to embed the PDB inside the DLL, using `<DebugType>embedded</DebugType>`. Note you need to push first, so that the commit hash stored in .nupkg exists on GitHub. And have Enable Just My Code unchecked in VS Debug settings. Other changes in this PR: - Fixed package tags (must be single words, because tags are space-delimited in .nuspec) - Added description (shown in VS Package Manager UI) - Removed RepositoryType and RepositoryUrl, so they are calculated (and use the URL of my fork) * Removed unneeded folder in project file * Empty commit to restart TravisCI * Empty commit to restart TravisCI
I have a bit of a small somewhat outside of scope question here regarding embedded pdbs. I like the benefits of embedding the pdb, but in CI (e.g. Azure Pipeline PublishSymbols@2 task) we are also using source indexing which requires (as far as I can tell) the pdb files to be available as separate files, which they are not when embedding, how do you solve this then? |
Isn't source indexing (with pdbstr.exe from Debugging Tools for Windows) of managed code pretty much obsoleted by Source Link? |
@KalleOlaviNiemitalo that may very well be. Source link requires coupling your project to a specific package for a specific cloud provider though, doesn't it? And we also publish indexing to a network drive. |
DebugType=embedded vs .snupkgThe NuGet docs promote the use of
However, they also suggest VS needs updating with the correct symbol server URL. Why can't VS be preconfigured to do this for us? I'd hate to add 30% to the Better DefaultsDotNet.ReproducibleBuilds is a fairly new (not very well publicised yet) package that sets several defaults for NuGet packages, including source link and Another approach might be to introduce a new SDK that does all of the above and a new project template <Project Sdk="Microsoft.NET.Sdk.NuGet">
</Project> |
Given symbols package are not really a thing outside of nuget.org, as most (all?) the private feeds don't support them, I think this should be reconsidered again. While it has minor disadvantages to use embedded symbols for public packages on nuget.org (I believe there use to be a .NET design document expanding on this downsides but can't find it anymore), I believe changing the default would not have a major impact as proper documentation can help to clarify the recommendation which I am sure will be followed by the popular packages which are the ones that could have impact in the ecosystem. When using private feeds it is currently not easy to understand how to properly setup packages to have debug symbols as well as to get SourceLink to fully work. There has been major development by the VS team for SourceLink recently (navigate to definition yay!) but I think what's missing to make it really mainstream is the ease of setup - it should "just work" out of the box. Since .NET 3.0 it is no longer possible to include tl;dr;
we should have a default that is "good enough" for all scenarios |
* Re-added SourceLink support Getting this to work was quite a challenge. I started out with the minimal example described at https://blog.jetbrains.com/dotnet/2020/02/03/sourcelink-consuming-apis-nuget-dependent-code/ and the guidance at https://github.com/dotnet/sourcelink/. This creates a .snupkg (containing symbols) next to the .nupkg, both of which need to be uploaded to nuget.org. At the moment, nuget.org is the only package registry that supports this. It does not work for local feeds, AppVeyor or any other registry (https://stackoverflow.com/questions/54211644/publish-snupkg-symbol-package-to-private-feed-in-vsts). And requires an internet connection during debug, as there is currently no command to fetch all symbols upfront. After uploading to NuGet.org, stepping into sources did not work on my machine (VS Debug settings: Enable Source Link support checked and Enable Just My Code unchecked). Then I enabled the NuGet.org Symbol Server, which started downloading symbols for all .NET Core packages during my debug session, which took forever so I aborted. It turned out that in the past, SourceLink worked by including the PDB in .nupkg, but [guidance was changed](dotnet/sourcelink#255) to use the nuget.org symbol server instead. I installed the sourcelink global tool (https://www.meziantou.net/how-to-debug-nuget-packages-using-sourcelink.htm) and ran it with the `test` switch, which failed and reported that AssemblyInfo.cs could not be found. This sounds reasonable, because the file is auto-generated by the .NET Core build chain. As discussed [here](dotnet/sdk#2679), symbol server support is not working very well for numerous reasons. So I reverted to [including the PDB](https://www.hanselman.com/blog/ExploringNETCoresSourceLinkSteppingIntoTheSourceCodeOfNuGetPackagesYouDontOwn.aspx) instead of using symbol server (using `$(AllowedOutputExtensionsInPackageBuildOutputFolder);.pdb</AllowedOutputExtensionsInPackageBuildOutputFolder>`) and setup a local nuget feed. This worked, but only because the DLL and PDB contained hardcoded paths to my sources on disk. There was no prompt to download sources from GitHub. Renaming the JADNC directory or adding `<DeterministicSourcePaths>true</DeterministicSourcePaths>` (which changes paths like `c:\source\repos\JsonApiDotNetCore\...` to `\_\JsonApiDotNetCore\...`) broke the experience. I had to manually browse to the PDB file on disk (via Modules, Load Symbols) to be able to step into sources again. The only way I got this working without intervention was to embed the PDB inside the DLL, using `<DebugType>embedded</DebugType>`. Note you need to push first, so that the commit hash stored in .nupkg exists on GitHub. And have Enable Just My Code unchecked in VS Debug settings. Other changes in this PR: - Fixed package tags (must be single words, because tags are space-delimited in .nuspec) - Added description (shown in VS Package Manager UI) - Removed RepositoryType and RepositoryUrl, so they are calculated (and use the URL of my fork) * Removed unneeded folder in project file * Empty commit to restart TravisCI * Empty commit to restart TravisCI
I know, the discussion is already pretty old but as it is still open and as I was wondering about it .. :) I don't see a benefit for changing the DebugType from "portable" to "embedded" in any case. Symbols and sources are only needed with diagnostics tools, like Debuggers or for stack traces on crashes . Means, they are required only for a specific purpose in case of a concrete "Issues". For most of the time that an application is working, they are not needed and for this reason, symbols should not be embedded at all.
So, all in all, I only see drawbacks by embedding symbols. |
Let's say you have a web server running in production, you do want symbols here. It used to be a common myth that you didn't want symbols for production, but this was mostly confusion. While you typically don't want to output the full stacktrace to users, this is a different thing from if you want to ship symbols. Symbols will give you better stacktraces, which you'll be collecting privately. Let's say you don't want symbols in for example a command line tool you are distributing, this is a publishing concern. You'd want to say "strip symbols please", and remember, symbols can be embedded by other library authors, so how else would you opt-out? Embedding symbols reduce friction at development time, and for enterprises make things easier as they don't need to think about "does my privately hosted commercial nuget server support snupkgs", everything will "just work". Separating these into 2 packages just makes the acquisition lazy, which I get can save bandwidth over time, but is a really odd tradeoff IMO. Overall, there's no technical merit in using snupkgs over embedded symbols. |
side note: u can opt in to trimming symbols if you really dont want them https://learn.microsoft.com/en-us/dotnet/core/deploying/trimming/trimming-options?pivots=dotnet-8-0#remove-symbols |
The discussion is not about "if you want or not want symbols". I think, it is totally clear why symbols are needed.
IMO, it is the contrary (also revisit my arguments from above). If you want to have a private NuGet package server, you have to set it up anyway and enabling symbol package support is very little effort. If you do not need a private package server, you can also make your symbol packages publicly available, e.g. on nuget.org. For pushing snupkgs to the server from some client, e.g. a build agent, there is not a difference at all. If there is a symbol package next to the NuGet package, it will be automatically pushed to the server, too. |
I was addressing your opening remark which says that symbols are useful for diagnostic tools. I think this might be a critical part of the discussion, because it's easy to draw different conclusions depending on what you think we should be optimizing for. My stance is that symbols are the norm, you want them at development, you want them at test, you want them in production. It's the exception to remove them. With this as a starting point, it's not logical to have them separate. Yes you can, but it is adding unnecessary moving parts that only benefit folks who have a scenario where they want to strip symbols. This is odd to me. |
@slang25 okay, I do understand now what you wanted to point out.
Yes, on a crash/unhandled exception (and only then) u either have to be prepared to automatically generate minidumps for being able to make immediate sense of a stack trace or you lose the convenience to get all valuable information from the stack at the time of the crash. Still, you can diagnose all libs from production afterwards by using a symbol server, i.e. during analyzing the minidump or by reproducing the error with the very same libs + symbols. There is also no difference with
As it makes no difference from usage point of view, if symbols are embedded or automatically taken once from a symbol server when required, Imo it is preferable to not embed symbols by default as you do not lose much convenience but instead can achieve some gain. However, I also see scenarios where you want the stack trace information immediately from production and you don't care about some little extra resource usage of a single application. In this case, embedding symbols can be more convenient. What should be the default behavior seems to me, now, highly subjective :) |
I'm very bias here as I work for Sentry. But I was in favor of embedded by default for the longest time, and now I'm not. Here's why: In development, IDEs already do a great job at fetching debug files when needed. We all agreed there are reasons not to include the debug data directly in DLL, like disk space, downloading extra bytes, serverless cold start (download size again) etc mentioned above. But it turns out all you need is to make sure your error/crash reports include the metadata needed in order to fetch symbols after the fact. And reconstruct the stack trace with line numbers.
At Sentry you can either upload your PDBs, or Sentry will fetch them from a symbol server (your own, or some built-in ones. Including nuget.org). So your app doesn't need to have the PDBs in production and you still get good looking stack traces. If In summary now I think we should just default to spit out DLL + PDB and make sure we publish nupkg + snupkg. And hook up a tool that gives gives you the context you need to figure stuff out. |
Inspired from this issue and twitter thread
dotnet/docs#9110
https://twitter.com/rrelyea/status/1064587440935956482
I want to propose that the default
DebugType
in the SDK change fromportable
toembedded
. I would also propose that Sourcelink be included/used by default.At the very least, there should be one property
ConfigureOpenSourceDefaults
that turns these on as best practices.My rationale is driven by the pit of success. I believe that out-of-the-box, developers should produce fully debuggable code. This is similar to how JS typically generates/includes sourcemaps by default, so there is precedent.
Few notes:
Some things like Appx/MSIX packaging strip-out pdb files by default. They rely on extra properties (
AppxPackageIncludePrivateSymbols
) to include them, which I'm sure 99% of people wouldn't know to add. Without them, the stack traces aren't as useful.Symbol servers may be blocked by corporate firewalls, or otherwise affected by low-bandwidth connections (think airplanes, trains, mobile hotspots, and anywhere else with slower/spotty connections).
I believe that the extra size of the files with embedded pdb's is worth the trade-off for most cases as it presents the best debugging experience without any extra configuration. It just works.
Thanks
The text was updated successfully, but these errors were encountered: