Linux stack walking performance #3784
Comments
@jkotas, @janvorli, are there plans to move CoreCLR to a lightweight cross-platform unwinder as well, in an effort to drop the libunwind dependency? I found a related issue, dotnet/coreclr#872, but as far as I can tell, LLVM libunwind is already supported in the cmake script as a (degraded) fallback to the primary HP libunwind. I am not quite sure, but both of these libraries might be too heavy for CoreCLR's use cases as well. |
We do not have plans like that currently. CoreCLR already has its own, more lightweight unwinder. CoreCLR uses libunwind to unwind from manually managed code only. That is a rare case, so it does not matter much that it is relatively slow. Unwinding from manually managed code needs to support all unwind codes that the C/C++ compiler may generate, so limiting it to a subset of unwind codes to make it more lightweight is not very viable. |
@jkotas, thanks for your explanation. :) Now that mono and coreclr co-exist in the same repository, is it viable to use the mechanism that mono uses for manually managed code as a fallback to libunwind on platforms that do not have libunwind readily available (Solaris, QNX and so on)? I tried to port parts of libunwind to Solaris last year, and that port is now available in the very recent release candidate 1.5-rc1. However, although it compiles successfully, it crashes a lot during the tests due to some fundamental differences. Some work therefore still needs to be done by somebody who is more fluent with stack unwinding, as I am not familiar with uwcontext. Alternatively, if we can borrow some fallback implementation from mono, that would be equally helpful for porting coreclr to new platforms (with unwinding support as good or bad as mono's, which is probably more acceptable than not having it at all?) |
Another idea was to implement unwinding in C# using something like https://github.com/konrad-kruczynski/elfsharp. :) |
CoreCLR is sensitive to having a properly behaving stack unwinder for manually managed code. I doubt that switching from one poorly debugged libunwind implementation to a different poorly debugged libunwind implementation would actually fix the crashes that you are seeing. The ultimate fix would be to get rid of the coreclr dependency on libunwind by getting rid of |
That would not be sufficient to get rid of the dependency on libunwind. We also use libunwind in the first pass of managed EH to walk through the runtime native code that is in between. We need to do that to find possible native handlers of the exception. Only the 2nd pass actually lets C++ EH unwind the native frames. See |
Is there a way to sanity test this somehow from C#? That is, can we intentionally create a situation in C# that causes an exception on the CLR native stack (in manually managed code), in order to assess the robustness of the unwinder? |
All places that call Running all libraries tests is probably the best way to exercise sufficient number of these. |
@jkotas how much is a lot of work? Weeks? Months? Maybe years? It would be awesome to not have to care about things like libunwind. Maybe it's worth a treatise to find out what it would take to get rid of each, even if that is farmed out to the community. |
Months, for sure. Here is an example of what it takes to convert one FCall with HMF: dotnet/runtime#1929. There are 400+ of these. If each of them is a 50-line delta, this would be a delta of ~20,000 lines in total. We would also need to do something about the special uses of the unwinder like the one @janvorli mentioned, but that is probably a cheaper problem than converting all FCalls. |
Thank you. It was easier to spot at least one weakness in locating the native frame in pass 1 on SmartOS: dotnet/runtime#38373. It sounds like converting FCalls to QCalls is a good thing in general. There are 503 occurrences in total:
$ git grep '\[MethodImpl(MethodImplOptions.InternalCall)\]' :/src/coreclr/*.cs | wc -l
503
Folks are also porting/fixing libunwind for the QNX OS and the MIPS architecture for CoreCLR. If we instead combine the effort to get rid of libunwind in a few months, I think it will eventually improve the overall portability of the runtime. PS: like CoreRT, rust-lang also uses llvm-libunwind for some targets, but it is optional. By default it has a minimal implementation written in Rust to cater to its exception handling needs. Maybe we can also implement a minimal unwinder in C# for CoreRT, using elfsharp etc. |
That behavior is correct and equivalent to Linux |
The other important place, as I've mentioned before, is the set of cases where managed exception handling needs to unwind through native frames that sit between managed frames. One example of such a scenario is throwing an exception from a method that was called via reflection and catching it at the caller site. The call stack you get at the throw is below. You can see that there is a managed frame reflectioninvoke.Program.Test, then there are four native frames and then managed frames again. The unwinding will start at reflectioninvoke.Program.Test and the Windows-style managed unwinder will unwind to libcoreclr.so!CallDescrWorkerInternal. Then we will start unwinding using the libunwind unwinder and checking NativeExceptionHolders in the unwound range. Those holders represent places in native code where we can catch the PAL_SEHException that we use internally to propagate managed exceptions. The holders have an InvokeFilter method that is called to decide whether the exception will be handled by that native frame or not. If that returns true, we have found the handler and switch to the 2nd pass of exception handling, which doesn't use our libunwind (it uses standard C++ exception handling for native frames). We start again from reflectioninvoke.Program.Test in a similar manner, but once we reach the
|
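The two passes described in the comment above can be modelled roughly as follows. This is only a toy sketch in Python, not CoreCLR code: the `Frame` class, the `first_pass` and `second_pass` functions, and the `filter` callback (standing in for a managed catch clause or for a NativeExceptionHolder's InvokeFilter) are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Frame:
    """One stack frame in the toy model (hypothetical type, not a CoreCLR structure)."""
    name: str
    managed: bool
    # Stand-in for a managed catch clause or a native holder's filter.
    filter: Optional[Callable[[Exception], bool]] = None


def first_pass(stack: List[Frame], exc: Exception) -> Tuple[Optional[int], List[Tuple[str, str]]]:
    """Walk from the throw site outward without unwinding anything.

    Managed frames are walked by the managed unwinder and native frames by
    libunwind. Returns the index of the first frame whose filter accepts the
    exception (or None), plus a record of which unwinder walked each frame.
    """
    walked = []
    for index, frame in enumerate(stack):
        walked.append((frame.name, "managed unwinder" if frame.managed else "libunwind"))
        if frame.filter is not None and frame.filter(exc):
            return index, walked
    return None, walked


def second_pass(stack: List[Frame], handler_index: int) -> Tuple[List[Tuple[str, str]], List[Frame]]:
    """Unwind every frame below the handler found by the first pass.

    In this pass native frames are unwound by standard C++ exception handling
    rather than by libunwind. Returns the unwound frames and the remaining stack.
    """
    unwound = [
        (frame.name, "managed unwinder" if frame.managed else "C++ EH")
        for frame in stack[:handler_index]
    ]
    return unwound, stack[handler_index:]
```

With a stack such as `[Frame("Program.Test", True), Frame("CallDescrWorkerInternal", False, lambda e: False), Frame("Caller", True, lambda e: True)]` (the `Caller` frame and both filters are made up), `first_pass` reports the handler at index 2 and `second_pass` unwinds the two frames below it. The point of the model is that libunwind is needed only for the native frames in the first pass.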
We spend a lot of time in libunwind during stackwalking on Unix.
We should look into replacing LLVM libunwind with our own DWARF unwinder that does just what we need (i.e. it may not need to support all DWARF codes), without the unnecessary overhead.
Mono has prior art on this that may be useful: https://github.com/mono/mono/blob/1debf3934120547b3003c0ec4ec90bae4b08ee13/mono/mini/unwind.c#L515
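To make the "subset of DWARF codes" idea concrete, here is a minimal sketch in Python of an interpreter for a handful of DWARF call frame instructions. It is not taken from CoreCLR, Mono, or libunwind: `FrameRule`, `run_cfi`, and `caller_state` are hypothetical names, the opcode values are the ones defined by the DWARF standard, and the default alignment factors (1 and -8) are an assumption that matches typical x86-64 output. Any opcode outside the subset raises an error instead of being handled.

```python
from dataclasses import dataclass, field

# Opcode values as defined by the DWARF standard.
DW_CFA_nop = 0x00
DW_CFA_def_cfa = 0x0C
DW_CFA_def_cfa_register = 0x0D
DW_CFA_def_cfa_offset = 0x0E
DW_CFA_advance_loc = 0x40  # high two bits 01, delta in the low six bits
DW_CFA_offset = 0x80       # high two bits 10, register in the low six bits


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


@dataclass
class FrameRule:
    """How to compute the CFA and find saved registers (hypothetical type)."""
    cfa_reg: int = 0
    cfa_offset: int = 0
    saved: dict = field(default_factory=dict)  # register -> offset from CFA


def run_cfi(program, target_offset, code_align=1, data_align=-8):
    """Interpret CFI bytes (CIE initial instructions followed by FDE
    instructions) up to target_offset within the function."""
    rule = FrameRule()
    loc = 0
    pos = 0
    while pos < len(program):
        op = program[pos]
        pos += 1
        high, low = op & 0xC0, op & 0x3F
        if high == DW_CFA_advance_loc:
            loc += low * code_align
            if loc > target_offset:
                break  # the following rules apply only past the target
        elif high == DW_CFA_offset:
            factored, pos = read_uleb128(program, pos)
            rule.saved[low] = factored * data_align
        elif op == DW_CFA_def_cfa:
            rule.cfa_reg, pos = read_uleb128(program, pos)
            rule.cfa_offset, pos = read_uleb128(program, pos)
        elif op == DW_CFA_def_cfa_register:
            rule.cfa_reg, pos = read_uleb128(program, pos)
        elif op == DW_CFA_def_cfa_offset:
            rule.cfa_offset, pos = read_uleb128(program, pos)
        elif op == DW_CFA_nop:
            pass
        else:
            raise NotImplementedError(f"unsupported CFI opcode 0x{op:02x}")
    return rule


def caller_state(rule, regs, memory, return_reg=16):
    """Apply a rule: return (CFA, return address). `memory` maps address -> word;
    register 16 is the return address column on x86-64."""
    cfa = regs[rule.cfa_reg] + rule.cfa_offset
    return cfa, memory[cfa + rule.saved[return_reg]]
```

For a typical x86-64 prologue (`push rbp; mov rbp, rsp`) the CFI bytes would be `0c 07 08 90 01 41 0e 10 86 02 43 0d 06`, and `run_cfi` over those bytes yields CFA = rsp + 8 at offset 0, rsp + 16 at offset 1, and rbp + 16 at offset 4. Failing loudly on unknown opcodes is a deliberate choice in this sketch; as noted earlier in the thread, code emitted by a C/C++ compiler may use codes outside any such subset.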