Deadlocks (?) on Kestrel. #271
Comments
Just a theory here, but libuv is single threaded. If you're using blocking IO in your code, you might end up with a deadlock somewhere. |
@Flavien We don't run user code on the UV loop. |
@davidfowl Ok, that makes sense. |
I got a 502 Bad Gateway on Linux. |
We are getting Bad Gateway on Linux as well. After the RC1 update, our ASP.NET app hangs on the first request until Bad Gateway is returned to the user. The same happens with the scaffolded basic web app from "yo aspnet". However, it is fine on OS X. So it seems to be solely an issue on the Linux platform? |
To make any sort of progress on this bug we're going to need more specifics:
|
I notice that when I run a container, the COMMAND is |
OS: Ubuntu 14.04 64bit
The bug can be reproduced (at least on our machine) using the "yo aspnet" generator and choosing Web Application Basic (other templates, like the one with Identity and the MusicStore app from GitHub, seem to function OK).
Reproduce steps:
Last thing that was printed from Kestrel in the console was:
When using Verbose logging, the last print before "death" was:
Our application fails the same way. |
Maybe I know the reason. Otherwise, you should use |
Can you be more specific? What are you using to generate load? Are you just hitting f5 in the browser?
This is unrelated to the hang right? |
Yes, load in the browser.
It's just to mark the point at which it hangs, based on the output. |
Is this Mono only or does it repro on CoreCLR as well? |
Pinging the issue. Did you guys see similar issues with CoreCLR? |
On our setup, with CoreCLR Kestrel does not hang. |
@introsuit Yup, that sounds like a separate issue. If you can provide repro steps, please feel free to file another ticket for that issue. |
@introsuit Are you reproing this with current RC2 bits? |
@muratg I probably hit the same issue on RTM. It is rare; I have noticed similar symptoms twice. Here is some information about it: We have a service on ubuntu 16.06 x64, running RTM aspnet core. It is a typical asp.net app, serving some pages via MVC. And the EF error is like Not sure if it is a kestrel bug or a dotnet core bug. I didn't know how to dig deeper, and for the sake of our service I had to restart it. Any advice on what I should do the next time I see this happen? Thanks! |
@CesarBS Other than the Microsoft middlewares, my own ones are doing something like:
or
Basically just write the response. Is async/await related? |
@CesarBS and btw, in my case it is not just some requests timing out: once it happens, the entire app is effectively dead and every request times out. |
@txchen check context.Response.HasStarted before you attempt to modify or write to the Response after calling next. |
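A minimal sketch of the check suggested above, assuming a hypothetical middleware that adds a timing header after the rest of the pipeline has run. The class name TimingHeaderMiddleware and the header name X-Elapsed-Ms are illustrative only and do not come from this thread; HasStarted is the property named in the comment. Once a response has started, its headers have already been sent, and Kestrel typically rejects later changes with an InvalidOperationException.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;

// Hypothetical middleware for illustration; names are assumptions.
public static class TimingHeaderMiddleware
{
    public static async Task InvokeAsync(HttpContext context, Func<Task> next)
    {
        var stopwatch = Stopwatch.StartNew();

        // Run the rest of the pipeline (MVC, static files, and so on).
        await next();

        // Only touch headers if nothing has been sent to the client yet.
        // If a later component already started the response, skip the
        // header instead of risking an exception.
        if (!context.Response.HasStarted)
        {
            context.Response.Headers["X-Elapsed-Ms"] =
                stopwatch.ElapsedMilliseconds.ToString();
        }
    }

    // Registers the middleware through the inline app.Use overload.
    public static IApplicationBuilder UseTimingHeader(this IApplicationBuilder app)
    {
        return app.Use((HttpContext context, Func<Task> next) => InvokeAsync(context, next));
    }
}
```

In Startup.Configure this would be registered with app.UseTimingHeader() before the components whose responses it should observe. Whether a missing check like this can take down the whole app, rather than just the current request, is not established in this thread.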
@Tratcher thanks a lot for the tip. I will check that. But would that cause the entire app to die, or would it just fail the current request? I just want to raise the concern that there may still be an unknown deadlock bug. |
@txchen How large is your app? Are you able to provide us a minimal repro of what you're seeing? |
@CesarBS not a very large one; MVC has about 10 pages. Inside the app we also have some APIs, and RPS is about 50 - 100. Usually it is quite stable, but I have seen this dead state twice (in the last 3 weeks). I really want to find the repro steps as well, but I still cannot. The interesting part, as I said, is that the RecurrenTask is still working, but inside the task every EF operation throws an exception complaining that it cannot get a connection from the pool. I have double checked my code; it should not leak connections since I use using (context) everywhere. Even if something is wrong with EF, it should not affect rendering my home page, which has nothing to do with the DB. So I assume the root cause is either in dotnet or in kestrel. I don't know how to take a dump or similar on linux for dotnet core; if you can shed some light, I can try to capture something the next time I see it. |
@txchen It's probably not related to your current issue. |
Looks like no repro. If you're still seeing issues, please file another bug. Re: taking dump on linux, I think you can use gdb. Not sure how easy that would be. If you can run on Windows and hit the same issue, you can use procdump. |
@Bartmax Hi Bartmax, did you fix the issue? I am having similar symptoms. It looks like all outbound connections from the .NET process stop, including the connection to the DB using EF (same error as yours) and the connection to Solr. Ubuntu 16.04. Behind nginx. |
@hheexx this was opened in 2015. A lot of stuff has changed since then and no, I haven't had this problem for ages. |
I saw the date but I have very similar problems and no idea what it could be. Very strange. Thanks! |
Are you sure the request is not just timing out? |
I tried to make a repro but I'm failing miserably; it's too inconsistent.
When this happens, it's at the start of a request. In debug mode, no breakpoint is reached. If you pause and continue, the request will not take place until you make another request; then both requests fire together (but not always).
The error if you let it 'timeout' is
It happens very often but couldn't find a way to repro in a consistent way...
Projects with EntityFramework and database access seem to fail often.
This does happen with Kestrel, with and without HttpPlatformHandler.
This does happen with RC.
This does not happen with web listener.
Sorry, maybe this is not much help, but I get constant stalls on the applications I'm working on, so if you need some specific details, just let me know, I'm more than happy to help.