Struggling with high memory usage #14117
Comments
@rjpowers10 have you tried disabling the cache from the general site settings? I am guessing there is a cache issue, maybe too many cached objects in memory? |
@MikeAlhayek thanks for the suggestion but in both my examples (my application built on OC and a plain old OC blog) the cache setting is set to "from environment". |
Have you tried disabling all sites except the default one? This may help us understand what is causing the high usage. If the default site has normal memory usage, try enabling one of your blog sites and see if you can pinpoint the site. Then maybe start disabling one feature at a time to see if you can narrow the problem down to a single feature. This may give us more clues. Also, check your log file for any silent errors that could be causing memory consumption. Not sure if this will help, but it is something for now. Hopefully, @jtkech or @sebastienros have more ideas here. |
How much memory is on the system in total? Take a memory snapshot between two jumps and check the diff. Run in Release; the fact that I see VS might mean it is a debug session. |
My first example (my app built on OC) is a release build. My second example, which was a plain old OC site with the blog recipe, was in debug. The plain old OC site might just be a red herring, I just found it curious that memory didn't seem to go down after a GC collect. May not be relevant at all to my actual scenario. The machine has 16GB of memory. SQL Server is also running which takes about 2GB itself. With my monitor I see the memory getting down to 0 and then as expected, the system starts paging and my tests start timing out. I have a memory diff but I'm having trouble interpreting it. Wondering if anything jumps out to anyone. |
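For readers unsure how to read a snapshot diff, here is a minimal sketch of the idea in Python: compare per-type object counts and byte totals between two snapshots and sort by byte growth. The type names and numbers are made up for illustration and are not taken from the dumps in this thread; a real profiler does this for you.

```python
def diff_snapshots(before, after):
    """Return {type_name: (count_delta, bytes_delta)} for types whose totals changed.

    Each snapshot maps a type name to a (object_count, total_bytes) tuple.
    """
    deltas = {}
    for name in set(before) | set(after):
        b_count, b_bytes = before.get(name, (0, 0))
        a_count, a_bytes = after.get(name, (0, 0))
        if (a_count, a_bytes) != (b_count, b_bytes):
            deltas[name] = (a_count - b_count, a_bytes - b_bytes)
    return deltas


def top_growth(deltas, n=3):
    """Return the n types with the largest growth in bytes, largest first."""
    return sorted(deltas.items(), key=lambda kv: kv[1][1], reverse=True)[:n]


# Hypothetical snapshots (illustrative values only).
before = {"System.String": (1000, 46000), "TypeA": (10, 800), "TypeB": (5, 500)}
after = {"System.String": (5000, 230000), "TypeA": (10, 800), "TypeC": (2, 64)}

deltas = diff_snapshots(before, after)
growth = top_growth(deltas, n=2)
```

Types near the top of the growth list across several consecutive diffs are the ones worth tracing back to their GC roots.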
I'm going to try limiting my test suite. Maybe there is one test in particular that exemplifies the problem. |
RouteEndpoints in the diff would be explained by a tenant being loaded. Coming back to your original description, the issue is that your test is taking 8.5 GB. Can you try the same thing with ASPNETCORE_ENVIRONMENT set to "Production" instead of "Development", just to see if that has an impact? Can you also take a memory dump when the app is close to the end of the run to see what is still kept in memory? |
I ran the test suite again, this time with ASPNETCORE_ENVIRONMENT set to 'Production'. It didn't seem to make a difference. The tests finished in 1 hour, 15 minutes. This is what it looked like at the end of the test run. This is the memory dump at the conclusion of the run. |
Our memory issues are consistent with this as well. These are our worst offenders in terms of memory usage from our latest performance testing while trying (and failing miserably) to run with 1000 tenants.
|
Time to do another pass at it! I will try with my own "script" but I'd like to be able to repro something that matches your experience. @rjpowers10 how many tenants are you loading and what features do they use? Ideally if you could do the same experiment and confirm it with the smallest possible list of features that would help. @ShaneCourtrille thanks for the confirmation. I'd love to achieve a density of 1000 on 8GB of RAM, we were there at the beginning, but obviously more features and complexity have been added since then. |
@sebastienros I'll try to help as much as I can though it may be difficult to repro my situation. We built an eCommerce platform on top of OC and wrote many custom features. Because I'm having trouble figuring out the offending code it's hard for me to say if it's an issue in my code or OC. My tests do a lot of the typical eCommerce things (sign in, register, view product, add to cart, checkout, order history, etc.). I wonder if something is sticking around after each ViewResult so it's simply the fact that my tests are making a lot of requests. That's why I tried a simpler test with a plain old OC site and the blog recipe, but it's hard to say if that really replicated the issue. List of OC features we have enabled:
|
@rjpowers10 no tenants? |
@sebastienros Oh sorry, we do have two tenants, one for humans and one for automated tests. All the tests are hitting the same tenant. The tests are also running serially. The tenant for humans usually isn't doing much at this time though, since the tests run overnight. |
@rjpowers10 Out of curiosity.. have you taken a look at your strings? I'm doing some analysis and found an oddity where an estimated 80% of strings that are 46 bytes long are the value "mvc.1.0.view" being repeated over and over. What's really painful is that these strings (or at least the ones I've been able to sample since gcroot is such an expensive operation) don't have a gcroot but are just sort of sitting around. |
There are a ton of strings in my memory dump, although only four instances of "mvc.1.0.view". EDIT: Whoops, read the memory dump wrong. I have over 260,000 occurrences of "mvc.1.0.view". |
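As a rough sanity check on the numbers above, the sketch below estimates the footprint of those duplicates. It assumes the commonly cited 64-bit CLR string layout of 22 bytes of fixed overhead plus 2 bytes per UTF-16 character; that assumption is not stated in the thread, but it is consistent with the 46-byte size reported for "mvc.1.0.view". The function name is hypothetical.

```python
def clr_string_size(text, fixed_overhead=22):
    """Estimate the heap size in bytes of a .NET string.

    Assumption: 64-bit runtime, 22 bytes of fixed overhead plus
    2 bytes per UTF-16 code unit.
    """
    utf16_units = len(text.encode("utf-16-le")) // 2
    return fixed_overhead + 2 * utf16_units


size = clr_string_size("mvc.1.0.view")  # 12 characters -> 22 + 24 = 46 bytes
total_bytes = size * 260_000            # occurrences reported in the dump above
total_mib = total_bytes / (1024 * 1024)
```

Under that assumption, 260,000 copies come to roughly 11.4 MiB for this one string on a site with two tenants.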
When you are in Development (there is an env variable) and/or in debug mode under Visual Studio, the GC may not work as expected; I have already seen this kind of memory increase. Try building in Release (maybe not necessary), or at least start the app from the command line and then look at the Task Manager. Then it may depend on your machine to decide when you are under memory pressure or not. |
@jtkech is right, the debugger will keep some variables "rooted" so you can inspect them and hence prevent them from being collected. Use a tool like PerfView or dotnetMemory profiler to take snapshots of the memory from a release run that you can then analyze in these tools. |
I could repro the same result with release mode and |
@wAsnk Can you repro in a deployed environment? You can use dotnet dump collect to get a memory dump and then look at it locally in VS. |
When I have time I will re-install PerfView and look at it in more depth; for now my Visual Studio doesn't allow me to take 2 consecutive memory snapshots. In the meantime I could repro by just looking at the memory usage: when doing a lot of tenant releases the GC works as expected, and in my case it stabilizes around 700 MB (but in debug mode). But yes after having enabled the I will look at this asap. Did you see the same behavior when enabling another feature? @ShaneCourtrille and others About the But yes, all |
Resolving Note: Maybe still a memory problem with typed HttpClient but less critical if only built as needed. |
No, I couldn't repro with every other feature on.
Sounds good! |
This reminds me of this PR: dotnet/corefx#19082, might have some useful insights there. |
We need to use |
So the problem is because of an mvc action filter that resolves on each request To fix the issue I only needed in For now the only thing I needed to update is only one line, I will commit it to show what I did. |
Here is what I did for now: #14335 |
@rjpowers10 I'm still not done looking but my initial checks look good for the mvc.1.0.view fix and in our case that saves us 500MB of memory alone. /cc @jtkech (FYI) |
@sebastienros Not related ;) I made good progress in #14348 for the Reloading part; it is not fully tweaked but can already be tested. Note: I was wrong, the tenant is not reloaded on Features change, it is only released and then rebuilt based on the updated shell descriptor. The tenant is reloaded from the Admin Tenants UI or by code. But maybe you are reloading tenants in your tests; anyway it is worth retrying as I made other changes. I didn't have time to focus on the razor strings yet, but I will; I haven't yet seen the meeting where it was discussed. Note: I just saw a razor hot reload that clears caches on application events (so not tenant events), maybe a good path to follow. |
@jtkech could the actual issue with HttpClient be that it is used without being disposed? Meaning doing this: var client = _httpClientFactory.CreateClient(); // never dispose the client
instead of: using var client = _httpClientFactory.CreateClient(); or
|
@rjpowers10 @ShaneCourtrille and others. Here is a little summary of what has been done in #14348, if you can give it a try.
About razor view compiled items, we already have a static shared compiler provider, but we only use it if razor runtime compilation is enabled; now in #14348 we always register it.
We still load the Doing so it seems that the GC behavior is better, here after many Tenant Release / Reload. |
@jtkech Awesome! I will test what I can, but I can only really test the stuff that can be overridden in DI. I've just been copy-pasting your PR changes into my code, since I will need these fixes before the next OC release. |
@jtkech I tested it in our own solution with local NuGet files and it looks really good. I will do further testing in the upcoming days. |
@rjpowers10 No problem, no pressure, when you will have time. @wAsnk Okay cool, thanks for testing, I cross my fingers. |
@rjpowers10 @ShaneCourtrille @MikeAlhayek @wAsnk @sebastienros I saw one of the last meetings related to string allocations, interesting. For info, by using instance ids under the debugger we can see that compiled items loaded from assemblies hold different instances of the In #14348 we now override how compiled items are loaded and we can see that the same instance is used. I could also reduce the number of times the compiled items are loaded. Before, it was 3 times per tenant build (for example, the view descriptors are populated on demand by model providers, one from aspnetcore and one from orchardcore); now it is only once, by caching the descriptors built in a given shell scope. We can still have duplicate strings but they are no longer rooted and can be reclaimed, the only one holding these strings being our static compiler that we now always use, and now it only references one instance of |
Sorry, I've been meaning to run some more tests on my solution but I've been swamped this week. The PR has also grown considerably, to the point where it would be easier if I had NuGet packages. Maybe I will pull the branch and make some local packages to test with. |
@wAsnk Thanks a lot for your tests. Yes, I've made other changes, for example to not call But I think we are okay the main goal being to keep the memory reclaimable by the GC. What's your own conclusion? @rjpowers10 No problem, only when you will have time. |
Yeah, I agree. |
Okay cool then, thanks again for your tests. |
@jtkech When Database Shells Configuration Provider is used there is an exception on startup:
|
Oops, will look at it tonight, thanks. |
Okay, we try to reload to avoid rebuilding some configs; I will fix it tonight. |
Okay, fixed by #14499 |
For posterity, I updated my app to Orchard Core 1.8, the first release containing the fixes from #14348. In this graph you can see a big improvement in memory consumption between OC 1.7 and 1.8. |
I created a background task to collect garbage every minute
|
This seems unnecessary and potentially degrading for the overall performance, since it's blocking and, while it may reclaim memory, it uses CPU. Outside of debugging and otherwise very special scenarios, you shouldn't (have to) run the GC manually. Note that on servers the point is not to keep memory usage low, but to serve the most users with good performance. This may mean keeping memory usage high, but steady. This in itself is not a problem. |
This was a test from a long time ago. I'm going to try disabling it now and see what happens with automatic memory reclamation. |
Agreed with @Piedone that we shouldn't be calling |
Describe the bug
My scenario: We have a site built on Orchard Core 1.6. We have a pipeline that periodically deploys an instance to a test server and runs a suite of UI tests against it (the tests are using MSTest and Playwright, if it makes any difference). The site is running in IIS. The test suite runs for about 1 hour, and by the end of that run the w3wp process is consuming over 8.5 GB.
ASPNETCORE_ENVIRONMENT is set to "Development".
Here's a quick look at the memory usage over time. This is the memory of the entire machine, not just OC, but looking in Task Manager, I see w3wp using over 8.5 GB.
To Reproduce
I've been struggling for quite a while on pinpointing the root issue, with no success thus far. I've tried using the Visual Studio memory profiler but I've been struggling to make any sense of it.
I next tried a much simpler example. This next example is a plain old OC site (latest from the main branch) using the blog recipe. That's it.
ASPNETCORE_ENVIRONMENT is set to "Development". I then wrote a small PowerShell script to ping the site over and over.
I ran this script for a little while. Looking at the diagnostic tools in Visual Studio, I can see multiple garbage collector invocations (the yellow arrows), but the memory never seems to decrease. I was hoping to see a more jagged pattern in the memory usage.
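The original PowerShell script did not survive in this copy of the issue. For anyone wanting to reproduce a similar load, here is a minimal sketch of the same idea in Python (not the author's script): request one URL repeatedly and tally the status codes. The function names, the URL in the usage comment, and the request count are all assumptions.

```python
import time
import urllib.error
import urllib.request


def http_status(url):
    """Fetch the URL once and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses raise; report their status code instead.
        return error.code


def ping(url, count, delay_seconds=0.0, fetch=http_status):
    """Request the URL `count` times, serially, and tally status codes."""
    tally = {}
    for _ in range(count):
        status = fetch(url)
        tally[status] = tally.get(status, 0) + 1
        if delay_seconds:
            time.sleep(delay_seconds)
    return tally


# Example usage against a locally running site (hypothetical address):
#   print(ping("https://localhost:5001/", count=10_000))
```

The `fetch` parameter exists only so the loop can be exercised without a running site; watch the process memory in a separate tool while the loop runs.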
At this point I'm looking for any ideas or advice. Thanks.