
Need guidelines around max memory usage #393

Open
westonpace opened this issue Apr 23, 2021 · 2 comments

@westonpace

This is related to #383 and #351 although in this case I'm looking more for guidance than a fix. I understand that mimalloc may overallocate and hold onto RSS after the memory is freed and so a system will need more RAM available to avoid running out of memory.

However, it is not clear to me exactly how much RAM I will need. I am using mimalloc through Apache Arrow. In some cases, when processing an 8GB dataset, the RSS will grow to 27GB. This has made it difficult to figure out what kind of server we need to purchase / configure, even though the size of our data is known. It will be even harder for more sophisticated setups, such as a server that delays requests until enough RAM is free in order to avoid OOM errors.

It's possible this behavior is a bug or some kind of bad scenario that can be avoided (I'm going to work through the suggestions in #351). However, even if that is the case, I would still like to have a better idea of the maximum amount of RAM that I can expect mimalloc to use.

@daanx
Collaborator

daanx commented Apr 28, 2021

First of all, try to test with the 2.0.x version of mimalloc as that one tends to use less memory and release more aggressively.

However, mimalloc generally holds on only to virtual memory and returns physical memory to the OS. It flags unused memory as available to the OS (MEM_RESET on Windows, MADV_FREE on Linux), and the OS will reclaim that memory when there is memory pressure -- however, until then the OS does not always show that memory as available (even though it is), since it is only reclaimed under pressure.
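This MADV_FREE behavior can be demonstrated directly. A minimal Linux-only sketch using Python's mmap module (mimalloc itself does the equivalent in C via madvise(2)); the sizes here are arbitrary illustration values:

```python
import mmap

# Pages marked MADV_FREE stay counted in RSS until the kernel actually needs
# them, which is why monitoring tools can report "used" memory that is in
# fact reclaimable.
buf = mmap.mmap(-1, 16 * 1024 * 1024)   # 16 MiB anonymous mapping
buf[:] = b"\x01" * len(buf)             # touch every page so it becomes resident
if hasattr(mmap, "MADV_FREE"):          # requires Linux 4.5+ and Python 3.8+
    buf.madvise(mmap.MADV_FREE)         # lazily reclaimable, but still in RSS
buf.close()
```

After the madvise call the pages still show up in the process RSS; they only leave it when the kernel comes under memory pressure.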

If you set the MIMALLOC_RESET_DECOMMITS=1 environment option mimalloc will decommit the memory more aggressively so the actual memory usage becomes more apparent (but it is slightly more expensive to decommit vs reset).
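For example (a hedged sketch -- `./your_app` is a placeholder for whatever binary links or preloads mimalloc):

```shell
# Enable aggressive decommit for this one run; the variable only needs to be
# visible to the process that has mimalloc loaded.
MIMALLOC_RESET_DECOMMITS=1 ./your_app

# Optional: MIMALLOC_VERBOSE=1 makes mimalloc print its settings at startup,
# so you can confirm the override took effect.
MIMALLOC_VERBOSE=1 MIMALLOC_RESET_DECOMMITS=1 ./your_app
```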

Still, the peak memory usage is mostly determined by your program and its actual allocation pattern. This is easy to test by also running without mimalloc and comparing the memory usage.
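One way to run that comparison on Linux is GNU time's verbose mode, which reports peak RSS, together with an LD_PRELOAD override of the allocator (a sketch; the library path is an assumption, so adjust it to where libmimalloc.so lives on your system):

```shell
# Baseline: system allocator.
/usr/bin/time -v ./your_app 2>&1 | grep 'Maximum resident set size'

# Same program with mimalloc preloaded over the system allocator.
LD_PRELOAD=/usr/lib/libmimalloc.so /usr/bin/time -v ./your_app 2>&1 | \
    grep 'Maximum resident set size'
```

Note this needs the standalone `/usr/bin/time` binary, not the shell builtin `time`, since only the former supports `-v`.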

@westonpace
Author

Thanks, running again on 2.0.x does seem to be much more aggressive. My current workload peaks at ~10GB RSS with the system allocator and ~16GB with mimalloc < 2. With version 2 I get a max RSS that's nearly the same as the system allocator's. I was running into OOM-killer issues on a 16GB server, so I don't think it was RAM that had merely been MADV_FREE'd.

This is good. Thank you for your help so far. However, this still doesn't address the broader problem. Using 16GB instead of 10GB seemed like a bug, but since it stabilizes and there is no guideline on how much RAM mimalloc "should" be using, I don't really know for sure. A potential use case I have is building a data analysis server. Requests come in and run analytics on some portion of the data. I can determine how much data a request requires (in bytes), so I need to figure out whether I have enough RAM to process the request right now or whether I need to block until one of the currently running requests finishes. Ideally I'd like to avoid getting close to 100% RAM utilization and would probably target something like 90%.

So a concrete example of the challenge I am facing is...

- Currently I have 13GB free.
- A request comes in that I know will use 10GB of data + 100MB for storing the results, intermediate allocations, etc.
- Can I process the request now?

- If I am just using malloc then I know I can do so.
- If I'm using mimalloc then it becomes very tricky to know.

If 60% overhead is "technically possible" then I have to play it safe and queue the request. However, if 60% is "definitely a bug" (possibly in the way I am using the allocator) then I can be aggressive, and fix my allocation behavior / upgrade mimalloc when bugs are encountered.
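One way to make the decision above concrete is to gate requests on a measured, per-workload overhead factor rather than a guaranteed bound. A hedged sketch: the 1.6x factor is an assumption taken from the 16GB-vs-10GB numbers observed earlier, not a documented mimalloc guarantee, and the class and constant names are made up for illustration:

```python
import threading

# ALLOCATOR_OVERHEAD is an assumed, empirically measured worst-case factor
# (1.6 ~= the 16GB-vs-10GB ratio seen above); mimalloc documents no hard
# bound, so measure this for your own workload.
ALLOCATOR_OVERHEAD = 1.6
TARGET_UTILIZATION = 0.9          # stay below ~90% of physical RAM

class AdmissionController:
    def __init__(self, total_ram_bytes):
        self.budget = int(total_ram_bytes * TARGET_UTILIZATION)
        self.in_flight = 0
        self.lock = threading.Lock()

    def try_admit(self, request_bytes):
        """Admit a request only if its worst-case footprint fits the budget."""
        worst_case = int(request_bytes * ALLOCATOR_OVERHEAD)
        with self.lock:
            if self.in_flight + worst_case <= self.budget:
                self.in_flight += worst_case
                return True
            return False          # caller should queue and retry later

    def release(self, request_bytes):
        with self.lock:
            self.in_flight -= int(request_bytes * ALLOCATOR_OVERHEAD)
```

With the scenario above (13GB free, a request needing 10GB + 100MB), a 1.6x factor implies a worst case of ~16GB, so the request would be queued; if the real overhead were known to be near zero, it could run immediately. The factor is exactly the missing guideline this issue asks for.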
