High memory traffic from MemoryBlockStream #725
Can you try these methods yourself and report back to me on how they perform? I don't have access to tooling to compare (one of the downsides of developing on a Mac). Thanks!
I've got a patch to use ArrayPools that I'm about to land. Perhaps you can test it. If it doesn't solve the problem, could you try replacing uses of MemoryBlockStream inside of MimeParser.cs (and in ImapFolder.cs?) and see if that solves the problem?
Regarding MemoryStream.set_Capacity: can you find out where messages are being written to a MemoryStream? As far as using …
Question on ArrayPool: how many buckets does an ArrayPool allocate? Just the one? Or will it create a number of buckets as it needs them, but just not go above a hard-coded threshold?
I ran the DefaultArrayPool constructor because it is hard to read the bit operations they have. For these values, var maxArrayLength = 2048; var maxArraysPerBucket = 200;, 8 buckets are generated: Bucket 0 = 16, Bucket 1 = 32, Bucket 2 = 64, Bucket 3 = 128, Bucket 4 = 256, Bucket 5 = 512, Bucket 6 = 1024, Bucket 7 = 2048. Buckets are only containers for which arrays will be allocated, so only Bucket 7 will be used by MemoryBlockStream.
As for maxArraysPerBucket, it would be a good idea to make it configurable. For embedded systems, 2048 * 200 ~ 400 KB may be too much, and the problem with that size is that it will never be released by the array pool.
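For reference, here is a minimal sketch of how a pool configured with those values behaves; the class and variable names are illustrative, and the 16-byte starting bucket size reflects the DefaultArrayPool internals described above:

```csharp
using System;
using System.Buffers;

class ArrayPoolDemo
{
    static void Main()
    {
        // Pool configured as described above: buckets double from 16 bytes up
        // to maxArrayLength, and each bucket may cache up to 200 arrays.
        ArrayPool<byte> pool = ArrayPool<byte>.Create(
            maxArrayLength: 2048, maxArraysPerBucket: 200);

        // Renting 2048 bytes always hits the last (2048-byte) bucket; the
        // smaller buckets exist but stay empty because nothing rents them.
        byte[] block = pool.Rent(2048);
        Console.WriteLine(block.Length); // 2048

        // Returning the array caches it in the bucket for reuse; up to
        // 200 * 2048 bytes (~400 KB) can stay resident in that bucket.
        pool.Return(block);
    }
}
```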
Ah, so that's what they mean by buckets. OK. It sounds to me, then, like ArrayPool is not the ideal solution to this problem, because all of the other array sizes will go unused. I think a better solution would be to use something like an ArrayPool, but with a single buffer size.
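As a rough illustration of that idea, a single-size pool can be as simple as a thread-safe bag of 2048-byte arrays. The BlockPool name and API below are hypothetical, not anything in MimeKit:

```csharp
using System.Collections.Concurrent;

// Hypothetical single-size block pool; 2048 matches the block size used by
// MemoryBlockStream, but nothing here is actual MimeKit code.
sealed class BlockPool
{
    public const int BlockSize = 2048;
    readonly ConcurrentBag<byte[]> blocks = new ConcurrentBag<byte[]>();

    // Reuse a cached block when one is available, otherwise allocate.
    public byte[] Rent()
    {
        return blocks.TryTake(out var block) ? block : new byte[BlockSize];
    }

    // Give the block back so the next stream reuses it instead of allocating
    // (and later garbage-collecting) a fresh array. Note that this simple
    // version never trims, so the pool grows to its high-water mark.
    public void Return(byte[] block)
    {
        if (block != null && block.Length == BlockSize)
            blocks.Add(block);
    }
}
```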
Keep in mind that the arrays in the buckets are not pre-allocated. Only when an array is requested with Rent is it created and added to the bucket. MemoryBlockStream will request only 2048-byte arrays, so the other buckets will stay empty and no memory will be wasted on them. Here is the bucket code.
Regarding MemoryStream.set_Capacity: I am using the ImapFolder.GetStreams method and then constructing a MimeMessage with MimeMessage.Load for each stream. Then I use a custom MimeVisitor to get all of the message parts as MimeEntity objects, and for every MimeEntity I call mimeEntity.Headers.WriteTo to save the header and body in my database. MemoryStream.set_Capacity is called from the WriteTo methods. Is there a more efficient way to do that?
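For context, here is a rough sketch of that flow, assuming MimeKit's MimeVisitor; the PartCollector class is hypothetical and only collects entities, whereas the real visitor presumably does more:

```csharp
using System.Collections.Generic;
using MimeKit;

// Hypothetical visitor that gathers every leaf MIME part of a message.
class PartCollector : MimeVisitor
{
    public List<MimeEntity> Parts { get; } = new List<MimeEntity>();

    protected override void VisitMultipart(Multipart multipart)
    {
        // Recurse into the children of each multipart container.
        foreach (var child in multipart)
            child.Accept(this);
    }

    protected override void VisitMimePart(MimePart entity)
    {
        // Leaf parts end up here; the caller can then WriteTo() each one.
        Parts.Add(entity);
    }
}

// Usage: var visitor = new PartCollector(); message.Accept(visitor);
// then call entity.WriteTo(...) or entity.Headers.WriteTo(...) per part.
```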
Well, both of those WriteTo() methods take a Stream argument; it doesn't have to be a MemoryStream. You can use whatever stream type you want, so it seems like you should use something else... Another option might be to use a MimeKit.IO.MeasuringStream: write to that first to measure how much buffer space you need, then allocate a MemoryStream with that capacity and call WriteTo() again on the MemoryStream. It might be slower since you're writing the headers/mime parts twice, but the penalty for that might be less than the penalty of reallocating the MemoryStream's internal buffer and copying all of the data to the new buffer.
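A sketch of that two-pass approach; the ExactCapacity helper below is hypothetical, but MeasuringStream and WriteTo() are the MimeKit pieces mentioned above:

```csharp
using System.IO;
using MimeKit;
using MimeKit.IO;

static class ExactCapacity
{
    // Measure first, then allocate the MemoryStream exactly once.
    public static MemoryStream Serialize(MimeEntity entity)
    {
        long size;
        using (var measure = new MeasuringStream())
        {
            // First pass: nothing is buffered, only the byte count is tracked.
            entity.WriteTo(measure);
            size = measure.Length;
        }

        // Second pass: the MemoryStream never grows, so set_Capacity runs
        // once and no intermediate buffers are copied.
        var output = new MemoryStream((int)size);
        entity.WriteTo(output);
        output.Position = 0;
        return output;
    }
}
```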
Hmm, you are right. When I saw the call stack, I thought that MailKit was doing some internal allocations, but the problem is the memory stream I am passing as an argument. MeasuringStream will be really useful for me. I am using SQLite, and you can open a stream to a SQLite blob column and write directly to disk. The problem I had is that I didn't know what the output size would be, which is required when creating a SQLite blob column. This way I'll be able to avoid MemoryStream completely and write directly to the database. Thanks a lot!
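A sketch of how that could look, assuming the Microsoft.Data.Sqlite provider (SqliteBlob); the Parts table and Body column are hypothetical, while zeroblob() and last_insert_rowid() are standard SQLite:

```csharp
using Microsoft.Data.Sqlite;
using MimeKit;
using MimeKit.IO;

static class BlobWriter
{
    public static void Save(SqliteConnection connection, MimeEntity entity)
    {
        // Pass 1: measure the serialized size without buffering anything.
        long size;
        using (var measure = new MeasuringStream())
        {
            entity.WriteTo(measure);
            size = measure.Length;
        }

        // Reserve a blob of exactly that size and grab its rowid.
        long rowid;
        using (var insert = connection.CreateCommand())
        {
            insert.CommandText =
                "INSERT INTO Parts (Body) VALUES (zeroblob($size)); " +
                "SELECT last_insert_rowid();";
            insert.Parameters.AddWithValue("$size", size);
            rowid = (long)insert.ExecuteScalar();
        }

        // Pass 2: stream the entity straight into the blob, skipping
        // MemoryStream (and its internal reallocations) entirely.
        using (var blob = new SqliteBlob(connection, "Parts", "Body", rowid))
            entity.WriteTo(blob);
    }
}
```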
No problem :)
I have been using MailKit for about a year and I notice memory spikes from time to time. In this particular case I am downloading full messages (header + message parts) from a local mail server (high bandwidth).
The process memory jumps from 300 MB to 1 GB for about 2-3 seconds. I can see with the memory profiler that the allocations are byte arrays of 2048 bytes, allocated in MemoryBlockStream. Those allocations are not memory leaks: as you can see, at 51s I forced a GC and everything was collected. Still, having so much memory traffic is not very good.
I can see that each MemoryBlockStream holds a private list of byte arrays that are thrown away after the stream is disposed.
What if a shared pool of byte arrays is used instead of allocating them for each stream?
https://github.com/dotnet/corefx/blob/master/src/System.Buffers/src/System/Buffers/ArrayPool.cs
or
replace MemoryBlockStream entirely with https://github.com/Microsoft/Microsoft.IO.RecyclableMemoryStream
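For the second option, the usage pattern is small; this sketch only shows the Microsoft.IO.RecyclableMemoryStream manager and stream lifetime, not how it would be wired into MimeKit:

```csharp
using System.Text;
using Microsoft.IO;

class RecyclableDemo
{
    // One manager per process; it owns the shared block pools.
    static readonly RecyclableMemoryStreamManager Manager =
        new RecyclableMemoryStreamManager();

    static void Main()
    {
        // GetStream() returns a MemoryStream-compatible stream whose internal
        // blocks are rented from the manager's pool.
        using (var stream = Manager.GetStream())
        {
            var payload = Encoding.ASCII.GetBytes("message body goes here");
            stream.Write(payload, 0, payload.Length);
        }
        // Disposing the stream returns its blocks to the pool instead of
        // leaving short-lived byte arrays for the garbage collector.
    }
}
```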