Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmarks - Some comments #886

Closed
ramezrafla opened this issue Feb 19, 2023 · 6 comments
Closed

Benchmarks - Some comments #886

ramezrafla opened this issue Feb 19, 2023 · 6 comments
Labels
question A question about how to do something

Comments

@ramezrafla
Copy link

Dear @jstedfast
A quick note to say thank you for all that you have done for email parsing and handling for the community. A big debt of gratitude is owed to you.

First question:
We developed our own MIME parser in JS (not Node, our own Just/v8-based framework, so native-like) with lots of C++ bindings (Base64, QP, Textdecoding etc.). We are able to achieve similar benchmark numbers as yours. In fact, we worked hard to get to your numbers.

If we focus on Xamarin3:

  • Mimekit is ~3.4ms per email, we are at ~4.5ms per email. But we do full UTF8 decoding of header fields and creating a JSON object ready to be pushed into the DB (we also do some Mime validation to start SPAM scoring on the fly to avoid reprocessing later in the chain), don't think you do that much in your basic Parse command.
  • If we add a further 8ms (so total around 12.5ms) we also decode the attachments from base64 and save to disk (done natively).

I took a glance at the parser in MimeKit (and GMime). It seems to be a lazy/opportunistic parser. Wanted to make sure that we are comparing apples to apples.

Second question:
The other parsers you are comparing to seem to include a lot more work in their parser that you do later when fields are requested. Is it a fair apple-to-apple comparison here?

Thanks again! This work is a great instigator of performance improvement for the entire email ecosystem. Wildduck talks about 30 emails / s. With a good parser like yours we can hit 1000 emails / s.

@jstedfast jstedfast added the question A question about how to do something label Feb 19, 2023
@jstedfast
Copy link
Owner

jstedfast commented Feb 19, 2023

A quick note to say thank you for all that you have done for email parsing and handling for the community. A big debt of gratitude is owed to you.

Thanks :-)

We are able to achieve similar benchmark numbers as yours. In fact, we worked hard to get to your numbers.

sounds awesome!

I took a glance at the parser in MimeKit (and GMime). It seems to be a lazy/opportunistic parser. Wanted to make sure that we are comparing apples to apples.

Yes and no. Let me try to explain.

MimeKit's MimeParser (and GMime's GMimeParser) mostly just splits header blocks into individual header field & value pairs. It does not decode the headers nor does it decode MIME content. So far you are correct.

MimeKit's MimeMessage/MimeEntity/Part/etc (and GMime's GMimeMessage/GMimeObject/GMimePart/etc) would then parse/decode many of the individual headers as the *MimeParser fed those headers to those parts.

GMime's GMimeMessage handled fewer of those headers than MimeKit's MimeMessage (MimeKit handled additional headers like the Resent-* headers, the Priority headers, etc that GMime never did) and is likely one of the reasons that MimeKit's parser is (was?) a little slower than GMime (I think MimeKit's parser is an incremental improvement over GMime and I have not gone back and put in the effort to bring those design improvements back to GMime, unfortunately).

Since then, I have restructured MimeMessage and I think MimeEntity/etc parts to be more lazy in that it now defers parsing of the Subject/From/To/Cc/Bcc/Date/etc headers until the corresponding MimeMessage property is accessed (e.g. MimeMessage.Subject, MimeMessage.From, etc). So again, your assessment is fairly accurate.

When I originally wrote the benchmarks, however, this was not the case - MimeMessage was pretty greedy (in that it would immediately decode every header that it had a corresponding property for). In fairness, even if that were still true, you wouldn't technically be wrong in that it doesn't decode every header, so you're right in that it's not an exactly apples-to-apples comparison.

Second question:
The other parsers you are comparing to seem to include a lot more work in their parser that you do later when fields are requested. Is it a fair apple-to-apple comparison here?

It's less of a fair (apples-to-apples) comparison now than it was when I originally wrote the benchmarks, for sure.

I would suggest taking a peek at MimeKit's MimeReader and the ExperimentalMimeParser (which is built on it) that I hope to replace MimeParser with once I work up the courage to make that switch. With so many people using MimeKit now and depending on its reliability, I am nervous about making that switch even though I beefed up my unit tests and made sure that the results matched in both cases. It's still nerve racking!

My guess is that if you are pulling off 4.5ms even after doing full rfc2047 header decoding to UTF8 along with SPAM processing and base64/qp decoding of content(?), then that very well could be as fast or faster than MimeKit (if there was a way to do a more apples-to-apples comparison). Pretty impressive!

Is this an open source library?

@ramezrafla
Copy link
Author

Thanks @jstedfast for your gracious feedback and time to answer.
We can only advance as far as the generosity of people as yourself.

The library is not open source, not because we want to close it, but that it depends on Just-JS -- see more here: https://www.techempower.com/benchmarks/#section=data-r21&test=composite (#1 framework in terms of speed).

I'll separate into its own repo and share shortly. You won't be able to easily run it, but can see the code (the JS is simple, it depends quite a bit on C++ code for Base64 and QP). The Just-JS developer used the v8 engine and wrote his wrapper / event loop around it, focusing on Linux-only. So no bloating for multi-os. Lighting fast! At the expense of junior-developer friendliness. With NodeJS we wouldn't be able to be even close.

However, we were very heavily inspired by this famous library: https://github.com/postalsys/postal-mime. Its core strength is a single buffer that it traverses (Vs. strings, regex etc.), with the following key changes:

  • Native B64 -- https://github.com/aklomp/base64 (turbo base64 is also very good)
  • Native QP
  • Many fixes to speed up performance - such as zooming through attachments until we get to boundary, then feeding to native to decode
  • Many fixes to create custom classes vs regular objects, and strong respect monomorphism for JS performance

@ramezrafla
Copy link
Author

And

  • We avoid char / string copying like the plague - we move pointer to main buffer around

@jstedfast
Copy link
Owner

Native B64 -- https://github.com/aklomp/base64 (turbo base64 is also very good)

I think I took a look at this a few years ago and it was very impressive, but it looked like (and I didn't test to verify this) it would break when fed base64 content that had line breaks (as found in MIME). How did you handle that? Or was I just wrong?

I suppose you could base64 decode line by line, but my theory was that that would destroy the performance gains that it would have over MimeKit's base64 decoder (again, untested, and I should really play with it to find out).

Many fixes to speed up performance - such as zooming through attachments until we get to boundary, then feeding to native to decode

Similar approach to MimeKit I guess, although you might have a better algorithm than MimeKit does... I've seen boyer-moore approaches where as MimeKit's parser loops line-by-line and then matches at the beginning of each line.

We avoid char / string copying like the plague - we move pointer to main buffer around

Yea, MimeKit has far less of this than any of the other C# libraries I've seen, but to be honest, there's still more of it than I would like. There's definitely still room for improvement in this area that I could be doing in MimeKit.

@ramezrafla
Copy link
Author

ramezrafla commented Feb 19, 2023

I think I took a look at this a few years ago and it was very impressive, but it looked like (and I didn't test to verify this) it would break when fed base64 content that had line breaks (as found in MIME). How did you handle that? Or was I just wrong?

Very crude, but works :)

  for(size_t i = start; i <= end; ++i) {
    ch = message[i];
    if (
      ch == 43 || ch == 61 ||
      (ch >= 47 && ch <= 57) ||
      (ch >= 65 && ch <= 90) ||
      (ch >= 97 && ch <= 122)
    ) {
      message2[len2++] = ch;
    }
  }

Make a copy of buffer without the non-base64 chars. We learned not to try to be smart sometimes and go for back-to-basics. This way we just pass a link to the buffer with offset and len, and get back base64 decoded from C++. One line of code in JS.

Similar approach to MimeKit I guess, although you might have a better algorithm than MimeKit does... I've seen boyer-moore approaches where as MimeKit's parser loops line-by-line and then matches at the beginning of each line.

In our case, we were able to do most reading in fast forward mode to the next boundary. Only headers are line by line. No performance benefits to handling any differently as we need to manually parse headers anyway. Once past the main header, all internal headers / boundaries are a few lines at the most.

We avoid char / string copying like the plague - we move pointer to main buffer around

This is really the strength of the algorithm. In fact, we are considering overwriting the original buffer in the future instead of mallocs (base64 decoded will always be smaller than original, so just overwrite while decoding, same for QP).

@ramezrafla
Copy link
Author

And on the way back in (to create MIME), here is how we split base64-encoded

char* message2 = (char *)malloc(len + int(len/38) + 2);
size_t len2 = 0;
size_t remaining;
size_t i = 0;
while (i < len) {
    remaining = len - i;
    if (remaining > 76) remaining = 76;
    for (size_t j = 0; j < remaining; j++) {
      message2[len2++] = message[i++];
    }
    message2[len2++] = 13;
    message2[len2++] = 10;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question A question about how to do something
Projects
None yet
Development

No branches or pull requests

2 participants