Message-Id field format #734

Closed
Nevca opened this issue Dec 13, 2021 · 9 comments
Labels
question A question about how to do something

Comments

@Nevca

Nevca commented Dec 13, 2021

Describe the bug
I noticed that most email clients generate lowercase Message-Ids but MimeKit generates uppercase ids. This did not seem to be a problem before I stumbled upon a client that could not handle uppercase correctly (what it did was to change one part of the Message-Id to lowercase, leaving the other part intact). I know this is not supposed to happen but that made me research the topic further:

https://datatracker.ietf.org/doc/html/rfc2822.html#section-3.4.1 says the following:
An addr-spec is a specific Internet identifier that contains a locally interpreted string followed by the at-sign character ("@", ASCII value 64) followed by an **Internet domain**.

And the Message-Id field according to https://datatracker.ietf.org/doc/html/rfc2822#section-3.6.4 should follow addr-spec:
The message identifier (msg-id) is similar in syntax to an angle-addr construct without the internal CFWS.

In MimeKit's code I see that MimeUtils contains this constant: const string base36 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
Why are only uppercase letters used? Shouldn't only lowercase letters be valid here, as in domain names? Is there a reason or specification I am missing?

More importantly, to overcome this issue without having to change MimeKit's code locally (which I don't want to do), I have to call ToLower() which is not optimal here.

Is there a possibility to update the code or maybe introduce an option for this?
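
For readers following along, here is a rough, language-neutral sketch (in Python, not MimeKit's actual code) of what a base36 conversion with the alphabet quoted above looks like, and how the choice of alphabet alone decides the case of the output. The function names `to_base36` and `make_local_part`, and the "timestamp.random" layout of the local-part, are assumptions made for illustration only.

```python
# The uppercase alphabet quoted in the issue text.
BASE36_UPPER = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base36(value: int, alphabet: str = BASE36_UPPER) -> str:
    """Encode a non-negative integer in base36 using the given alphabet."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return alphabet[0]
    digits = []
    while value:
        value, remainder = divmod(value, 36)
        digits.append(alphabet[remainder])
    # Digits were collected least-significant first, so reverse them.
    return "".join(reversed(digits))


def make_local_part(timestamp: int, random_value: int, lowercase: bool = False) -> str:
    """Hypothetical local-part layout: base36(timestamp) + '.' + base36(random)."""
    alphabet = BASE36_UPPER.lower() if lowercase else BASE36_UPPER
    return f"{to_base36(timestamp, alphabet)}.{to_base36(random_value, alphabet)}"
```

Swapping the alphabet for its lowercase form gives the same identifiers in lowercase, which is equivalent to lowercasing the finished string afterwards.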

@jstedfast
Owner

There's no particular reason it needs to be uppercase, I just chose uppercase in order to do the base36 conversion.

I'm not sure how you concluded from the documentation you pointed to that only lowercase letters would be valid, as none of it makes that claim. It just states that the interpretation should be case-insensitive, which means both uppercase and lowercase letters are valid and that, for example, 'A' should be interpreted exactly the same as 'a' and vice versa.

What do you mean by a client "could not handle uppercase correctly"? Sounds like a bug in that client, not MimeKit. Right? What client is this?

@Nevca
Author

Nevca commented Dec 15, 2021

I did not conclude that, I was simply referring to the fact that letters are lowercase in internet domain names.

Yes, I know this is a bug in the client. The client is eM Client. When it replies to an email, it changes the Message-Id so that only the part after the "@" becomes lowercase. Totally weird, I know...

I know how to handle that on my end, I just wanted to let you know about this, so you can make some changes if you think they're needed, e.g. option for MimeUtils to generate only lowercase Message-Id or something else.

@jstedfast
Owner

You mean the domain part of the message-Id is uppercase?

The base36 constant is for the string before the @ symbol.

The domain name is retrieved from your local system and is used as-is without upper casing the domain name.

@Nevca
Author

Nevca commented Dec 15, 2021

The whole Message-Id is uppercase.

Do you plan on introducing any changes like options for generating the Message-Id or should I just do ToLower() on my end?

@jstedfast
Owner

I don't see how eM Client's behavior breaks anything. It shouldn't matter if, when you reply, it lower cases the msgid in the References and/or In-Reply-To header.

As you noted, the specifications say that these strings need to be treated case insensitively which means it doesn't matter if a client uppercases them or lowercases them when it replies to the message. It shouldn't matter.
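
To make the point concrete, here is a minimal Python sketch of a comparison that treats msg-ids case-insensitively, as argued above. It is only an illustration under that assumption; `normalize_msgid` and `same_message` are hypothetical helper names, and the example identifiers are made up.

```python
def normalize_msgid(msgid: str) -> str:
    """Strip whitespace and angle brackets, then fold case for comparison."""
    return msgid.strip().strip("<>").casefold()


def same_message(a: str, b: str) -> bool:
    """Compare two msg-ids while ignoring case and surrounding brackets."""
    return normalize_msgid(a) == normalize_msgid(b)


# Hypothetical IDs: the original, and the same ID after a client
# lowercased the part after the "@" in its In-Reply-To header.
original = "<ABC123.XYZ@EXAMPLE.COM>"
in_reply_to = "<ABC123.XYZ@example.com>"
```

With a comparison like this, `same_message(original, in_reply_to)` holds, so a threading algorithm built on it would not care which case a replying client used.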

jstedfast added the question label Dec 15, 2021
@Nevca
Author

Nevca commented Dec 16, 2021

What I saw was that if you have a gmail account and send an email from it, then after eM Client replies to the email and changes the Message-Id, the gmail server no longer treats the reply as part of the proper email thread. Originally, I was looking for a way to fix that.

@ekalchev

We observed the same issue with eM Client. It is obviously not a bug in MailKit that needs fixing, but an option to customize how Message-Ids are generated by MailKit would be useful.

@jstedfast
Owner

@Nevca That sounds like potentially a GMail bug with their threading algorithm.

FWIW, you can provide a lowercase domain by calling message.MessageId = MimeUtils.GenerateMessageId ("domain.com");

I just took a look at various messages in my inbox and there are a number of them that have uppercase letters in the local-part of the msg-id token. For example:

  1. GMail: CAK5W3isoOA_+dugCDShqrp0RDhB+kEm5ZoKsaOE2KS2pq-44LA@mail.gmail.com
  2. Outlook: MN2PR21MB15035BFF2D76BACBB5EE9FB1CF779@MN2PR21MB1503.namprd21.prod.outlook.com

At a minimum, both Outlook and GMail (which together probably account for 75+% of the non-automated email out there) contain uppercase letters in the local-part of the msg-id and Outlook uses capital letters in the domain name as well.
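
A quick way to check the observation above is to split each msg-id at the last "@" and look for uppercase letters on either side. The Python sketch below does that for the two identifiers quoted in this comment. The helper names are hypothetical, and `lowercase_domain_only` merely mimics the eM Client behavior reported earlier in this thread; it is not taken from any client's code.

```python
GMAIL_ID = "CAK5W3isoOA_+dugCDShqrp0RDhB+kEm5ZoKsaOE2KS2pq-44LA@mail.gmail.com"
OUTLOOK_ID = (
    "MN2PR21MB15035BFF2D76BACBB5EE9FB1CF779"
    "@MN2PR21MB1503.namprd21.prod.outlook.com"
)


def split_msgid(msgid: str) -> tuple[str, str]:
    """Return (local_part, domain), splitting at the last '@'."""
    local, _, domain = msgid.strip().strip("<>").rpartition("@")
    return local, domain


def has_uppercase(text: str) -> bool:
    """True if any character in text is an uppercase letter."""
    return any(ch.isupper() for ch in text)


def lowercase_domain_only(msgid: str) -> str:
    """Lowercase only the part after the '@', leaving the local-part intact."""
    local, domain = split_msgid(msgid)
    return f"{local}@{domain.lower()}"
```

Running `has_uppercase` over both halves of each identifier shows uppercase letters in both local-parts and, for the Outlook example, in the domain as well.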

jstedfast added a commit that referenced this issue Dec 16, 2021
@Nevca
Author

Nevca commented Dec 20, 2021

Thanks @jstedfast
