Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The ContentId field on MimeEntity does not accept parentheses #542

Closed
reinaldocoelho opened this issue Mar 19, 2020 · 9 comments
Closed
Labels
compatibility Compatibility with existing software

Comments

@reinaldocoelho
Copy link

Describe the bug
On forward an email with image on body and with CID like "SignaturePax(2)_21fe8b55-7001-4403-957d-3ca1e804edee.jpg" thrown an error when set ContentId with this value.

Platform (please complete the following information):

  • OS: Windows
  • .NET Runtime: NetStandard / NetCore 2.1 / FullFramework 4.6.1
  • .NET Framework: .Net Core and .NET 4.6.1
  • MimeKit Version: 2.5.2

To Reproduce
Steps to reproduce the behavior (Test Code Sample):

        [Fact]
        public void CidValidNameTest()
        {
            // Test1 - With BodyBuilder
            var text = "SignaturePax(2)_21fe8b55-7001-4403-957d-3ca1e804edee.jpg";
            var builder = new BodyBuilder();
            var contentType = new ContentType("image", "jpeg");
            var resource = builder.LinkedResources.Add(text, Convert.FromBase64String(""), contentType);
            resource.ContentId = text;
            Assert.NotNull(resource);

            // Test2 - With MailBoxAddress (used when create ContentId)
            var encoded = HttpUtility.UrlEncode(text);
            var parsed = MailboxAddress.TryParse(encoded, out var address);
            Assert.True(parsed);
        }

This thrown an exception like "Invalid Content-Id format.".

Expected behavior
Accept the text with parentheses as a valid structure on ContentId like RFC2392(https://tools.ietf.org/html/rfc2392) specification.
All URL encode must be valid, and so parentheses is valid to use.

@jstedfast
Copy link
Owner

jstedfast commented Mar 19, 2020

rfc2045 defines the Content-ID header as follows:

     id := "Content-ID" ":" msg-id

The msg-id token is defined in rfc822 -> rfc2822 -> rfc5322:

   atext           =   ALPHA / DIGIT /    ; Printable US-ASCII
                       "!" / "#" /        ;  characters not including
                       "$" / "%" /        ;  specials.  Used for atoms.
                       "&" / "'" /
                       "*" / "+" /
                       "-" / "/" /
                       "=" / "?" /
                       "^" / "_" /
                       "`" / "{" /
                       "|" / "}" /
                       "~"

   dot-atom-text   =   1*atext *("." 1*atext)

   specials        =   "(" / ")" /        ; Special characters that do
                       "<" / ">" /        ;  not appear in atext
                       "[" / "]" /
                       ":" / ";" /
                       "@" / "\" /
                       "," / "." /
                       DQUOTE

   message-id      =   "Message-ID:" msg-id CRLF

   msg-id          =   [CFWS] "<" id-left "@" id-right ">" [CFWS]

   id-left         =   dot-atom-text / obs-id-left

   id-right        =   dot-atom-text / no-fold-literal / obs-id-right

The ( and ) characters are not permitted in the newer msg-id token syntax.

If we look at the old obsolete syntax, we get:

   atom            =   [CFWS] 1*atext [CFWS]

   dot-atom        =   [CFWS] dot-atom-text [CFWS]

   local-part      =   dot-atom / quoted-string / obs-local-part

   obs-id-left     =   local-part

   obs-local-part  =   word *("." word)

   word            =   atom / quoted-string

Here, it is allowed (via obs-local-part syntax), but is considered a comment and is not part of the actual msg-id token as explained at the end of section 4.5.4:

   Semantically, none of the optional CFWS in the local-part and the
   domain is part of the obs-id-left and obs-id-right, respectively.

I think there's an argument to made for the ContentId property to accept it (certainly for the parser to handle it if it doesn't), but I'm not entirely sure how to handle it - should I strip out the "comment(s)"? Or should I leave them? It seems, in your example use case, that the "comment" is significant - i.e. it is a part of the id and is not meant to be interpreted as a comment in the middle of the obs-local-part.

@jstedfast
Copy link
Owner

@reinaldocoelho do you have full control over the Content-Id value in this case? Or are you taking an existing value from a message in the wild and re-using that value?

If you have control over the value, I would recommend munging the ()'s.

@reinaldocoelho
Copy link
Author

reinaldocoelho commented Mar 19, 2020

The system responds or forwards the messages and uses the same content received when sending.
At best, I can validate if throwed exception ​​and replace values, but I cannot guarantee the content received.

If I replace only the ()´s, other characteres can be received.

@jstedfast
Copy link
Owner

Ah, ok.

Are you parsing one message or part and then duplicating it over to another like this?

var copied = new MimePart (original.ContentType.MediaType, original.ContentType.MediaSubtype) {
    ContentId = original.ContentId,
    // ...
};

If so, one way to cheat here would be to do:

var originalMessage = MimeMessage.Load (stream);
var forwarded = new MimeMessage ();
// ...
forwarded.Body = originalMessage.Body;

A MimePart can actually be a child of multiple MimeMessages or Multiparts. Likewise, a Multipart can be a child of multiple MimeMessages or other Multiparts.

Just a little trick you might be able to use ;-)

@jstedfast jstedfast added the compatibility Compatibility with existing software label Mar 19, 2020
@reinaldocoelho
Copy link
Author

reinaldocoelho commented Mar 19, 2020

Always the problem occours on images in the body, than, to avoid replace the body always. I update the code with below:

                        var attached = builder.LinkedResources.Add(filenameToUse, streamImage, contentType);
                        try
                        {
                            attached.ContentId = item.CidHash;
                        }
                        catch (ArgumentException)
                        {
                            var editedHash = item.CidHash.Replace("(", "").Replace(")", "");
                            builder.HtmlBody = builder.HtmlBody.Replace($"cid:{item.CidHash}", $"cid:{editedHash}").Replace($"CID:{item.CidHash}", $"cid:{editedHash}");
                            attached.ContentId = editedHash;
                        }

But It´s not the perfect solution...

jstedfast added a commit that referenced this issue Mar 19, 2020
…ead of MailboxAddress.TryParse()

Part of the fix for issue #542
@jstedfast
Copy link
Owner

@reinaldocoelho Understood. Don't worry, I won't make you have to do gross hacks ;-)

Actually, due to ParseUtils.TryParseMsgId() being fixed to handle ()'s from issue #472, the above commit should actually fix things for you.

I forgot that ParseUtils.TryParseMsgId() already got the () fix, so it turns out that MimeEntity.ContentId setter just needed to be updated to use the more lenient parser.

Let me know if that doesn't solve it for you for some reason!

@reinaldocoelho
Copy link
Author

This is great @jstedfast.
Thank you for that.
Will it be released in version 2.5.3?

@jstedfast
Copy link
Owner

Yes, this will be released as part of 2.5.3, but I don't yet have plans for a 2.5.3 release so I'm not sure when it will be.

If you want, every commit produces an automated build on myget: https://www.myget.org/feed/mimekit/package/nuget/MimeKit

You can use that nuget feed as an alternative for the time being.

@reinaldocoelho
Copy link
Author

Thank you :-D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compatibility Compatibility with existing software
Projects
None yet
Development

No branches or pull requests

2 participants