Add Double Quotes to Filename #674

Closed
czioutas opened this issue May 18, 2021 · 13 comments
Labels: compatibility (Compatibility with existing software), enhancement (New feature or request)

Comments

@czioutas

Hello,

We are using the library to send emails etc. within a very specific industry.
We are obligated to send filenames in double quotes, regardless of whether they contain any illegal characters.

Example Generated

--=-KNeQ5Dji3GHnv3qrBJmqjw==
Content-Type: text/plain; name=Z0144e194dd3ae847a0bd97229b006351d6.ldt
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
        filename=Z0144e194dd3ae847a0bd97229b006351d6.ldt
Content-Description: LDT-Labor-Auftrag

Example Required

--=-KNeQ5Dji3GHnv3qrBJmqjw==
Content-Type: text/plain; name="Z0144e194dd3ae847a0bd97229b006351d6.ldt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
        filename="Z0144e194dd3ae847a0bd97229b006351d6.ldt"
Content-Description: LDT-Labor-Auftrag

I know this is not fully correct, but I was wondering if it's possible.

I tried adding the quote characters myself with attachment.FileName = (char)34 + attachment.FileName + (char)34; however, this resulted in escaped double quotes, such as: Content-Type: text/plain; name="\"Z01b6052dd3e54a45db851f044512bbf2dc.ldt\""
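The escaping seen above is what any standards-following quoted-string writer would do: a literal `"` inside a parameter value has to be backslash-escaped, so quotes added to the value itself can never become the delimiters. A minimal sketch of that behavior follows; `QuotedStringSketch` and `Quote` are hypothetical names for illustration, not MimeKit's actual code.

```csharp
using System;
using System.Text;

static class QuotedStringSketch
{
    // Wraps a parameter value in double quotes, backslash-escaping any
    // embedded '"' or '\' so the result is a valid MIME quoted-string.
    // Hypothetical helper for illustration only.
    public static string Quote(string value)
    {
        var sb = new StringBuilder("\"");

        foreach (char c in value)
        {
            if (c == '"' || c == '\\')
                sb.Append('\\');
            sb.Append(c);
        }

        return sb.Append('"').ToString();
    }
}
```

Passing a value that already carries its own quotes therefore produces an outer pair of delimiters around escaped inner quotes, which matches the header shown above.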

Any ideas?

@jstedfast
Owner

Ugh, what is this broken software that requires parameters be quoted?

Why did these developers not read the specifications?

WHY!?!?

Sigh.

Currently, there is no way to enforce this in MimeKit. Maybe I can add something...

In the meantime, you could change 1 line of code in Parameter.cs:284 and change:

var method = EncodeMethod.None;

to:

var method = EncodeMethod.Quote;

@jstedfast added the compatibility and enhancement labels May 18, 2021
@czioutas
Author

czioutas commented May 18, 2021

Yeah, I have the same question. At first I tried adding a space in the middle of the name to force the double quotes, but unfortunately they also have a regulation about the filename characters :)

German healthcare, don't even ask! tbh I think they got confused with the HTTP version of Content-Disposition, which is double-quoted.

Quick clarification @jstedfast: so you are saying to get the source, make that change, and then use the resulting DLL, right?

@jstedfast
Owner

yea, that's probably the quickest way for now.

@czioutas
Author

Great, thanks! (I don't know if you want to mark it as resolved etc., so I leave it to you.)

@czioutas
Author

I included all of MailKit, with the customized MimeKit as a submodule, as it was quite straightforward (although it increased my repo size 😅).

I was looking at the source code to see if we can change the encoding. Do you think this could be acceptable as a custom option, or would you not want to diverge from the RFCs/default?

Content-Type: multipart/mixed; boundary="=-Q3nd1Evt3dhTlgFdQUTH1w=="

--=-Q3nd1Evt3dhTlgFdQUTH1w==
Content-Type: text/plain; charset="utf-8"


--=-Q3nd1Evt3dhTlgFdQUTH1w==
Content-Type: text/plain; name="Z012e060cee1bb34dec8f75fcfbd8e0af5c.ldt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
        filename="Z012e060cee1bb34dec8f75fcfbd8e0af5c.ldt"
Content-Description: LDT-Labor-Auftrag

@jstedfast
Owner

jstedfast commented May 19, 2021

Are you asking if I could add a ParameterEncoding.Quote enum value?

I was trying to figure out how I could do that and make it make sense yesterday, but I think I'll need a new property. I think that adding a .Quote enum value would just make it confusing. For example, if the value contains utf-8 characters and needs to be encoded and the value is set to Quote, what should MimeKit do?

The 'Quote' (or 'AlwaysQuote' or whatever I end up naming it) property should really be interpreted as "if MimeKit doesn't need to use rfc2047 or rfc2231 encoding to make the parameter value safe for transport, should it always opt to quote the value even if it technically doesn't need to?"

FWIW, I am seriously considering adding this feature (which is why I haven't closed it), I just haven't figured out the best way to present it as an API yet.

My current thoughts are to do something very similar to the way the ParameterEncoding stuff is done:

  • Add bool FormatOptions.AlwaysQuoteParameterValues (or perhaps I'll think of a better name)
  • Add bool Parameter.AlwaysQuote (again, maybe a better name will come to me)

I also plan (in MimeKit v3.0 - perhaps sooner) to remove the FormatOptions.Default setter restrictions - currently you can't override the default FormatOptions, but I think my reasoning for restricting that initially isn't necessary and it would make it easier for developers to override the defaults globally and have everything Just Work(tm).

e.g. right now if you try to set FormatOptions.Default.ParameterEncoding = value; it will throw an InvalidOperationException.

@czioutas
Author

So ParameterEncodingMethod.cs enforces the use of Default or the other 2 RFC options.
If we changed the parameter encoding via a value, then we would be violating the default or the 2 RFC options, right?

I was wondering if we could make it more flexible so that there is a custom encoding method. So maybe another enum value?

I think it also depends on a general question: do you want to set this at the per-parameter level, so I can for example double quote the filename of Content-Disposition but not the filename of Content-Type, or to allow a user (of the library) to set a general custom solution?

@jstedfast
Owner

The ParameterEncodingMethod enum exists to work around historical suckage of email specifications.

Quick history lesson:

Originally, email was ASCII-only. Headers and message bodies were restricted to US-ASCII and there was no such thing as attachments. Pretty quickly the desire to send files in emails became popular and there were competing standards such as UUEncode and who-knows what else.

Along came MIME which provided a hierarchical document format for email and it became adopted. That said, SMTP servers still required 7bit email messages and so MIME had been designed to squash everything down into the 7bit range. This is why attachments are base64 encoded. The early MIME standards also introduced the idea of a charset parameter for text/* parts so that message bodies could be sent in something other than US-ASCII. They also defined the rfc2047 encoding scheme for headers such as Subject, To/From/Cc/etc.

Unfortunately, MIME did not originally define a mechanism for encoding parameter values in headers like Content-Type or Content-Disposition, so some clients decided to just use the rfc2047 encoding even though it was awkward to use for that purpose (I could explain the why's, but it's not really important).

After about 10 months, rfc2184 (replaced by rfc2231 a few months later to make things clearer) came along and defined a mechanism for encoding parameter values, but by that point many clients had already implemented this functionality using rfc2047's encoding rules.

To this day, some clients haven't been updated to use rfc2184/rfc2231's encoding rules for parameter values (or maybe customers just haven't updated their ancient email clients?) and so sometimes for interoperability, developers might need to force encoding in rfc2047 format.
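To make the difference between the two formats concrete, here is a rough sketch of what each one does to a non-ASCII filename. It is a simplification under stated assumptions: the class and method names are hypothetical, the charset is fixed to utf-8, and it ignores the line-length limits and continuations that real encoders (including MimeKit) have to handle.

```csharp
using System;
using System.Text;

static class ParameterEncodingSketch
{
    // rfc2231 style: name*=charset'language'percent-encoded-bytes
    // (the language tag is left empty here).
    public static string Rfc2231(string name, string value)
    {
        return name + "*=utf-8''" + Uri.EscapeDataString(value);
    }

    // rfc2047 style: an encoded-word (=?charset?B?base64?=) placed inside
    // a quoted parameter value, as some older clients emit and expect.
    public static string Rfc2047(string name, string value)
    {
        string base64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(value));
        return name + "=\"=?utf-8?B?" + base64 + "?=\"";
    }
}
```

For example, `ParameterEncodingSketch.Rfc2231("filename", "naïve.txt")` gives `filename*=utf-8''na%C3%AFve.txt`, while the rfc2047 variant keeps the plain `filename=` name and hides the encoding inside the quoted value.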

(Note: MimeKit supports decoding of both formats w/o needing to set a configuration setting)

Anyway...

> So ParameterEncodingMethod.cs enforces the use of Default or the other 2 RFC options.
> If we changed the parameter encoding via a value, then we would be violating the default or the 2 RFC options, right?

The Parameter.EncodingMethod property default is null, which means "fall back to the FormatOptions.Default.ParameterEncodingMethod setting". The FormatOptions.ParameterEncodingMethod default is rfc2231, as that is the IETF-accepted standard.

If you set either property to rfc2047, you are technically breaking the standards... BUT most clients will probably handle it and sometimes it might be necessary to break the standards in order to work with broken clients.

> I was wondering if we could make it more flexible so that there is a custom encoding method. So maybe another enum value?

I don't think adding a "Quote" or "AlwaysQuote" value to the ParameterEncodingMethod enum makes sense because what we really need is a way to decide the following (pseudocode):

if the parameter value needs to be encoded based on whether it includes non-US-ASCII characters or control characters; then
    if the user has defined a parameter-specific encoding method; then
        use the parameter-specific encoding method
    else
        use the default encoding method
else if the parameter value needs to be quoted based on whether it includes any of a handful of special characters; then
    quote the parameter value
else if the user has specified that the value must be quoted no matter what; then
    quote the parameter value
else
    don't do anything with the parameter value
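The pseudocode above could be sketched in C# roughly as follows. This is an illustration under assumptions, not MimeKit's implementation: `EncodeChoice`, `ParameterFormattingSketch`, `Choose`, and the exact set of "special" characters (the MIME tspecials plus space) are all hypothetical stand-ins.

```csharp
using System;
using System.Linq;

enum EncodeChoice { None, Quote, Rfc2047, Rfc2231 }

static class ParameterFormattingSketch
{
    // Assumed set of characters that force quoting: tspecials plus space.
    const string Specials = "()<>@,;:\\\"/[]?= ";

    public static EncodeChoice Choose(string value, EncodeChoice? perParameter,
                                      EncodeChoice defaultMethod, bool alwaysQuote)
    {
        // 1. Non-US-ASCII or control characters: the value must be encoded,
        //    preferring the parameter-specific method over the default.
        if (value.Any(c => c > 127 || char.IsControl(c)))
            return perParameter ?? defaultMethod;

        // 2. Special characters: quoting is required by the grammar.
        if (value.Any(c => Specials.IndexOf(c) >= 0))
            return EncodeChoice.Quote;

        // 3. Nothing requires quoting, but the user asked for it anyway.
        if (alwaysQuote)
            return EncodeChoice.Quote;

        // 4. Emit the value as a bare token.
        return EncodeChoice.None;
    }
}
```

The point of the ordering is that the "always quote" preference is only consulted once encoding has been ruled out, which is why it does not fit naturally as another member of the encoding-method enum.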

I considered adding an AlwaysQuote enum value to ParameterEncodingMethod as a bitflag thus allowing something like:

parameter.EncodingMethod = ParameterEncodingMethod.Rfc2047 | ParameterEncodingMethod.Quote;

This way code-wise, we could check for an encoding method (rfc2047 vs rfc2231) and then if it doesn't need encoding, check the Quote bitflag.

The problem with this approach is just that I think it would be confusing to the average developer.

Whereas having a new boolean property would be less confusing.

> I think it also depends on a general question: do you want to set this at the per-parameter level, so I can for example double quote the filename of Content-Disposition but not the filename of Content-Type, or to allow a user (of the library) to set a general custom solution?

Why not both? :)

I'd like to make it work the same way that the ParameterEncoding properties work. I.e. have a Parameter.AlwaysQuote property that, if unset, tells MimeKit to fall back to the FormatOptions setting.

By default, Parameter.AlwaysQuote would be null, so the user could just set the global option and it would affect all parameters.

Or the user could override the default (aka FormatOptions value) for specific parameters (like Content-Disposition filename parameters) by setting it explicitly on the Parameter instance.

@jstedfast
Owner

Sorry, my bad, I thought the Parameter.EncodingMethod was nullable but instead I opted to have a ParameterEncodingMethod.Default enum value.

The logic I explained in my previous comment still applies, this is just a different way of representing a tri-state value.
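The two ways of representing that tri-state can be compared side by side in a small sketch. The names here (`EncodingMethodSketch`, `TriStateSketch`, `ResolveNullable`, `ResolveEnum`) are hypothetical and only illustrate the fallback idea, not the library's actual types.

```csharp
using System;

enum EncodingMethodSketch { Default, Rfc2047, Rfc2231 }

static class TriStateSketch
{
    // Nullable representation: null means "no per-parameter override",
    // so the global setting is used.
    public static bool ResolveNullable(bool? perParameter, bool global)
    {
        return perParameter ?? global;
    }

    // Enum representation: a dedicated Default member plays the role of null.
    public static EncodingMethodSketch ResolveEnum(EncodingMethodSketch perParameter,
                                                   EncodingMethodSketch global)
    {
        return perParameter == EncodingMethodSketch.Default ? global : perParameter;
    }
}
```

Either way, a value set explicitly on the parameter wins, and an unset value defers to the global options.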

@jstedfast
Owner

I meant to commit that yesterday, but there it is...

Anyway, feedback on the way it's implemented is welcome.

You can use a build >= 2.12.0.36 from https://www.myget.org/feed/mimekit/package/nuget/MimeKit to test it out.

Leaving this open because I'm still not 100% sure I like those new property names.

@czioutas
Author

Hey sorry had to deal with a few things, will take a look promptly!

@jstedfast
Owner

I didn't expect an immediate response within 30s, so it's fine :)

@jstedfast
Owner

This is now released as part of 2.13.0
