Do not encode unstructured headers if not needed #885

Closed
wants to merge 1 commit into from

@klinki klinki commented Feb 17, 2023

Hello,

this PR updates the handling of unstructured headers to better follow RFC2822,
which states:

Some field bodies in this standard are defined simply as
"unstructured" (which is specified below as any US-ASCII characters,
except for CR and LF) with no further restrictions. These are
referred to as unstructured field bodies. Semantically, unstructured
field bodies are simply to be treated as a single line of characters
with no further processing (except for header "folding" and
"unfolding" as described in section 2.2.3).

The reason for this PR is that some email clients (for example Thunderbird and eM Client) are unable to process headers such as List-Unsubscribe when their values are Q-encoded.

Encoding unstructured headers per RFC2047 only when they contain non-ASCII characters fixes this problem.
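
To illustrate the idea, here is a minimal Python sketch (not MimeKit's C# code, and not the actual diff in this PR). The function name `encode_unstructured` is hypothetical, and the sketch assumes UTF-8 with "B" encoding; it ignores folding and the 75-character limit on each encoded-word.

```python
import base64


def encode_unstructured(value: str) -> str:
    """Return value unchanged when it is pure ASCII, else one RFC 2047 encoded-word."""
    if value.isascii():
        # Nothing needs encoding, so emit the text as-is.
        return value
    # "B" (base64) encoding of the UTF-8 bytes. A real encoder would also
    # split long values so that each encoded-word stays within 75 characters.
    payload = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"=?utf-8?B?{payload}?="
```

With this approach a value such as `<mailto:unsubscribe@example.com>` passes through untouched, while a value containing non-ASCII text is still encoded.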

Follow RFC2822 https://www.rfc-editor.org/rfc/rfc2822#section-2.2.1
Semantically, unstructured field bodies are simply to be treated as a single line of characters with no further processing.

Encode unstructured headers by Rfc2047 only when they contain non-ASCII characters.
@jstedfast
Owner

This isn't quite right.

Keep in mind that rfc822 (and all of its updates, including rfc2822 and rfc5322) does not take the MIME encoding rules into account, and those rules supersede the rules in these RFCs.

From rfc2047:

5. Use of encoded-words in message headers

   An 'encoded-word' may appear in a message header or body part header
   according to the following rules:

(1) An 'encoded-word' may replace a 'text' token (as defined by RFC 822)
    in any Subject or Comments header field, any extension message
    header field, or any MIME body part field for which the field body
    is defined as '*text'.  An 'encoded-word' may also appear in any
    user-defined ("X-") message or body part header field.

    Ordinary ASCII text and 'encoded-word's may appear together in the
    same header field.  However, an 'encoded-word' that appears in a
    header field defined as '*text' MUST be separated from any adjacent
    'encoded-word' or 'text' by 'linear-white-space'.

(2) An 'encoded-word' may appear within a 'comment' delimited by "(" and
    ")", i.e., wherever a 'ctext' is allowed.  More precisely, the RFC
    822 ABNF definition for 'comment' is amended as follows:

    comment = "(" *(ctext / quoted-pair / comment / encoded-word) ")"

    A "Q"-encoded 'encoded-word' which appears in a 'comment' MUST NOT
    contain the characters "(", ")" or "
    'encoded-word' that appears in a 'comment' MUST be separated from
    any adjacent 'encoded-word' or 'ctext' by 'linear-white-space'.

    It is important to note that 'comment's are only recognized inside
    "structured" field bodies.  In fields whose bodies are defined as
    '*text', "(" and ")" are treated as ordinary characters rather than
    comment delimiters, and rule (1) of this section applies.  (See [RFC](https://www.rfc-editor.org/rfc/rfc822)
    [822](https://www.rfc-editor.org/rfc/rfc822), sections [3.1.2](https://www.rfc-editor.org/rfc/rfc2047#section-3.1.2) and [3.1.3](https://www.rfc-editor.org/rfc/rfc2047#section-3.1.3))
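
As a rough illustration of rule (1) from the receiving side, the sketch below uses Python's standard `email.header` module to decode encoded-words in a '*text' field body. The helper name `decode_text_field` is made up for this example; it says nothing about how MimeKit or any particular mail client decodes headers.

```python
from email.header import decode_header, make_header


def decode_text_field(value: str) -> str:
    """Decode any RFC 2047 encoded-words found in an unstructured field body."""
    # decode_header splits the value into (bytes-or-str, charset) chunks;
    # make_header reassembles them into a single Unicode string.
    return str(make_header(decode_header(value)))
```

Plain ASCII input comes back unchanged, and the whitespace that separates two adjacent encoded-words is dropped when they are decoded.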

In order to implement a correct solution, we need to refer to rfc2369, which defines the syntax for the List-Subscribe/Unsubscribe/Archive/Help/Owner/etc. headers.

Since they have a defined syntax, they are not unstructured headers :-)
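
For reference, rfc2369 describes these field bodies as comma-separated, angle-bracket-enclosed URLs, with whitespace inside the brackets ignored. A minimal Python sketch of that shape follows; the function names are hypothetical, and the parser deliberately ignores comments in parentheses, so it is not a full implementation.

```python
import re


def format_list_header(urls: list[str]) -> str:
    """Join URLs into a List-* field body: <url>, <url>, ..."""
    return ", ".join(f"<{url}>" for url in urls)


def parse_list_header(value: str) -> list[str]:
    """Extract the angle-bracketed URLs from a List-* field body."""
    # Whitespace inside the brackets is ignored, so strip it out.
    return [re.sub(r"\s+", "", url) for url in re.findall(r"<([^>]*)>", value)]
```

Because the value is built from ASCII URLs in a fixed syntax, nothing in it needs to be wrapped in an encoded-word.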

That said, you are probably correct in that MimeKit's current logic is not doing the right thing for these headers (even if only because it is treating them as unstructured headers).

jstedfast added a commit that referenced this pull request Feb 18, 2023
Specifically, this improves handling of List-Archive, List-Help, List-Post,
List-Subscribe, and List-Unsubscribe.

Note: This does not change the handling of List-Id or List-Unsubscribe-Post.

Fixes issue #885
@jstedfast jstedfast closed this Feb 18, 2023
@klinki
Author

klinki commented Feb 20, 2023

Ok, thanks for fixing the List-Unsubscribe headers :)

@klinki klinki deleted the hotfix/headers branch February 28, 2023 10:32