TNEF: Save original codepage for future reference #357
What do you need the charset for? The message body? If so, adding it as a message header is the wrong way to go about solving this. It would be better to specify it directly on the individual MIME part.
It's a bit more complicated. If I try to parse a TNEF message from Office 365 written in Ukrainian, I see the following: MimeKit extracts TNEF body HTML using That is correct, exactly as the TNEF specification requires. But once I present this HTML to the user, they see garbage: the Cyrillic characters are broken. OK, I open the HTML and look inside. It has a meta tag with a different charset. So I have to do the following (tp is TextPart):
After that And if I understand the specification correctly, the OemCodepage attribute is message-wide in terms of TNEF.
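The mismatch described above can be illustrated with a small, language-neutral sketch. This is Python, not MimeKit code, and it is not the workaround from the comment (that snippet did not survive in this thread). The helper names `sniff_meta_charset` and `decode_html_body` are hypothetical, and the regular expression is a simplification that assumes a reasonably well-formed meta tag. The idea: prefer the charset the HTML declares for itself, and fall back to the TNEF OemCodepage only when nothing usable is declared.

```python
import re

# Hypothetical helper: find the charset declared in an HTML <meta> tag.
# Simplified on purpose; a real HTML parser would be more robust.
_META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def sniff_meta_charset(html_bytes):
    """Return the charset name declared in a meta tag, or None."""
    match = _META_CHARSET.search(html_bytes)
    return match.group(1).decode("ascii") if match else None


def decode_html_body(html_bytes, oem_codepage):
    """Decode with the meta-declared charset, else the OEM codepage."""
    declared = sniff_meta_charset(html_bytes)
    if declared:
        try:
            return html_bytes.decode(declared)
        except (LookupError, UnicodeDecodeError):
            pass  # unknown or wrong declaration: fall through
    return html_bytes.decode(f"cp{oem_codepage}", errors="replace")


# A body that says UTF-8 in its meta tag while OemCodepage says 1251.
raw = (
    '<html><head><meta http-equiv="Content-Type" '
    'content="text/html; charset=utf-8"></head>'
    "<body>Привіт</body></html>"
).encode("utf-8")

wrong = raw.decode("cp1251", errors="replace")  # what the user sees as garbage
right = decode_html_body(raw, 1251)             # Cyrillic text intact
```

Decoding by codepage alone produces mojibake for this input, while honoring the meta tag recovers the text. A body with no meta tag still decodes via the codepage fallback.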
Do you think it would be possible to create one of these TNEF attachments (with content that is safe to publish publicly) so that I can add it to my unit tests and play around with it to try to find a nice solution? I'm thinking the nicest solution, assuming I can make it work and it makes sense (which it sounds like it does?), is to automatically tag the TextPart with the OemCharset encoding for you, so that when you get the .Text property, it's already converted for you.
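As a rough illustration of the "tag the part with its charset" idea, here is a minimal Python sketch. It is not MimeKit's implementation: `TextPartSketch` and `part_from_tnef_body` are hypothetical names, and mapping the codepage number to a `windows-NNNN` label is an assumption that only holds for the Windows codepages. The point is that once the part carries its own charset, reading the text needs no knowledge of the TNEF attribute.

```python
from dataclasses import dataclass


@dataclass
class TextPartSketch:
    """Hypothetical stand-in for a MIME text part (not MimeKit's TextPart)."""

    content: bytes
    charset: str  # stored on the part itself, not in a message header

    @property
    def text(self):
        # Decoding uses the charset recorded on the part.
        return self.content.decode(self.charset, errors="replace")


def part_from_tnef_body(body, oem_codepage):
    """Build a part tagged with a charset derived from the OEM codepage."""
    # Assumption: the codepage is a Windows one, e.g. 1251 -> windows-1251.
    return TextPartSketch(content=body, charset=f"windows-{oem_codepage}")


part = part_from_tnef_body("Привіт".encode("cp1251"), 1251)
print(part.charset)
print(part.text)
```

A caller that later finds a different charset declared inside the HTML can still inspect `part.charset` to learn which codepage was applied during extraction.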
Can you try the patch that I just committed above? That will set the Also note that since a Currently the stream used with the
You could also look into using my Once you start playing with my
The patch looks good, thanks! I haven't tried it yet, but it will certainly work in my case. So far I'm working on the concept of the solution; performance issues will come a bit later. But they will come :) I'll try to get a test message on Monday.
Ok, cool. I'll close this as fixed then. If you can get me a sample TNEF attachment that I can use for testing, that would be awesome. Feel free to send that to [email protected]
Office 365 sends quite odd TNEF messages. It often sets the OemCodepage attribute to 1251 and specifies the real charset of the HTML body in a meta tag.
This is not a problem with MimeKit, of course, but life would be much easier if the ConvertToMessage() method indicated the codepage value it used to extract a particular binary property (the HTML body of the message).