.NETStandard, wrong codepage used in TNEF extractor #370

Closed
andrvo opened this issue Jan 29, 2018 · 5 comments
Labels
compatibility Compatibility with existing software

Comments


andrvo commented Jan 29, 2018

Hi,

I've been trying to run the TNEF extractor in a dotnetcore2 environment and ran into a strange problem: the HTML body of any message with a codepage other than 1252 comes out broken.

The debugger shows that GetMessageEncoding().GetString(bytes, 0, length) in TnefPropertyReader throws a NullReferenceException, which causes a fallback to DefaultEncoding.GetString(bytes, 0, length), i.e. the text is decoded with the default encoding, which is 1252.
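To make the failure mode concrete, here is a rough sketch of how a catch-and-fall-back pattern can hide an exception and silently decode with the wrong codepage. This is not MimeKit's actual code: the `FallbackSketch` class, the `Decode` method and its parameters are hypothetical names made up for illustration.

```csharp
using System;
using System.Text;

// Hypothetical illustration only; not the real TnefPropertyReader.
static class FallbackSketch
{
    // Tries the message encoding first; on any exception, falls back to
    // the default encoding and reports what was swallowed.
    public static string Decode(Func<Encoding> getMessageEncoding,
                                Encoding defaultEncoding,
                                byte[] bytes, int length,
                                out Exception swallowed)
    {
        swallowed = null;
        try
        {
            return getMessageEncoding().GetString(bytes, 0, length);
        }
        catch (Exception ex)
        {
            // The caller never sees this unless it is surfaced explicitly.
            swallowed = ex;
            return defaultEncoding.GetString(bytes, 0, length);
        }
    }
}
```

Passing a delegate that returns null is a cheap way to provoke a NullReferenceException on the first path. The point is only that the result is still a string, just decoded with the wrong encoding, which matches the "broken HTML body" symptom.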

If I replace Encoding.GetEncoding (codepage, new EncoderExceptionFallback(), new DecoderExceptionFallback()) in GetMessageEncoding() with Encoding.GetEncoding (codepage), things get back to normal.
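For comparison, a minimal sketch of the two calls side by side. Both overloads are standard System.Text APIs; the class and method names below are invented for this example, and on .NET Core it assumes CodePagesEncodingProvider is available (this may require the System.Text.Encoding.CodePages package).

```csharp
using System;
using System.Text;

// Hypothetical helper names; only the Encoding calls come from the report.
static class CodepageOverloads
{
    // Makes Windows codepages such as 1251 resolvable on .NET Core.
    public static void Register()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Strict overload: encoder and decoder throw on invalid input.
    public static string DecodeStrict(int codepage, byte[] bytes)
    {
        var enc = Encoding.GetEncoding(codepage,
            new EncoderExceptionFallback(), new DecoderExceptionFallback());
        return enc.GetString(bytes, 0, bytes.Length);
    }

    // Plain overload: uses the encoding's default fallback behavior.
    public static string DecodeLenient(int codepage, byte[] bytes)
    {
        var enc = Encoding.GetEncoding(codepage);
        return enc.GetString(bytes, 0, bytes.Length);
    }
}
```

With valid input such as the bytes CF F0 E8 E2 E5 F2 in codepage 1251 ("Привет"), both methods would normally be expected to return the same string; the overloads should only differ when the input contains bytes that cannot be decoded.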

How to repeat:

  • create a new netcore2 unit test project
  • copy the existing TestExtractedCharset() to the new project
  • remove the incompatible parts, since it won't build as is; only Assert.AreEqual(expected, html) is actually needed
  • run it; the assert fails

I didn't find a proper way to fix this. GetString is never supposed to throw a NullReferenceException, of course. On the other hand, I tried to pull the exact test data and reproduce the misbehavior of GetString in a clean project, with no luck: it works fine, throws no exceptions, and decodes correctly.

dotnet --version
2.1.4
Windows 10

@jstedfast
Owner

Correct me if I'm wrong, but that sounds like a bug in netcore2's Encoding.GetEncoding() implementation, right?

Could you submit a bug report to them to fix this?

Thanks.

@jstedfast
Owner

It sounds like Encoding.GetEncoding (codepage, new EncoderExceptionFallback(), new DecoderExceptionFallback()) is returning null, which it's never supposed to do.

If it can't find an appropriate encoding to return, it is supposed to throw an exception.
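One way to check that contract is a small probe along these lines. It is only a sketch: `GetEncodingContract` and `Probe` are hypothetical names, and the set of exceptions caught reflects the documented behavior of Encoding.GetEncoding (ArgumentException and its subclasses, or NotSupportedException), not anything specific to MimeKit.

```csharp
using System;
using System.Text;

// Hypothetical diagnostic helper, not part of MimeKit.
static class GetEncodingContract
{
    // Reports whether the lookup returned an encoding, returned null,
    // or threw one of the documented exception types.
    public static string Probe(int codepage)
    {
        try
        {
            var enc = Encoding.GetEncoding(codepage,
                new EncoderExceptionFallback(), new DecoderExceptionFallback());
            return enc == null ? "null" : "ok: " + enc.WebName;
        }
        catch (Exception ex) when (ex is ArgumentException || ex is NotSupportedException)
        {
            return "threw " + ex.GetType().Name;
        }
    }
}
```

A "null" result from `Probe` would confirm the diagnosis above, while an "ok" result would point at GetString rather than GetEncoding.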

Can you verify this diagnosis?

@andrvo
Author

andrvo commented Jan 29, 2018

Yes, it looks exactly like a netcore bug. I tried to submit a report there and wrote a very simple sample, just a few lines of code:

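    // Requires: using System.Text;
    var bytes = System.IO.File.ReadAllBytes(@"c:\temp\nullbytes.bin");
    // Register the code-page provider so that codepage 1251 can be resolved
    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    // Same strict overload that GetMessageEncoding() uses
    var enc = Encoding.GetEncoding(1251, new EncoderExceptionFallback(), new DecoderExceptionFallback());
    return enc.GetString(bytes, 0, bytes.Length);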

But when I feed exactly the same bytes to this fragment, it works. So I ended up here.

GetEncoding doesn't return null; I checked. The exception comes from GetString, for sure.

Thanks.

jstedfast added a commit that referenced this issue Jan 29, 2018
Works around an NRE bug in netcore2 - issue #370
@jstedfast
Owner

We probably don't need the exception fallbacks, so since that worked for you, I've committed a work-around.

Hopefully the real bug gets fixed in netcore2, though.

jstedfast added the compatibility (Compatibility with existing software) label Jan 29, 2018
@andrvo
Author

andrvo commented Jan 29, 2018

Cool, thanks.
