Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'ß' is replaced through 'ss' before transcoding in punycode #801

Closed
samuraev opened this issue Jun 3, 2022 · 19 comments
Closed

'ß' is replaced through 'ss' before transcoding in punycode #801

samuraev opened this issue Jun 3, 2022 · 19 comments
Labels
compatibility Compatibility with existing software dotnet-bug A bug in the .NET runtime or class libraries.

Comments

@samuraev
Copy link

samuraev commented Jun 3, 2022

Describe the bug

I noticed that by using MailKit, sending emails from UTF8 recipients is not working as expected. Exactly to say, "ß" would be firstly replaced into "ss" and then transcoded with punycode. And it is quite different domain. MailFrom and HeaderFrom converted with "ss" and not with "ß" as expected.

e.g.: strauß-blümen.test should be xn--strau-blmen-z6a49a.test but coded as xn--strauss-blmen-5ob.test from strauss-blümen.test

Platform:
OS: Windows Server 2016
.NET Framework: 4.8
MailKit Version: 3.2.0

To Reproduce
Steps to reproduce the behaviour:
send an email from address with help of MailKit which contains "ß" e.g: test@staruß-blümen.test

Thank you and best regards,
Alex

@ekalchev
Copy link

ekalchev commented Jun 3, 2022

My wild guess is that this is server issue. Why don't you try your test with another mail server see if the problem still occurs?

@samuraev
Copy link
Author

samuraev commented Jun 3, 2022

Thank you for the quick answer. I think that the problem is on MimeKit level and depends on which framework is used: i use .Net Framework 4.8. I see wrong coded address before sending an email in Visual Studio in Mime Message. As I have heard .Net Core does not have this problem.

I try to create a small example project. Maybe it helps better.

@jstedfast
Copy link
Owner

I'm on a Mac at the moment, but testing this out quickly in the Mono v6.12.0 C# REPL, this is what I get:

csharp> var idn = new System.Globalization.IdnMapping ()
csharp> idn.GetAscii ("strauß-blümen.test")
"xn--strau-blmen-z6a49a.test"
csharp>

This seems to match what you expect.

I'll jump to Windows and test in LINQPad.

@jstedfast
Copy link
Owner

Okay, I'm on my Windows box now...

Running the following statements in LINQPad 7:

var idn = new System.Globalization.IdnMapping ();
var ascii = idn.GetAscii ("strauß-blümen.test");
Console.WriteLine (ascii);
  • .NET 6.0.4: xn--strau-blmen-z6a49a.test
  • .NET 5.0.17: xn--strau-blmen-z6a49a.test
  • .NET Core 3.1: xn--strauss-blmen-5ob.test

With LINQPad 5 (needed for .NET 4.x):

  • .NET 4.x: xn--strauss-blmen-5ob.test

@jstedfast
Copy link
Owner

This seems to be a bug in the .NET Framework :-(

@jstedfast
Copy link
Owner

Interesting: https://www.unicode.org/reports/tr46/#Table_Deviation_Characters

For some reason .NET Framework is using IDNA2003 instead of IDNA2008.

@henning-krause
Copy link

@jstedfast We're using this code in the .NET Framework 4.8 to support conversion of domains containing an 'ß':

public static class SafeIdnMapping
{
	private const int IdnRawPunycode = 8;

	public static string ToUnicode([NotNull] string input)
	{
		if (input == null) throw new ArgumentNullException(nameof(input));
		var buffer = ArrayPool<char>.Shared.Rent(input.Length*2);

		try
		{
			var result = SafeNativeMethods.IdnToUnicode(IdnRawPunycode, input, input.Length, buffer, buffer.Length);
			if (result == 0) throw new Win32Exception();

			return new string(buffer, 0, result);
		}
		finally
		{
			ArrayPool<char>.Shared.Return(buffer);
		}
	}
}

Perhaps you could use this code to support this scenario in MailKit?

@jstedfast
Copy link
Owner

Probably not because it would mean that MailKit would no longer be cross-platform (you seem to be P/Invoking into a win32 dll)

@henning-krause
Copy link

Yes, I'm using P/Invoke here. You could use conditional compilation for this...

#if NET48
// Code specific to net48
#else
// standard code
#end

@henning-krause
Copy link

By the way, this is the P/Invoke signature:

[SecurityCritical]
[SuppressUnmanagedCodeSecurity]
[DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
public static extern int IdnToUnicode(
	uint dwFlags,
	[MarshalAs(UnmanagedType.LPWStr), In] string lpASCIICharStr,
	int cchASCIIChar,
	[Out] char[] lpUnicodeCharStr,
	int cchUnicodeChar);

@jstedfast
Copy link
Owner

You could use conditional compilation for this...

I could, but that doesn't solve the cross-platform issue.

@henning-krause
Copy link

Wouldn't it? On all platforms except .NEt 4.8, the code would use the .NET implementation and on .NET 4.8 (and lower) it could use the Windows P/Invoke method. Since .NET 4.8 only runs on Windows, I don't see the issue.

What am I missing here?

@jstedfast
Copy link
Owner

You are forgetting about Mono on MacOS and Linux.

@henning-krause
Copy link

Hmm, okay.

I'm not sure where in the code you are doing the IDN mapping. Perhaps you could add an extension point where we could hook in?

@henning-krause
Copy link

I just had a quick look at it. It seems that the IdnMapping is only used in two places in the MailKit solution. If I create a PR with an extension point, would you consider integrating it?

@jstedfast
Copy link
Owner

What is your proposed API for this?

@henning-krause
Copy link

henning-krause commented Jun 12, 2022

The IdnMapping is currently being used in to classes: SmtpClient and ParseUtils.

My proposal is to wrap existing IdnMapping class in an interface, provide a default implementation and access it via a mutable singleton:

public interface IIdnMapper
{
	string GetAscii (string unicode);
	string GetUnicode (string ascii);
}

public class IdnMapper : IIdnMapper
{
	public static IIdnMapper Instance = new IdnMapper ();

	static readonly IdnMapping idn = new IdnMapping ();

	public string GetAscii (string unicode)
	{
		return idn.GetAscii (unicode);
	}

	public string GetUnicode (string ascii)
	{
		return idn.GetUnicode (ascii);
	}
}

We could then implement our own implementation of that interface and use the native interop without compromising the platform independence of the library itself.

jstedfast referenced this issue Jun 12, 2022
MailboxAddress.IdnMapping is a new static property that can be overridden
to use any punycode implementation that implements the new IPunycode
interface.

Part of the fix for https://github.com/jstedfast/MailKit/issues/1387
@jstedfast jstedfast transferred this issue from jstedfast/MailKit Jun 12, 2022
@jstedfast jstedfast changed the title MailKit: 'ß' is replaced through 'ss' before transcoding in punycode 'ß' is replaced through 'ss' before transcoding in punycode Jun 12, 2022
jstedfast added a commit that referenced this issue Jun 12, 2022
MailboxAddress.IdnMapping is a new static property that can be overridden
to use any punycode implementation that implements the new IPunycode
interface.

Fixes issue #801
@jstedfast jstedfast added dotnet-bug A bug in the .NET runtime or class libraries. compatibility Compatibility with existing software labels Jun 12, 2022
@jstedfast
Copy link
Owner

I think this is completely solved just doing this for MimeKit.

SmtpClient's static .ctor uses the IdnMapping class to get a default domain name for the local machine to use in EHLO commands. But this string can be manually overridden via a property on the SmtpClient.

@jstedfast
Copy link
Owner

MimeKit v3.4.0 has been released with this feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compatibility Compatibility with existing software dotnet-bug A bug in the .NET runtime or class libraries.
Projects
None yet
Development

No branches or pull requests

4 participants