Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass /utf-8 to cl.exe, sanitize encoding #458

Closed
wants to merge 1 commit into from

Conversation

fghzxm
Copy link
Contributor

@fghzxm fghzxm commented May 7, 2019

Use UTF-8 as source and executable encodings for C++ projects that
contain UTF-8 source files.

Tree 35229a7 fails to build where the
default code page is 936 (GBK), due to multiple files containing UTF-8
encoded strings and/or comments, and these projects set to
fail-on-warning mode. Additionally,
src/host/ut_host/UnicodeLiteral.hpp contains a Windows-1252 encoded
string. This commit adds the /utf-8 option to the affected projects,
and converts src/host/ut_host/UnicodeLiteral.hpp to UTF-8.

Use UTF-8 as source and executable encodings for C++ projects that
contain UTF-8 source files.

Tree 35229a7 fails to build where the
default code page is 936 (GBK), due to multiple files containing UTF-8
encoded strings and/or comments, and these projects set to
fail-on-warning mode.  Additionally,
`src/host/ut_host/UnicodeLiteral.hpp` contains a Windows-1252 encoded
string.  This commit adds the `/utf-8` option to the affected projects,
and converts `src/host/ut_host/UnicodeLiteral.hpp` to UTF-8.
@msftclas
Copy link

msftclas commented May 7, 2019

CLA assistant check
All CLA requirements met.

@@ -1,5 +1,5 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT license.

#define REMOTE_STRING L"remote npipe:pipe=foo,server=bar\t"
#define REMOTE_STRING L"remote npipe:pipe=foo,server=bar\t"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these even the right type of quotes? Especially since on the next line, they expect 'normal' quotes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that the quotes are correct. it's possible to ask the console to filter some of the "smart" characters back to their normal ascii equivalents so that they don't break other applications that might not handle them correctly. This particular header is only included in the clipboard unit tests but isn't actually used, I suspect they were part of a test that was removed/lost/not written.

@zadjii-msft
Copy link
Member

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@@ -99,6 +99,15 @@
<ItemDefinitionGroup>
<ClCompile>
<AdditionalIncludeDirectories>..;$(SolutionDir)src\inc;$(SolutionDir)src\inc\test;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<AdditionalOptions Condition="'$(Configuration)|$(Platform)'=='AuditMode|ARM64'">/utf-8 %(AdditionalOptions)</AdditionalOptions>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checking the configuration and platform isn't really necessary for this PR's change since it's the same setting for every combination of configurations and platforms.

@@ -99,6 +99,15 @@
<ItemDefinitionGroup>
<ClCompile>
<AdditionalIncludeDirectories>..;$(SolutionDir)src\inc;$(SolutionDir)src\inc\test;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<AdditionalOptions Condition="'$(Configuration)|$(Platform)'=='AuditMode|ARM64'">/utf-8 %(AdditionalOptions)</AdditionalOptions>
<AdditionalOptions Condition="'$(Configuration)|$(Platform)'=='Debug|ARM64'">/utf-8 %(AdditionalOptions)</AdditionalOptions>
<AdditionalOptions Condition="'$(Configuration)|$(Platform)'=='Release|ARM64'">/utf-8 %(AdditionalOptions)</AdditionalOptions>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have a common props file at /src/common.build.pre.props that already defines some additional options for all of the projects in the solution. It might make more sense to enable utf-8 across the whole repo and standardize all of the files on that instead of mixing encodings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are like 200+ C3437: #pragma execution_character_set is not supported when /source-charset, /execution-charset, or /utf-8 has been specified errors jumping out when /utf-8 is applied globally. Most of them seem to result from macros like TRACELOGGING_DEFINE_PROVIDER and TraceLogging*.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hrmm, I see what you mean. so it sounds like the tracelogging stuff doesn't play nice with the /utf-8 flag. It looks like it's setting the execution_character_set to be utf-8, which we want anyway. I'll need to do some more reading about what's going on here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's kind of self-contradictory that the Windows SDK, even the latest versions, uses a compiler directive that has been deprecated since VS 2015...

@@ -1,5 +1,5 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT license.

#define REMOTE_STRING L"remote npipe:pipe=foo,server=bar\t"
#define REMOTE_STRING L"remote npipe:pipe=foo,server=bar\t"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that the quotes are correct. it's possible to ask the console to filter some of the "smart" characters back to their normal ascii equivalents so that they don't break other applications that might not handle them correctly. This particular header is only included in the clipboard unit tests but isn't actually used, I suspect they were part of a test that was removed/lost/not written.

@aaronfranke
Copy link

Does cl.exe support dashes for arguments? If so, they should be used instead (-utf-8 or --utf-8). Many other Microsoft products have switched to dashes, such as PowerShell and .NET.

@fghzxm
Copy link
Contributor Author

fghzxm commented May 22, 2019

I just built the latest master (6c7dfd2) without encountering encoding issues. I'm not clear if some commit into master has resolved this; if this is the case then I think I can close this.

@DHowett-MSFT
Copy link
Contributor

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@LinHeLurking
Copy link

It still fails with current(2019-06-14) master(d82eab4) on a Chinese version of Windows whose code page is 936(GBK). Issues got solved by passing these utf-8 options to cl. So I think this commit helps non-english users.

@miniksa miniksa mentioned this pull request Jul 11, 2019
5 tasks
@miniksa
Copy link
Member

miniksa commented Jul 11, 2019

I did this in #1929 for the entire project and deleted UnicodeLiteral.hpp because no one was using it. So I'll close this.

@miniksa miniksa closed this Jul 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants