Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICU-20575 fix broken default locale mapping for C.UTF-8 #634

Merged
merged 1 commit into from
Apr 25, 2019

Conversation

srl295
Copy link
Member

@srl295 srl295 commented Apr 25, 2019

  1. LANG=C.UTF-8 would not map to en_US_POSIX but to invalid language c
  2. LANG=aa@BB (and similar) would map to aa__BB__BB
Checklist

@srl295 srl295 requested review from markusicu and jefgen April 25, 2019 17:47
@srl295 srl295 self-assigned this Apr 25, 2019
@srl295
Copy link
Member Author

srl295 commented Apr 25, 2019

Test case: env LANG=C.UTF-8 icuinfo should return:

    <param name="locale.default">en_US_POSIX</param>
    <param name="locale.default.bcp47">en-US-u-va-posix</param>

Before this bug, they return c and und respectively.

@markusicu
Copy link
Member

I really don't like these mappings at this level. There has to be a way to handle "C.UTF-8" in the getDefaultLocale() code.

@jefgen
Copy link
Member

jefgen commented Apr 25, 2019

FWIW, I believe this exacerbated the issue with the DateTimePatternGenerator (ICU-20558), as "C.UTF-8" would be treated as a default locale of root, rather than "en_US_POSIX".

@srl295
Copy link
Member Author

srl295 commented Apr 25, 2019

I really don't like these mappings at this level. There has to be a way to handle "C.UTF-8" in the getDefaultLocale() code.

Markus, makes sense. I'll see what i can do.

@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • icu4c/source/common/putil.cpp is now changed in the branch
  • icu4c/source/common/uloc.cpp is no longer changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • icu4c/source/common/putil.cpp is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@srl295
Copy link
Member Author

srl295 commented Apr 25, 2019

@markusicu PTAL at the updated approach. There was another bug, where aa@bb turned into aa_BB_BB (doubled…). I think C@PREEURO never worked (even back to 4.8). Not that important, but at least we should not overwrite the variant twice and be consistent. (I've updated the bug to mention this)

@markusicu
Copy link
Member

Thanks for fixing it all via putil.cpp!

Is the PR description still accurate?

@srl295
Copy link
Member Author

srl295 commented Apr 25, 2019

@markusicu welcome. the PR is accurate now (and the commit message already was)

icu4c/source/common/putil.cpp Show resolved Hide resolved
// Over-allocate in case we replace "@" with "__".
char *correctedPOSIXLocale = static_cast<char *>(uprv_malloc(uprv_strlen(posixID) + 1 + 1));
// Over-allocate in case we replace "@" with "__" or "C" with "en_US_POSIX"
char *correctedPOSIXLocale = static_cast<char *>(uprv_malloc(uprv_strlen(posixID) + 11 + 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI +10 would suffice because en_US_POSIX has length 11 which is 10 longer than C.
Unless I am mis-counting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right.

Regression was in 1afef30
PR unicode-org#418 [ICU-20187]

- We dropped the mapping from "C" in uloc_canonicalize,
  but then putil did not handle cases where a codepage was
  set (such as C.UTF-8).

- Add an additional check in uprv_getDefaultLocaleID() for
  locales that end up as "C" or "POSIX" after removing codepage
  suffix.

- Also fix regression where aa@bb would become aa__BB__BB
  (incorrectly doubled __BB)
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • icu4c/source/common/putil.cpp is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

Copy link
Member

@markusicu markusicu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@srl295 srl295 merged commit 075cefb into unicode-org:master Apr 25, 2019
@srl295 srl295 deleted the fix-ICU-20575 branch April 25, 2019 21:59
srl295 added a commit to srl295/node that referenced this pull request Apr 26, 2019
- Floating patch for ICU 64.x
- includes test case

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575
Backport of: unicode-org/icu#634
Fixes: nodejs#27418
targos pushed a commit to nodejs/node that referenced this pull request Apr 27, 2019
- Floating patch for ICU 64.x
- includes test case

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575
Backport of: unicode-org/icu#634
Fixes: #27418

PR-URL: #27435
Reviewed-By: Refael Ackermann <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
targos pushed a commit to nodejs/node that referenced this pull request Apr 27, 2019
- Floating patch for ICU 64.x
- includes test case

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575
Backport of: unicode-org/icu#634
Fixes: #27418

PR-URL: #27435
Reviewed-By: Refael Ackermann <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
BethGriggs pushed a commit to nodejs/node that referenced this pull request May 10, 2019
- Floating patch for ICU 64.x
- includes test case

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575
Backport of: unicode-org/icu#634
Fixes: #27418

PR-URL: #27435
Reviewed-By: Refael Ackermann <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
MylesBorins pushed a commit to nodejs/node that referenced this pull request May 16, 2019
- Floating patch for ICU 64.x
- includes test case

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575
Backport of: unicode-org/icu#634
Fixes: #27418

PR-URL: #27435
Reviewed-By: Refael Ackermann <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Michaël Zasso <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants