This repository has been archived by the owner on Jan 25, 2022. It is now read-only.

Consider performing complete Unicode extension canonicalization per RFC6067 #14

Closed
anba opened this issue Feb 6, 2018 · 1 comment · Fixed by #21

Comments

@anba
Contributor

anba commented Feb 6, 2018

That means sorting all keys and attributes per https://tools.ietf.org/html/rfc6067#section-2.1.1.

We're already halfway there (known keys are deduplicated and sorted), so maybe we should also take the last step to full canonicalization per RFC 6067?

And maybe also remove duplicate attributes, which are ignored per https://tools.ietf.org/html/rfc6067#section-2.1, so that we handle them consistently with duplicate keys:

Only the first occurrence of an attribute or key conveys meaning in a
language tag. When interpreting tags containing the Unicode locale
extension, duplicate attributes or keywords are ignored in the
following way: ignore any attribute that has already appeared in the
tag and ignore any keyword whose key has already occurred in the tag.
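
To make the proposal concrete, here is a minimal sketch of the two steps discussed above: ignoring duplicate attributes and keywords (first occurrence wins) and then sorting attributes and keys. The function name `canonicalizeUnicodeExtension` is hypothetical and is not taken from the spec text or any implementation. The sketch assumes the input is a single, well-formed `u` extension sequence and does not validate subtag lengths beyond telling two-character keys apart from longer subtags.

```typescript
// Hypothetical helper (name and shape are assumptions for illustration).
// Input: one well-formed Unicode extension sequence, e.g. "u-nu-latn-ca-gregory".
function canonicalizeUnicodeExtension(extension: string): string {
  const subtags = extension.toLowerCase().split("-");
  if (subtags[0] !== "u" || subtags.length < 2) {
    throw new RangeError("not a Unicode extension sequence: " + extension);
  }

  const attributes: string[] = [];
  const keywords = new Map<string, string[]>();
  // Type subtags of the keyword currently being read; null while still
  // reading attributes (which precede the first key).
  let currentType: string[] | null = null;

  for (const subtag of subtags.slice(1)) {
    if (subtag.length === 2) {
      // A two-character subtag starts a new keyword.
      currentType = [];
      if (!keywords.has(subtag)) {
        keywords.set(subtag, currentType);
      }
      // Otherwise the key is a duplicate: its type subtags are collected
      // into a detached array and thereby ignored.
    } else if (currentType === null) {
      // Attribute: keep only the first occurrence.
      if (attributes.indexOf(subtag) === -1) {
        attributes.push(subtag);
      }
    } else {
      currentType.push(subtag);
    }
  }

  // Sort attributes and keys alphabetically (all subtags are lowercase ASCII).
  attributes.sort();
  const sortedKeys = Array.from(keywords.keys()).sort();

  const parts: string[] = ["u", ...attributes];
  for (const key of sortedKeys) {
    parts.push(key, ...keywords.get(key)!);
  }
  return parts.join("-");
}
```

With this sketch, `canonicalizeUnicodeExtension("u-nu-latn-ca-gregory")` returns `"u-ca-gregory-nu-latn"`, and `canonicalizeUnicodeExtension("u-foo-bar-foo-nu-latn-ca-gregory-nu-arab")` returns `"u-bar-foo-ca-gregory-nu-latn"`: the second `foo` and the second `nu` keyword are dropped, and the rest is sorted.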

anba mentioned this issue Feb 6, 2018
@littledan
Member

Seeing how #13 turned out, sorting all the keys and attributes sounds like the right approach to me. A big benefit is that, as we add more recognized options, the output of `new Intl.Locale(...).toString()` won't change, because we always canonicalize. I'm inclined to merge a patch that makes this change unless I hear of downsides. The main reason I didn't specify this already is that I didn't know how canonicalization of Unicode extension keys was defined.

cc @jungshik @zbraniecki @nciric
