Consider performing complete Unicode extension canonicalization per RFC6067 #14

anba · 2018-02-06T18:00:01Z

That means sorting all keys and attributes per https://tools.ietf.org/html/rfc6067#section-2.1.1.

We're already halfway there (deduplication and sorting of known keys), so maybe we should also take the last step to full canonicalization per RFC6067?

And maybe also remove duplicate attributes, which are considered irrelevant per https://tools.ietf.org/html/rfc6067#section-2.1, so that we handle them consistently with duplicate keys:

Only the first occurrence of an attribute or key conveys meaning in a
language tag. When interpreting tags containing the Unicode locale
extension, duplicate attributes or keywords are ignored in the
following way: ignore any attribute that has already appeared in the
tag and ignore any keyword whose key has already occurred in the tag.
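
A minimal sketch of what this could look like, assuming the extension has already been isolated from the rest of the tag. The helper name `canonicalizeUnicodeExtension` is hypothetical and is not taken from the proposal's spec text. It keeps only the first occurrence of each attribute and key, then sorts the attributes and sorts the keywords by key. It does not validate subtag lengths or reject malformed input beyond checking for the leading "u".

```javascript
// Hypothetical helper (not part of the Intl.Locale spec text): canonicalizes a
// single Unicode extension sequence such as "u-foo-bar-nu-latn-ca-gregory".
function canonicalizeUnicodeExtension(extension) {
  const subtags = extension.toLowerCase().split("-");
  if (subtags[0] !== "u" || subtags.length < 2) {
    throw new RangeError(`not a Unicode extension sequence: ${extension}`);
  }

  const attributes = [];
  const keywords = new Map(); // key -> array of type subtags
  let currentKey = null;
  let skipping = false; // true while reading the type of a duplicate key

  for (const subtag of subtags.slice(1)) {
    if (subtag.length === 2) {
      // A two-character subtag starts a new keyword.
      currentKey = subtag;
      skipping = keywords.has(subtag);
      if (!skipping) keywords.set(subtag, []);
    } else if (currentKey === null) {
      // Subtags before the first key are attributes; keep the first occurrence.
      if (!attributes.includes(subtag)) attributes.push(subtag);
    } else if (!skipping) {
      // Type subtag belonging to the first occurrence of the current key.
      keywords.get(currentKey).push(subtag);
    }
  }

  // Lowercase ASCII subtags sort identically in code-unit and ASCII order.
  attributes.sort();
  const sortedKeywords = [...keywords.entries()]
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([key, type]) => [key, ...type].join("-"));

  return ["u", ...attributes, ...sortedKeywords].join("-");
}

console.log(canonicalizeUnicodeExtension("u-foo-bar-foo-nu-latn-ca-gregory-nu-arab"));
// → u-bar-foo-ca-gregory-nu-latn
```

In the example call, the second "foo" and the second "nu" keyword are dropped because only the first occurrence conveys meaning, and the remaining attributes and keywords come out sorted.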

littledan · 2018-02-07T22:21:28Z

Seeing how #13 turned out, sorting all the tags sounds like the right approach to me. A big benefit will be that, as we add more options that are recognized, the output of new Intl.Locale(...).toString() doesn't change; we just always canonicalize. I'm inclined to merge a patch which makes this change unless I hear downsides. The main reason I didn't specify this already is that I didn't know what the definition of canonicalizing Unicode extension keys was.

cc @jungshik @zbraniecki @nciric
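
To illustrate the stability benefit described above, here is a small sketch. It assumes a runtime that implements Intl.Locale with full canonicalization of the Unicode extension, as proposed in this issue. The tags are arbitrary examples.

```javascript
// Two tags that differ only in the order of their Unicode extension keywords.
const a = new Intl.Locale("en-u-nu-latn-ca-gregory").toString();
const b = new Intl.Locale("en-u-ca-gregory-nu-latn").toString();

// With full canonicalization, keyword order in the input should not matter.
console.log(a === b);
```

Because every key is sorted regardless of whether the implementation recognizes it, the serialized form should not change when new keys become recognized later.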

anba mentioned this issue Feb 6, 2018

Various spec fixes #13

Merged

anba mentioned this issue Feb 12, 2018

Canonicalize Unicode extensions #21

Merged

littledan closed this as completed in #21 Apr 12, 2018
