parsing_json.php improvements #75

Open
Alcaro opened this issue Nov 27, 2017 · 0 comments


Alcaro commented Nov 27, 2017

(Apologies if this isn't the correct place, I couldn't find anything better.)

Just found http://seriot.ch/parsing_json.php. Great writeup, it's surprising how something so seemingly simple can have so many ways to screw up. I found a few possible improvements:

i_string_iso_latin_1.json | ["E9"]
n_string_invalid_utf-8.json | ["FF"]

As of #30, both are i_, so the n_ prefix on the second one is outdated.
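For reference, a quick Python check showing that both byte sequences are invalid UTF-8, so there is no obvious reason to classify them differently. This is only a sketch: it assumes the files contain the raw bytes E9 and FF between the quotes, and `is_valid_utf8` is a helper name made up for this example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False

# Assumed file contents: [" <byte> "] with a single raw byte inside the string.
latin1_doc = b'["\xe9"]'   # E9 is "é" in ISO Latin-1, but a truncated 3-byte lead in UTF-8
invalid_doc = b'["\xff"]'  # FF never appears in well-formed UTF-8

print(is_valid_utf8(latin1_doc))
print(is_valid_utf8(invalid_doc))
```

Both calls report the input as invalid, which matches treating the two tests the same way.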

["\uD800\uD800"] makes some parsers go nuts. R jsonlite yields ["\U00010000"], while Ruby parser yields ["F0908080"]. I still don't get where this value comes from.

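Overeager decoding of surrogate pairs. \uD800\uDC00 should yield \U00010000. I guess jsonlite ignores the top 6 bits of the supposed low surrogate, keeping only the low 10 bits without checking that the value is in the DC00 to DFFF range. F0908080 is \U00010000 in UTF-8, so the Ruby parser is presumably making the same mistake.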
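A minimal sketch of the suspected behaviour, in Python. The function names are made up for illustration and are not taken from jsonlite or the Ruby parser; this is a guess at the logic, not their actual code.

```python
def decode_pair_lenient(hi: int, lo: int) -> int:
    """Combine two UTF-16 code units by masking, with no range checks (the suspected bug)."""
    return 0x10000 + ((hi & 0x3FF) << 10) + (lo & 0x3FF)

def decode_pair_strict(hi: int, lo: int) -> int:
    """Combine a surrogate pair, rejecting anything that is not high followed by low."""
    if not 0xD800 <= hi <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {hi:#06x}")
    if not 0xDC00 <= lo <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {lo:#06x}")
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)

# \uD800\uD800: the second unit is another high surrogate, not a low one.
cp = decode_pair_lenient(0xD800, 0xD800)
print(hex(cp))                                  # same code point as \uD800\uDC00
print(chr(cp).encode("utf-8").hex().upper())    # its UTF-8 encoding

# The strict version accepts the valid pair and rejects the invalid one.
print(hex(decode_pair_strict(0xD800, 0xDC00)))
try:
    decode_pair_strict(0xD800, 0xD800)
except ValueError as e:
    print("rejected:", e)
```

With masking alone, D800 and DC00 are indistinguishable in the second position, which would explain both reported outputs.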
