(Apologies if this isn't the correct place, I couldn't find anything better.)
Just found http://seriot.ch/parsing_json.php. Great writeup, it's surprising how something so seemingly simple can have so many ways to screw up. I found a few possible improvements:
["\uD800\uD800"] makes some parsers go nuts. R jsonlite yields ["\U00010000"], while Ruby parser yields ["F0908080"]. I still don't get where this value comes from.
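This looks like overeager decoding of surrogate pairs. \uD800\uDC00 should yield \U00010000; I guess jsonlite keeps only the low 10 bits of the supposed low surrogate and ignores its top 6 bits, so \uD800\uD800 also decodes to \U00010000. F0908080 is \U00010000 encoded in UTF-8 (bytes F0 90 80 80), so the Ruby parser appears to be ignoring those top bits in the same way.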
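To illustrate the guess above, here is a minimal Python sketch. It is not taken from jsonlite or the Ruby parser; `decode_pair_naive` and `decode_pair_strict` are hypothetical names, and the masking in the naive version is only an assumption about what those parsers might be doing. It just shows that masking the second code unit instead of validating it reproduces both reported values.

```python
def decode_pair_naive(hi: int, lo: int) -> int:
    """Combine two UTF-16 code units WITHOUT checking their ranges.

    Hypothetical buggy decoder: only the low 10 bits of each unit are used,
    so the top 6 bits (which distinguish high from low surrogates) are ignored.
    """
    return 0x10000 + ((hi & 0x3FF) << 10) + (lo & 0x3FF)


def decode_pair_strict(hi: int, lo: int) -> int:
    """Combine a surrogate pair, rejecting anything that is not high + low."""
    if not 0xD800 <= hi <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {hi:#06x}")
    if not 0xDC00 <= lo <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {lo:#06x}")
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)


# Two high surrogates in a row, as in ["\uD800\uD800"].
naive = decode_pair_naive(0xD800, 0xD800)
print(hex(naive))                                # 0x10000
print(chr(naive).encode("utf-8").hex().upper())  # F0908080

# The strict decoder refuses the same input.
try:
    decode_pair_strict(0xD800, 0xD800)
except ValueError as exc:
    print("rejected:", exc)
```

A parser that validates the second unit can then decide how to handle the lone surrogates (reject, replace, or pass through), rather than silently producing U+10000.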
As of #30, both are i_.