Add support for Unicode codepoints #217
@daumayr hmmm, is there a good reason why we have a |
@daumayr can we debate the current design, and restrictions? I think, it would be less restrictive if |
I would argue we want to have something similar to this:
Possibly with a check that |
@smarr
The methods from Character could be moved to String. |
OK, good. I think I should document the Dart preference as part of a style guide in the docs. I probably also need to document that character literals don't support long Unicode code points (such as emojis). |
@daumayr could you have a look at this again and see whether it satisfies all your requirements? @Richard-Roberts this might be interesting to you. We now have more robust Unicode support. |
looks good |
- added KernelObj.signalExceptionWithClass helper method
- avoids code duplication and is explicit about what is done
Signed-off-by: Stefan Marr <[email protected]>
Since our integers can encode all Unicode characters, it would be sad if we didn't actually use that. This implementation extracts some bits of java.lang.Character to enable Truffle-based specialization.
- added tests
- turn Character class>>from: to String class>>fromCodepoint:
- also fix implementation, make sure specializations are not overlapping
- added tests for argument error
- make sure we have specialization for Symbol>>#charAt:
- otherwise, it falls back to the substring-based implementation in String when it failed with SSymbol objects
- fixed literal parsing, became relevant for Unicode character literals
- use @Fallback instead of @Specialization without guard
- fallback implies negation, specialization without guard is just the most generic case
Signed-off-by: Stefan Marr <[email protected]>
Signed-off-by: Stefan Marr <[email protected]>
This includes general suggestions on errors vs. exceptions. Signed-off-by: Stefan Marr <[email protected]>
These changes introduce support for converting characters, i.e., elements of a string, into their Integer representation as Unicode code points. They also introduce support for creating characters from Integers that represent code points.
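For illustration, here is a minimal plain-Java sketch of the two conversions described above. It is not the code from this PR: the class name CodepointSketch and both method names are hypothetical, and IllegalArgumentException merely stands in for the ArgumentError discussed in this thread. It only relies on java.lang.Character and java.lang.String methods from the JDK.

```java
// Hypothetical sketch, not the SOMns implementation.
final class CodepointSketch {

    // Rough counterpart of String class>>fromCodepoint: (name assumed for illustration).
    // Rejects integers outside the Unicode range U+0000..U+10FFFF.
    static String fromCodepoint(long codepoint) {
        if (codepoint < Character.MIN_CODE_POINT || codepoint > Character.MAX_CODE_POINT) {
            // Stand-in for signalling an ArgumentError
            throw new IllegalArgumentException("Not a Unicode code point: " + codepoint);
        }
        // toChars yields one char for the BMP, two chars (a surrogate pair) above it
        return new String(Character.toChars((int) codepoint));
    }

    // Converts the character starting at the given index into its code point.
    static int codepointAt(String s, int index) {
        return s.codePointAt(index);
    }

    public static void main(String[] args) {
        System.out.println(fromCodepoint(97));              // a
        System.out.println(codepointAt("a", 0));            // 97
        System.out.println(fromCodepoint(0x1F600).length()); // 2
    }
}
```

The range check comes first so that a long argument is never silently truncated by the cast to int.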
Overview:
- ArgumentError, i.e., violations of expectations on method arguments

@daumayr, you didn't say anything with respect to the ArgumentError naming question. Which system should we use as a general style guide: Java, Dart, something else? ArgumentError is not in Newspeak.

TODOs
- char when converting length-1 Strings to number. How to deal with characters that are 2x 16-bit, i.e., how to support the full Unicode code point range?
- /docs/
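On the question of characters that take 2x 16-bit: in Java strings, such a character is stored as a surrogate pair, so a "length-1" check based on String.length() rejects it. The sketch below shows one possible way to handle this by counting code points instead of chars. The class and method names are hypothetical, and IllegalArgumentException again only stands in for ArgumentError; this is not necessarily how the PR resolves the TODO.

```java
// Hypothetical sketch, not the SOMns implementation.
final class SurrogateSketch {

    // Returns the code point of a string that holds exactly one character,
    // where "one character" means one code point, not one 16-bit char.
    static int singleCodepoint(String s) {
        if (s.codePointCount(0, s.length()) != 1) {
            // Stand-in for signalling an ArgumentError
            throw new IllegalArgumentException("Expected exactly one character");
        }
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE00"; // U+1F600 written as its surrogate pair
        System.out.println(emoji.length());                              // 2
        System.out.println(emoji.codePointCount(0, emoji.length()));     // 1
        System.out.println(Integer.toHexString(emoji.charAt(0)));        // d83d
        System.out.println(Integer.toHexString(singleCodepoint(emoji))); // 1f600
    }
}
```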