Add support for Unicode identifiers in GDScript and Expression #71676
From what I can tell, this is already a good improvement, so I'm marking it as ready to review and merge. We can gather user feedback to see what hurdles people run into when using this feature in actual projects.
Thanks! 🎉
7548e04#diff-0e70e5b10ffb6bcbf01a9c92c0ec2b948011460a2d4ce20ab3443ffbbf9d2de8R588-R595 makes beta16 pretty annoying to use if you previously had a player enum value like […]. The debug error bypasses any project options and seems to ~silently block recompilation.
This is using an adapted version of UAX#31 to not rely on the ICU database (which isn't available in builds without TextServerAdvanced). It allows most characters used in diverse scripts but not everything.
This is based on the previous work by @bruvzg on #53956.
This depends highly on #71598, otherwise it might show statements in a misleading order when there are RTL words.
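To give a rough idea of what a table-light take on the UAX#31 default identifier rules can look like, here is a minimal Python sketch. It is not the code from this PR: it assumes that general categories alone are a good enough approximation (letters and letter numbers to start, plus marks, digits, and connector punctuation to continue), and `is_identifier_sketch` is a hypothetical name used only for illustration.

```python
import unicodedata

# Assumption: approximate UAX#31 ID_Start / ID_Continue with general
# categories only, ignoring the Other_ID_* exception lists.
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
CONTINUE_CATEGORIES = START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}


def is_identifier_sketch(name: str) -> bool:
    """Return True if `name` looks like an identifier under the rough rules above."""
    if not name:
        return False
    first, rest = name[0], name[1:]
    # Allow a leading underscore, as most programming languages do.
    if first != "_" and unicodedata.category(first) not in START_CATEGORIES:
        return False
    return all(unicodedata.category(ch) in CONTINUE_CATEGORIES for ch in rest)
```

With these rules, names written in Greek, Cyrillic, or CJK characters are accepted, while a name starting with a digit or containing a hyphen is rejected.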
Confusable identifiers: There are two checks for confusable identifiers using the methods of TextServer:
- `spoof_check()`, which checks for mixed characters that can be confusing (e.g. `var pοrt`, which uses Greek omicron instead of Latin `o`). This is checked mostly on declarations and only gives a warning.
- `is_confusable()` against the list of GDScript keywords, which checks for visual similarities (e.g. `аs`, which uses Cyrillic `а`). This gives an error.

Those checks only work properly if TextServerAdvanced is available. It is available by default on official builds, but may be missing on custom builds (in which case they use the fallback no-op from the basic TextServer).
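As an illustration of the kind of mixing these checks are meant to catch, here is a small Python sketch. It is only a heuristic and not how the ICU-based spoof checker in TextServerAdvanced works: it assumes the first word of a character's Unicode name is a usable stand-in for its script, and `scripts_used` and `looks_mixed_script` are hypothetical helper names.

```python
import unicodedata


def scripts_used(identifier: str) -> set:
    """Collect an approximate script name for every letter in `identifier`."""
    scripts = set()
    for ch in identifier:
        if not ch.isalpha():
            continue  # digits, underscores, and combining marks are skipped
        # Assumption: "GREEK SMALL LETTER OMICRON" -> "GREEK", and so on.
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_mixed_script(identifier: str) -> bool:
    """Flag identifiers whose letters appear to come from more than one script."""
    return len(scripts_used(identifier)) > 1
```

For example, `looks_mixed_script("p\u03bfrt")` (Greek omicron) and `looks_mixed_script("\u0430s")` (Cyrillic `а`) are both flagged, while plain `port` is not.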
Known issues:
- Although `spoof_check()` gives a warning, visually similar identifiers are treated as different names.
- […] `ç` and `ç` (the first uses the regular "ç" character, the other uses the combining sequence "c+◌̧"). Also: `Å` and `Å` (the first is `\u212B`, the second is `\u00C5`).
- […] `spoof_check()`).
- An `is_confusable()` check against the list of previously declared identifiers would be an option, but that list can become quite big over time, especially if we want to compare with engine classes, global script classes, and singleton autoloads, not to mention the inheritance tree of the current script.
- I used `𝛑` on the screenshot as an example and that creates a confusable identifier warning. I guess this is the mathematical symbol, not the Greek letter, but I'm not sure why a single character would be problematic. This comes from the TextServerAdvanced implementation, so any needed change would be made there, though this is probably accurate to the Unicode specification.
- `מִבְחָן` (I just used Google to translate `test`) also gives the confusable warning. I'm not familiar with Hebrew, so I can't tell for sure if there's an actual issue or if it's a false positive.
- […] `TextServerAdvanced::is_valid_identifier()`, but those are quite complex to apply, and also need the ICU data.

Closes godotengine/godot-proposals#916
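The look-alike pairs listed under the known issues can be reproduced outside the engine. The following Python sketch uses the standard `unicodedata` module as a stand-in for Unicode normalization data (an assumption for illustration; nothing in this PR normalizes identifiers), and `same_after_nfc` is a hypothetical helper name.

```python
import unicodedata

precomposed = "\u00e7"    # "ç" as a single code point
combining = "c\u0327"     # "c" followed by COMBINING CEDILLA
angstrom_sign = "\u212b"  # ANGSTROM SIGN
a_with_ring = "\u00c5"    # LATIN CAPITAL LETTER A WITH RING ABOVE


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two names after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Compared code point by code point, each pair is two different names...
print(precomposed == combining, angstrom_sign == a_with_ring)
# ...but each pair collapses to one name once both sides are NFC-normalized.
print(same_after_nfc(precomposed, combining), same_after_nfc(angstrom_sign, a_with_ring))
```

A raw comparison treats each pair as distinct names, which matches the behavior described above. Treating them as equal requires normalization data.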
Special thanks to @bruvzg who did most of the groundwork and helped me out with this.