Support counting Asian characters double for line-breaking purposes #21

remkop · 2017-02-06T12:13:18Z

See argparse4j.

Enable if Arrays.asList("ja", "zh", "ko").contains(Locale.getDefault().getLanguage()).

Apply to Unicode characters whose East Asian Width property is Wide (W), Fullwidth (F), or Ambiguous (A).
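
The enabling condition above could be sketched roughly as follows. The class and method names (`CjkLocaleCheck`, `isCjkLocale`) are made up for illustration and are not taken from argparse4j or from this project; the locale is passed in as a parameter so the check can be exercised without changing the JVM default.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: decides whether wide-character counting should be on by default.
final class CjkLocaleCheck {
    // ISO 639 language codes for Japanese, Chinese and Korean, as listed in the comment above.
    private static final List<String> CJK_LANGUAGES = Arrays.asList("ja", "zh", "ko");

    // True when the given locale's language is one of the CJK languages.
    static boolean isCjkLocale(Locale locale) {
        return CJK_LANGUAGES.contains(locale.getLanguage());
    }

    // Convenience overload that checks the JVM default locale.
    static boolean isCjkLocale() {
        return isCjkLocale(Locale.getDefault());
    }
}
```

For example, `CjkLocaleCheck.isCjkLocale(Locale.JAPAN)` is true and `CjkLocaleCheck.isCjkLocale(Locale.US)` is false.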

remkop · 2017-02-14T08:19:49Z

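Unicode Standard Annex #11, East Asian Width: http://unicode.org/reports/tr11/
Japanese Wikipedia article on East Asian character width: https://ja.wikipedia.org/wiki/%E6%9D%B1%E3%82%A2%E3%82%B8%E3%82%A2%E3%81%AE%E6%96%87%E5%AD%97%E5%B9%85
Unicode Character Database data file: http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt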
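
One possible way to use the data file is to parse it into code point ranges. The sketch below assumes the usual UCD line layout (a code point or `first..last` range, a semicolon, the property value, and an optional `#` comment). The names `EastAsianWidthParser`, `Range`, `parse` and `isWide` are hypothetical, and the linear lookup is only meant to show the idea, not to be efficient.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for lines in the EastAsianWidth.txt layout.
final class EastAsianWidthParser {
    // One entry: an inclusive code point range and its East Asian Width value (e.g. "W", "F", "A", "Na").
    static final class Range {
        final int first;
        final int last;
        final String width;

        Range(int first, int last, String width) {
            this.first = first;
            this.last = last;
            this.width = width;
        }
    }

    // Parses data lines, skipping blank lines and comment-only lines.
    static List<Range> parse(Iterable<String> lines) {
        List<Range> ranges = new ArrayList<>();
        for (String raw : lines) {
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] fields = line.split(";");
            if (fields.length < 2) {
                continue; // not a data line in the assumed layout
            }
            String[] bounds = fields[0].trim().split("\\.\\.");
            int first = Integer.parseInt(bounds[0], 16);
            int last = bounds.length > 1 ? Integer.parseInt(bounds[1], 16) : first;
            ranges.add(new Range(first, last, fields[1].trim()));
        }
        return ranges;
    }

    // Treats Wide, Fullwidth and Ambiguous as double width, as proposed in this issue.
    static boolean isWide(List<Range> ranges, int codePoint) {
        for (Range r : ranges) {
            if (codePoint >= r.first && codePoint <= r.last) {
                return r.width.equals("W") || r.width.equals("F") || r.width.equals("A");
            }
        }
        return false;
    }
}
```

A real implementation would probably precompute a compact table instead of shipping and scanning the full file at runtime.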

remkop · 2019-03-19T06:41:06Z

From lanterna TerminalTextUtils:

    /**
     * Given a character, is this character considered to be a CJK character?
     * Shamelessly stolen from
     * <a href="http://stackoverflow.com/questions/1499804/how-can-i-detect-japanese-text-in-a-java-string">StackOverflow</a>
     * where it was contributed by user Rakesh N
     * @param c Character to test
     * @return {@code true} if the character is a CJK character
     *
     */
    public static boolean isCharCJK(final char c) {
        Character.UnicodeBlock unicodeBlock = Character.UnicodeBlock.of(c);
        return (unicodeBlock == Character.UnicodeBlock.HIRAGANA)
                || (unicodeBlock == Character.UnicodeBlock.KATAKANA)
                || (unicodeBlock == Character.UnicodeBlock.KATAKANA_PHONETIC_EXTENSIONS)
                || (unicodeBlock == Character.UnicodeBlock.HANGUL_COMPATIBILITY_JAMO)
                || (unicodeBlock == Character.UnicodeBlock.HANGUL_JAMO)
                || (unicodeBlock == Character.UnicodeBlock.HANGUL_SYLLABLES)
                || (unicodeBlock == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS)
                || (unicodeBlock == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A)
                || (unicodeBlock == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B)
                || (unicodeBlock == Character.UnicodeBlock.CJK_COMPATIBILITY_FORMS)
                || (unicodeBlock == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS)
                || (unicodeBlock == Character.UnicodeBlock.CJK_RADICALS_SUPPLEMENT)
                || (unicodeBlock == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION)
                || (unicodeBlock == Character.UnicodeBlock.ENCLOSED_CJK_LETTERS_AND_MONTHS)
                || (unicodeBlock == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS && c < 0xFF61);    //The magic number here is the separating index between full-width and half-width
    }

    /**
     * Checks if a character is expected to be taking up two columns if printed to a terminal. This will generally be
     * {@code true} for CJK (Chinese, Japanese and Korean) characters.
     * @param c Character to test if it's double-width when printed to a terminal
     * @return {@code true} if this character is expected to be taking up two columns when printed to the terminal,
     * otherwise {@code false}
     */
    public static boolean isCharDoubleWidth(final char c) {
        return isCharCJK(c);
    }
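
Note that the helpers above take a `char`, so a supplementary code point such as those in CJK Unified Ideographs Extension B reaches `isCharCJK` as two separate surrogates, and `Character.UnicodeBlock.of(char)` reports the surrogate block for each half. A code point based variant might look like the sketch below. It mirrors the block list above but is not lanterna code; `CjkWidthSketch`, `isWide` and `displayWidth` are names invented here.

```java
import java.lang.Character.UnicodeBlock;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical code point based variant of the block check quoted above.
final class CjkWidthSketch {
    // Same blocks as the quoted isCharCJK, except the half/full-width block, which is handled separately.
    private static final Set<UnicodeBlock> WIDE_BLOCKS = new HashSet<>(Arrays.asList(
            UnicodeBlock.HIRAGANA,
            UnicodeBlock.KATAKANA,
            UnicodeBlock.KATAKANA_PHONETIC_EXTENSIONS,
            UnicodeBlock.HANGUL_COMPATIBILITY_JAMO,
            UnicodeBlock.HANGUL_JAMO,
            UnicodeBlock.HANGUL_SYLLABLES,
            UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS,
            UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A,
            UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B,
            UnicodeBlock.CJK_COMPATIBILITY_FORMS,
            UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS,
            UnicodeBlock.CJK_RADICALS_SUPPLEMENT,
            UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION,
            UnicodeBlock.ENCLOSED_CJK_LETTERS_AND_MONTHS));

    // True if the code point is expected to occupy two terminal columns.
    static boolean isWide(int codePoint) {
        UnicodeBlock block = UnicodeBlock.of(codePoint);
        if (block == UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS) {
            // Same cut-off as the quoted code: full-width forms sit below U+FF61.
            return codePoint < 0xFF61;
        }
        return WIDE_BLOCKS.contains(block);
    }

    // Sums the expected column count of a string, iterating by code point rather than by char.
    static int displayWidth(String text) {
        int width = 0;
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            width += isWide(codePoint) ? 2 : 1;
            i += Character.charCount(codePoint);
        }
        return width;
    }
}
```

With this sketch, `displayWidth("abc")` is 3, while a string of three CJK ideographs counts as 6 columns.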

remkop · 2019-04-13T01:07:41Z

Should add a setting to switch this off. Possible names:

doubleCountCJKCharacters
detectWideCJKCharacters
adjustLineBreaksForWideCJKCharacters
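
To illustrate what such a switch might control, here is a rough sketch of a hard-wrapping helper with an on/off flag. It borrows one of the names above for the flag purely as an example; the `WrapSketch` class, its methods, and the simplified script-based width test are assumptions made for this sketch, not the project's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical line wrapper showing how an on/off setting could affect line breaks.
final class WrapSketch {
    private boolean adjustLineBreaksForWideCJKCharacters = true;

    // Fluent setter for the (hypothetical) setting.
    WrapSketch adjustLineBreaksForWideCJKCharacters(boolean adjust) {
        this.adjustLineBreaksForWideCJKCharacters = adjust;
        return this;
    }

    // Simplified stand-in for a proper width check: half-width forms share these scripts
    // and would be over-counted, so a real implementation needs a finer test.
    private int columnsFor(int codePoint) {
        if (!adjustLineBreaksForWideCJKCharacters) {
            return 1;
        }
        Character.UnicodeScript script = Character.UnicodeScript.of(codePoint);
        boolean wide = script == Character.UnicodeScript.HAN
                || script == Character.UnicodeScript.HIRAGANA
                || script == Character.UnicodeScript.KATAKANA
                || script == Character.UnicodeScript.HANGUL;
        return wide ? 2 : 1;
    }

    // Breaks text into lines of at most maxColumns columns (hard wrap, no word boundaries).
    List<String> wrap(String text, int maxColumns) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        int used = 0;
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            int columns = columnsFor(codePoint);
            if (used + columns > maxColumns && line.length() > 0) {
                lines.add(line.toString());
                line.setLength(0);
                used = 0;
            }
            line.appendCodePoint(codePoint);
            used += columns;
            i += Character.charCount(codePoint);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }
}
```

With the flag on, seven wide characters wrapped at 6 columns yield three lines of at most three characters each; with it off, the same text yields two lines, the first holding six characters.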

remkop modified the milestone: 0.2.0 usage online help Feb 6, 2017

remkop added the type: enhancement ✨ label Feb 6, 2017

remkop modified the milestones: 0.3.0 usage online help, 0.5.0 advanced option parsing Mar 18, 2017

remkop modified the milestones: 0.5.0 advanced option parsing, 0.6.0 polishing Mar 31, 2017

remkop modified the milestones: 0.5.0 advanced option parsing, 0.6.0 polishing Apr 12, 2017

remkop mentioned this issue Feb 19, 2019

Integration/collaboration with Lanterna #633

Open

remkop modified the milestones: backlog, 4.0 Mar 19, 2019

remkop modified the milestones: 4.0, 4.0-alpha-2 Apr 12, 2019

remkop closed this as completed in e7cbd3d Apr 12, 2019

remkop reopened this Apr 13, 2019

remkop closed this as completed in e55f62f Apr 13, 2019
