Add Keyman language highlighting #609

Merged: 6 commits, Jun 28, 2015
Changes from 3 commits
4 changes: 4 additions & 0 deletions components.js
@@ -185,6 +185,10 @@ var components = {
"title": "Julia",
"owner": "cdagnino"
},
"keyman": {
"title": "Keyman",
"owner": "mcdurdin"
},
"latex": {
"title": "LaTeX",
"owner": "japborst"
14 changes: 14 additions & 0 deletions components/prism-keyman.js
@@ -0,0 +1,14 @@
Prism.languages.keyman = {
'comment': /\b[cC]\s.*?(\r?\n|$)/,
Contributor

You can simplify the regexp by using the i flag.
Also, I think you can remove the last part ?(\r?\n|$) since the dot . does not match line feeds. (Unless you intentionally wanted to consume the line feed?)
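
A minimal sketch of this suggestion, assuming plain JavaScript RegExp semantics. The variable names and the sample text are illustrative and not part of the PR. It compares the pattern from the diff with a version that uses the i flag, a greedy dot, and no trailing group:

```javascript
// Pattern as submitted: lazy dot plus an explicit end-of-line group.
const originalComment = /\b[cC]\s.*?(\r?\n|$)/;
// Suggested form (hypothetical): `i` flag, greedy dot, no trailing group.
const simplifiedComment = /\bc\s.*/i;

// Two lines of sample Keyman source; only the first ends with a comment.
const sampleText = "U+0041 d65 x41 c these are all the letter A\nstore(x) 'y'";

const originalMatch = sampleText.match(originalComment)[0];
const simplifiedMatch = sampleText.match(simplifiedComment)[0];

// The original match includes the trailing line feed; the simplified one stops before it.
console.log(JSON.stringify(originalMatch));
console.log(JSON.stringify(simplifiedMatch));
```

The only difference between the two matches is the consumed line feed, which is the trade-off raised in the question above.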

'function': /\[\s*((CTRL|SHIFT|ALT|LCTRL|RCTRL|LALT|RALT)\s+)*(([T|K|U]_[a-z0-9_]+)|(".+")|('.+'))\s*\]/i, // virtual key
Contributor

I think you meant [TKU] instead of [T|K|U], even though I can't find any usage of T or U in the documentation...

Furthermore, if I'm reading this properly, you don't need the three pairs of inner parentheses in the last part: ([T|K|U]_[a-z0-9_]+|".+"|'.+').

Also, it looks like you forgot the CAPS and NCAPS shift codes.

Finally, the documentation mentions key codes like K_KANJI?15, which do not appear to be handled here (because of the ? character).
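
One way the points above could be folded into the pattern, as a hedged sketch rather than the change that was actually merged. The name revisedVirtualKey, the decision to allow ? anywhere in the key name, and the non-greedy quoted alternatives are all assumptions made for illustration:

```javascript
// Pattern as submitted in the diff.
const originalVirtualKey = /\[\s*((CTRL|SHIFT|ALT|LCTRL|RCTRL|LALT|RALT)\s+)*(([T|K|U]_[a-z0-9_]+)|(".+")|('.+'))\s*\]/i;

// Hypothetical revision: [TKU] instead of [T|K|U], no inner groups,
// CAPS and NCAPS added, and `?` allowed in key names such as K_KANJI?15.
const revisedVirtualKey = /\[\s*((CTRL|SHIFT|ALT|LCTRL|RCTRL|LALT|RALT|CAPS|NCAPS)\s+)*([TKU]_[a-z0-9_?]+|".+?"|'.+?')\s*\]/i;

const virtualKeySamples = ["[K_A]", "[SHIFT K_A]", "[NCAPS CTRL K_KANJI?15]", "[T_new]", "[|_a]"];

for (const sample of virtualKeySamples) {
  console.log(sample, originalVirtualKey.test(sample), revisedVirtualKey.test(sample));
}
```

The last sample shows why [T|K|U] is not equivalent to [TKU]: inside a character class the | is a literal, so the submitted pattern also accepts [|_a].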

Contributor Author

T and U are referenced in http://help.keyman.com/developer/9.0/guides/touch-layouts/creating-a-touch-keyboard-layout-for-amharic-the-nitty-gritty.php.

I will also tweak the quoted strings to use a non-greedy match.

I'm aware that ordering virtual keys ahead of strings will highlight the string "[K_A]" incorrectly, but in practice this would never be used, so it is not worth supporting.
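
To illustrate why the non-greedy match matters, here is a small sketch using only the quoted-string part of the virtual key pattern. The input line is contrived for the demonstration and the variable names are hypothetical:

```javascript
// Quoted alternatives as submitted (greedy) and with the non-greedy tweak.
const greedyQuotedKey = /\[\s*(".+"|'.+')\s*\]/;
const lazyQuotedKey = /\[\s*(".+?"|'.+?')\s*\]/;

// Contrived line containing two bracketed quoted keys.
const contrivedRule = '+ ["a"] > "x" c see ["b"]';

const greedyMatch = contrivedRule.match(greedyQuotedKey)[0];
const lazyMatch = contrivedRule.match(lazyQuotedKey)[0];

// The greedy form runs from the first bracket to the last one on the line.
console.log(greedyMatch);
// The non-greedy form stops at the first closing quote and bracket.
console.log(lazyMatch);
```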

'string': /("|')((?!\1).)*\1/,
'keyword': /\b(any|beep|call|context|deadkey|dk|if|index|notany|nul|outs|return|reset|save|set|store|use)\b/i, // rule keywords
'atrule': /\b(ansi|begin|unicode|group|(using keys)|match|nomatch)\b/i, // structural keywords
Contributor

The parentheses are not needed around using keys.
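
A quick check of this point, as a sketch with illustrative names. The helper allMatches is hypothetical and simply collects every match by adding the g flag:

```javascript
// Pattern as submitted, and the same alternation without the inner group.
const atruleWithGroup = /\b(ansi|begin|unicode|group|(using keys)|match|nomatch)\b/i;
const atruleWithoutGroup = /\b(ansi|begin|unicode|group|using keys|match|nomatch)\b/i;

// Hypothetical helper: return the text of every match of `re` in `text`.
function allMatches(re, text) {
  const globalRe = new RegExp(re.source, re.flags + "g");
  return [...text.matchAll(globalRe)].map((m) => m[0]);
}

const groupLine = "group(main) using keys";
const matchesWithGroup = allMatches(atruleWithGroup, groupLine);
const matchesWithoutGroup = allMatches(atruleWithoutGroup, groupLine);

console.log(matchesWithGroup, matchesWithoutGroup);
```

Both patterns find the same tokens; the inner group only adds an unused capture.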

'bold': [ // header statements
/\&(baselayout|bitmap|capsononly|capsalwaysoff|shiftfreescaps|copyright|ethnologuecode|hotkey|includecodes|keyboardversion|kmw_embedcss|kmw_embedjs|kmw_helpfile|kmw_helptext|kmw_rtl|language|layer|layoutfile|message|mnemoniclayout|name|oldcharposmatching|platform|targets|version|visualkeyboard|windowslanguages)\b/i,
/\b(bitmap|bitmaps|(caps on only)|(caps always off)|(shift frees caps)|copyright|hotkey|language|layout|message|name|version)\b/i
Contributor

The parentheses are not needed around caps on only, caps always off and shift frees caps

],
'number': /\b(([uU]\+[\dA-Fa-f]+)|([dD]\d+)|([xX][\dA-Fa-f]+)|([0-7]+))\b/, // U+####, x###, d### characters and numbers
Contributor

You should use the i flag to simplify the regexp.
Also, the inner pairs of parentheses around each alternative are not needed.
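
A sketch of what the simplified pattern might look like, assuming the i flag is acceptable for all four alternatives. The name simplifiedNumber and the sample tokens are illustrative:

```javascript
// Pattern as submitted.
const originalNumber = /\b(([uU]\+[\dA-Fa-f]+)|([dD]\d+)|([xX][\dA-Fa-f]+)|([0-7]+))\b/;
// Hypothetical simplification: `i` flag, no inner groups.
const simplifiedNumber = /\b(u\+[\da-f]+|d\d+|x[\da-f]+|[0-7]+)\b/i;

const numberSamples = ["U+0041", "u+0041", "d65", "D65", "x41", "X045E", "17"];

// For each sample, record the text matched by both patterns.
const numberResults = numberSamples.map((token) => ({
  token,
  original: token.match(originalNumber)[0],
  simplified: token.match(simplifiedNumber)[0],
}));

console.log(numberResults);
```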

'operator': /[+>\\,\(\)]/,
Contributor

You don't need to escape ( and ) inside a character class.
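
As a small sanity check of this remark (names are illustrative), the escaped and unescaped character classes accept the same characters:

```javascript
// Pattern as submitted, and the same class without escaping the parentheses.
const originalOperator = /[+>\\,\(\)]/;
const simplifiedOperator = /[+>\\,()]/;

// Probe every operator character plus two characters that should not match.
const operatorProbes = ["+", ">", "\\", ",", "(", ")", "a", "]"];
const operatorAgreement = operatorProbes.map(
  (ch) => originalOperator.test(ch) === simplifiedOperator.test(ch)
);

console.log(operatorAgreement.every(Boolean));
```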

'tag': /\$(keyman|kmfl|silkey|keymanweb|keymanonly):/i // prefixes
};
101 changes: 101 additions & 0 deletions examples/prism-keyman.html
@@ -0,0 +1,101 @@
<h1>Keyman</h1>
<p>To use this language, use the class "language-keyman".</p>

<h2>Comments</h2>
<pre><code>c This is a comment</code></pre>

<h2>Strings, numbers and characters</h2>
<pre><code>"'this' is a string"
'and so is "this"'
U+0041 d65 x41 c these are all the letter A
</code></pre>

Contributor

You should add small examples for virtual keys and prefixes, since there don't seem to be any in the large example below.
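
A sketch of the kind of small samples meant here, checked against the 'function' and 'tag' patterns from the diff. The three Keyman rule lines are hypothetical examples written for this illustration, not taken from the PR:

```javascript
// Patterns copied from components/prism-keyman.js as submitted.
const virtualKeyPattern = /\[\s*((CTRL|SHIFT|ALT|LCTRL|RCTRL|LALT|RALT)\s+)*(([T|K|U]_[a-z0-9_]+)|(".+")|('.+'))\s*\]/i;
const prefixPattern = /\$(keyman|kmfl|silkey|keymanweb|keymanonly):/i;

// Hypothetical sample rules that exercise virtual keys and a prefix.
const sampleRules = [
  "+ [SHIFT K_A] > 'A'",
  "+ [RALT CTRL K_ENTER] > beep",
  "$keymanweb: + [K_B] > 'b'",
];

// Extract the virtual key from each rule, and the prefix from the last one.
const foundKeys = sampleRules.map((rule) => rule.match(virtualKeyPattern)[0]);
const foundPrefix = sampleRules[2].match(prefixPattern)[0];

console.log(foundKeys, foundPrefix);
```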

<h2>Example Code</h2>
<pre><code>c =====================Begin Identity Section===================================================
c
c Mnemonic input method for Amharic script on US-QWERTY
c keyboards for Keyman version 7.1, compliant with Unicode 4.1 and later.
c

store(&amp;VERSION) '9.0'
store(&amp;Name) "Amharic"
c store(&amp;MnemonicLayout) "1"
store(&amp;CapsAlwaysOff) "1"
store(&amp;Copyright) "Creative Commons Attribution 3.0"
store(&amp;Message) "This is an Amharic language mnemonic input method for Ethiopic script that requires Unicode 4.1 support."
store(&amp;WINDOWSLANGUAGES) 'x045E x045E'
store(&amp;LANGUAGE) 'x045E'
store(&amp;EthnologueCode) "amh"
store(&amp;VISUALKEYBOARD) 'gff-amh-7.kvk'
store(&amp;KMW_EMBEDCSS) 'gff-amh-7.css'
HOTKEY "^%A"
c
c =====================End Identity Section=====================================================

c =====================Begin Data Section=======================================================

c ---------------------Maps for Numbers---------------------------------------------------------
store(ArabOnes) '23456789'
store(ones) '፪፫፬፭፮፯፰፱'
store(tens) '፳፴፵፶፷፸፹፺'
store(arabNumbers) '123456789'
store(ethNumbers) '፩፪፫፬፭፮፯፰፱፲፳፴፵፶፷፸፹፺፻፼'
store(arabNumbersWithZero) '0123456789'
store(ColonOrComma) ':,'
store(ethWordspaceOrComma) '፡፣'
c ---------------------End Numbers--------------------------------------------------------------

c =====================End Data Section=========================================================

c =====================Begin Functional Section=================================================
c
store(&amp;LAYOUTFILE) 'gff-amh-7_layout.js'
store(&amp;BITMAP) 'amharic.bmp'
store(&amp;TARGETS) 'any windows'
begin Unicode > use(main)
group(main) using keys

c ---------------------Input of Numbers---------------------------------------------------------

c Special Rule for Arabic Numerals
c
c The following attempts to auto-correct the use of Ethiopic wordspace and
c Ethiopic comma within an Arabic numeral context. Ethiopic wordspace gets
c used erroneously in time formats and Ethiopic commas as an order of thousands
c delimiter. The correction context is not known until numerals appear on _both_
c sides of the punctuation.
c
any(arabNumbersWithZero) any(ethWordspaceOrComma) + any(arabNumbers) > index(arabNumbersWithZero,1) index(ColonOrComma,2) index(arabNumbers,3)

c Ethiopic Numerals

"'" + '1' > '፩'
"'" + any(ArabOnes) > index(ones,2)

c special cases for multiples of one
'፩' + '0' > '፲'
'፲' + '0' > '፻'
'፻' + '0' > '፲፻'
'፲፻' + '0' > '፼'
'፼' + '0' > '፲፼'
'፲፼' + '0' > '፻፼'
'፻፼' + '0' > '፲፻፼'
'፲፻፼' + '0' > '፼፼'
'፼፼' + '0' > context beep c do not go any higher, we could beep here

c up to the order of 100 million
any(ones) + '0' > index(tens,1)
any(tens) + '0' > index(ones,1) '፻' c Hundreds
any(ones) '፻' + '0' > index(tens,1) '፻' c Thousands
any(tens) '፻' + '0' > index(ones,1) '፼' c Ten Thousands
any(ones) '፼' + '0' > index(tens,1) '፼' c Hundred Thousands
any(tens) '፼' + '0' > index(ones,1) '፻፼' c Millions
any(ones) '፻፼' + '0' > index(tens,1) '፻፼' c Ten Millions
any(tens) '፻፼' + '0' > index(ones,1) '፼፼' c Hundred Millions

c enhance this later, look for something that can copy a match over
any(ethNumbers) + any(arabNumbers) > index(ethNumbers,1) index(ethNumbers,2)
c ---------------------End Input of Numbers-----------------------------------------------------

c =====================End Functional Section===================================================
</code></pre>