Commit

Updated Thai with new characters
dmort27 committed Jun 6, 2022
1 parent d0ada5a commit a68d73b
Showing 2 changed files with 37 additions and 0 deletions.
3 changes: 3 additions & 0 deletions epitran/data/map/tha-Thai.csv
@@ -74,3 +74,6 @@ Orth,Phon
ัวะ,ua̯
ัว,uːa̯
ʔ,ʔ
ๅ,ː
ฤ,rʉ
ำ,am
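
The three rows added to the map file each pair an orthographic string (`Orth`) with a phonemic value (`Phon`). As a rough illustration of that two-column structure only, the sketch below loads the new rows into a dictionary with Python's standard `csv` module. It does not use or reproduce Epitran's own loader; `NEW_ROWS` and `load_map` are hypothetical names, and the characters are written as escapes so they survive copy and paste.

```python
import csv
import io

# The three rows added in this commit, under the file's Orth,Phon header.
# U+0E45 LAKKHANGYAO -> U+02D0 (length mark), U+0E24 RU -> "r" + U+0289,
# U+0E33 SARA AM -> "am".
NEW_ROWS = "Orth,Phon\n\u0e45,\u02d0\n\u0e24,r\u0289\n\u0e33,am\n"


def load_map(csv_text: str) -> dict[str, str]:
    """Hypothetical helper: read Orth,Phon rows into a plain dict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Orth"]: row["Phon"] for row in reader}


mapping = load_map(NEW_ROWS)
```
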
34 changes: 34 additions & 0 deletions epitran/data/pre/tha-Thai.txt
@@ -10,3 +10,37 @@

% Thanthakhat (cp. Virama)
.์ -> 0 / _

% Delete numerals
๐ -> 0 / _ % THAI DIGIT ZERO
๑ -> 0 / _ % THAI DIGIT ONE
๒ -> 0 / _ % THAI DIGIT TWO
๓ -> 0 / _ % THAI DIGIT THREE
๔ -> 0 / _ % THAI DIGIT FOUR
๕ -> 0 / _ % THAI DIGIT FIVE
๖ -> 0 / _ % THAI DIGIT SIX
๗ -> 0 / _ % THAI DIGIT SEVEN
๘ -> 0 / _ % THAI DIGIT EIGHT
๙ -> 0 / _ % THAI DIGIT NINE

% Delete phinthu
ฺ -> 0 / _ % THAI CHARACTER PHINTHU

% Delete tones
่ -> 0 / _ % THAI CHARACTER MAI EK
๋ -> 0 / _ % THAI CHARACTER MAI CHATTAWA
๊ -> 0 / _ % THAI CHARACTER MAI TRI

% Delete archaic and exceptional
ํ -> 0 / _ % THAI CHARACTER NIKHAHIT
์ -> 0 / _ % THAI CHARACTER THANTHAKHAT


% Delete reduplication mark
ๆ -> 0 / _ % THAI CHARACTER MAIYAMOK

% Delete abbreviation marker
ฯ -> 0 / _ % THAI CHARACTER PAIYANNOI

% Delete short mark (should be handled differently)
็ -> 0 / _ % THAI CHARACTER MAITAIKHU
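
Every rule added to the preprocessor file has the form `X -> 0 / _`, which deletes the character `X` wherever it occurs. The net effect on a string can be approximated outside Epitran with a translation table. The following is a minimal sketch under that assumption, not Epitran's rule engine: `DELETED_NAMES`, `DELETE_TABLE`, and `strip_deleted` are hypothetical names, and the characters are looked up by the Unicode names given in the rule comments above.

```python
import unicodedata

# Unicode names taken from the comments on the added rules.
DIGITS = ["ZERO", "ONE", "TWO", "THREE", "FOUR",
          "FIVE", "SIX", "SEVEN", "EIGHT", "NINE"]
MARKS = ["PHINTHU", "MAI EK", "MAI CHATTAWA", "MAI TRI", "NIKHAHIT",
         "THANTHAKHAT", "MAIYAMOK", "PAIYANNOI", "MAITAIKHU"]

DELETED_NAMES = ([f"THAI DIGIT {d}" for d in DIGITS]
                 + [f"THAI CHARACTER {m}" for m in MARKS])

# Mapping a code point to None makes str.translate drop that character.
DELETE_TABLE = {ord(unicodedata.lookup(name)): None for name in DELETED_NAMES}


def strip_deleted(text: str) -> str:
    """Hypothetical helper: remove every character the added rules delete."""
    return text.translate(DELETE_TABLE)


# KO KAI + MAI EK + THAI DIGIT ONE: only the consonant survives.
sample = strip_deleted("\u0e01\u0e48\u0e51")
```

This sketch covers only the rules added in this commit. It does not model the existing `.์ -> 0 / _` rule, which also removes the character preceding thanthakhat, and it leaves MAI THO (U+0E49) untouched because no rule for it appears in this hunk.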
