Manually Tagged Indonesian Corpus
This corpus is distributed as a tab-separated file (.tsv). Each line contains one token and its part-of-speech tag, separated by a single tab character (\t). Sentences are separated by one empty line.
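The format described above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the function name `read_tagged_sentences`, the sample sentences, and the tags shown in the sample are illustrative assumptions rather than data taken from the corpus. Lines that contain no tab are skipped as a precaution, which is also an assumption about how stray lines should be handled.

```python
from typing import Iterable, Iterator, List, Tuple

# A sentence is a list of (token, tag) pairs.
Sentence = List[Tuple[str, str]]


def read_tagged_sentences(lines: Iterable[str]) -> Iterator[Sentence]:
    """Yield sentences from lines in "token<TAB>tag" format.

    Hypothetical helper: a blank line ends the current sentence.
    """
    sentence: Sentence = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # Empty line: sentence boundary.
            if sentence:
                yield sentence
                sentence = []
            continue
        if "\t" not in line:
            # Assumption: ignore any line that is not a token/tag pair.
            continue
        token, tag = line.split("\t", 1)
        sentence.append((token, tag.strip()))
    # Emit the last sentence if the input does not end with a blank line.
    if sentence:
        yield sentence


# Illustrative sample only; tokens and tags are made up for this example.
sample = "Saya\tPRP\nmakan\tVB\nnasi\tNN\n.\tZ\n\nDia\tPRP\ntidur\tVB\n.\tZ\n"
sentences = list(read_tagged_sentences(sample.splitlines()))
print(len(sentences))
print(sentences[0])
```

To read the corpus itself, the same function can be given an open file handle, for example `open(path, encoding="utf-8")`, where `path` points to the local copy of the .tsv file.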
- Ruli Manurung
- Arawinda Dinakaramani
- Fam Rashel
- Andry Luthfi
For publications and further details about this work, please visit http://bahasa.cs.ui.ac.id/postag/corpus
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/.
This work was carried out as part of a research project at the IR-NLP Lab. Because there is an initiative to collect and document all work done at the IR-NLP Lab, please refer to the IR-NLP Lab's repository for official updates and future versions of this work. This repository will remain available as a personal repository.