DocumentCode :
172564
Title :
Designing an Indonesian part of speech tagset and manually tagged Indonesian corpus
Author :
Dinakaramani, Arawinda ; Rashel, Fam ; Luthfi, Andry ; Manurung, Ruli
Author_Institution :
Fac. of Comput. Sci., Univ. Indonesia, Depok, Indonesia
fYear :
2014
fDate :
20-22 Oct. 2014
Firstpage :
66
Lastpage :
69
Abstract :
We describe our work on designing a linguistically principled part of speech (POS) tagset for the Indonesian language. The process involves a detailed study and analysis of existing tagsets and the manual tagging of an Indonesian corpus. The results of this work are an Indonesian POS tagset consisting of 23 tags and an Indonesian corpus of over 250,000 lexical tokens that have been manually tagged using this tagset.
Keywords :
natural language processing; Indonesian POS tagset; Indonesian part-of-speech tagset; POS tagset; linguistically principled part-of-speech; manually tagged Indonesian corpus; Conferences; Context; Manuals; Pragmatics; Speech; Syntactics; Tagging; Indonesian; POS; Part of speech tagset;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Asian Language Processing (IALP), 2014 International Conference on
Conference_Location :
Kuching
Type :
conf
DOI :
10.1109/IALP.2014.6973519
Filename :
6973519