DocumentCode
2379143
Title
Improving Vietnamese POS tagging by integrating a rich feature set and Support Vector Machines
Author
Nghiem, Minh ; Dinh, Dien ; Nguyen, Mai
Author_Institution
Fac. of Inf. Technol., Univ. of Sci., Ho Chi Minh City
fYear
2008
fDate
13-17 July 2008
Firstpage
128
Lastpage
133
Abstract
Part-of-speech (POS) tagging is fundamental in natural language processing. Many methods have been applied to English, and the task is considered well solved for that language. However, there are few studies of this problem for Vietnamese. In this paper, we evaluate common features for English POS tagging and then propose some language-specific features for Vietnamese POS tagging. Experimental results on the Vietnamese Lexicography Center's research group's corpus show that our POS tagger, trained with an SVM on this feature set, outperforms other Vietnamese POS taggers.
Keywords
natural language processing; speech processing; support vector machines; Vietnamese language; language specific features; part of speech tagging; rich feature set; Feature extraction; Hidden Markov models; Information technology; Machine learning; Natural languages; Support vector machine classification; Tagging
fLanguage
English
Publisher
IEEE
Conference_Titel
Research, Innovation and Vision for the Future, 2008. RIVF 2008. IEEE International Conference on
Conference_Location
Ho Chi Minh City
Print_ISBN
978-1-4244-2379-8
Electronic_ISBN
978-1-4244-2380-4
Type
conf
DOI
10.1109/RIVF.2008.4586344
Filename
4586344