Title :
Vietnamese Noun Phrase Chunking Based on Conditional Random Fields
Author :
Thao, Nguyen Thi Huong ; Thai, Nguyen Phuong ; Nguyen Le Minh ; Thuy, Ha Quang
Author_Institution :
Coll. of Technology, Vietnam Nat. Univ., Hanoi, Vietnam
Abstract :
Noun phrase chunking is an important and useful task in many natural language processing applications. It has been well studied for English, but it remains an open problem for Vietnamese. This paper presents a Vietnamese noun phrase chunking approach based on conditional random field (CRF) models. We also describe a method for building a Vietnamese corpus from a set of hand-annotated sentences. For evaluation, we perform several experiments using different feature settings. Results on our corpus show high performance, with an average recall of 82.72% and an average precision of 82.62%.
Keywords :
natural language processing; random processes; text analysis; Vietnamese corpus; Vietnamese noun phrase chunking; conditional random field model; hand annotated sentences; Data mining; Educational institutions; Knowledge engineering; Machine learning; Natural languages; Performance evaluation; Systems engineering and theory; Terminology; Writing
Conference_Title :
Knowledge and Systems Engineering, 2009. KSE '09. International Conference on
Conference_Location :
Hanoi
Print_ISBN :
978-1-4244-5086-2
Electronic_ISBN :
978-0-7695-3846-4
DOI :
10.1109/KSE.2009.43