DocumentCode
605948
Title
Optimization for Vietnamese text classification problem by reducing features set
Author
Ha Nguyen Thi Thu ; Quynh Nguyen Huu ; Khanh Nguyen Thi Hong ; Hung Le Manh
Author_Institution
Dept. of Comput. Sci., Vietnam Electr. Power Univ., Hanoi, Vietnam
fYear
2012
fDate
23-25 Oct. 2012
Firstpage
209
Lastpage
212
Abstract
Vietnamese is a monosyllabic language, so word segmentation is relatively complex: splitting words on whitespace is inaccurate, while existing Vietnamese segmentation tools are not very effective. In this paper, we propose a new method that uses only topic words in its calculations, both to increase the accuracy of the Vietnamese text classification system and to optimize the calculation process. The experimental results show that our method is more effective than previously proposed approaches, with higher accuracy and reduced computational complexity.
Keywords
classification; computational complexity; natural language processing; optimisation; text analysis; word processing; Vietnamese segmentation tools; Vietnamese text classification problem; Vietnamese text classification system accuracy calculation; calculation process optimization; computational complexity reduction; feature set reduction; split word; topic word; whitespaces; word segmentation; Vietnamese text classification; syllable language
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Science and Service Science and Data Mining (ISSDM), 2012 6th International Conference on New Trends in
Conference_Location
Taipei
Print_ISBN
978-1-4673-0876-2
Type
conf
Filename
6528628