DocumentCode :
3699165
Title :
A novel feature selection based on Tibetan grammar for Tibetan text classification
Author :
Tao Jiang;Hongzhi Yu
Author_Institution :
State Key Laboratory of National Languages Information Technology, Northwest University for Nationalities, Lanzhou, Gansu, P.R. China
fYear :
2015
Firstpage :
445
Lastpage :
448
Abstract :
Feature selection aims to make text classifiers more efficient and accurate. In this paper, we propose a novel feature selection method based on Tibetan grammar for Tibetan text classification. Tibetan expresses grammatical meaning through function words and word order, and function words account for a large proportion of the text. By analyzing Tibetan grammar and the distribution of parts of speech, we propose a feature selection method based on Tibetan notional words. The method tags the parts of speech of Tibetan text and then uses notional words as text features, combined with the information gain (IG) method, to perform feature selection. Experimental results show that this method significantly improves classification efficiency and accuracy compared with traditional feature selection methods.
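The pipeline described in the abstract (tag parts of speech, keep notional words, rank them by information gain) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tag set NOTIONAL_TAGS, the function names, and the (word, tag) input format are all hypothetical, and the part-of-speech tagger itself is assumed to exist upstream.

```python
import math
from collections import Counter

# Assumption: these tag names are placeholders, not the paper's tag set.
NOTIONAL_TAGS = {"noun", "verb", "adj"}


def notional_words(tagged_doc):
    """Keep only tokens whose part-of-speech tag marks a notional word."""
    return [word for word, tag in tagged_doc if tag in NOTIONAL_TAGS]


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(term, docs, labels):
    """IG(t) = H(C) - P(t) H(C|t) - P(not t) H(C|not t), by document presence."""
    with_term = Counter(l for d, l in zip(docs, labels) if term in d)
    without_term = Counter(l for d, l in zip(docs, labels) if term not in d)
    n = len(docs)
    gain = entropy(list(Counter(labels).values()))
    for part in (with_term, without_term):
        size = sum(part.values())
        if size:
            gain -= (size / n) * entropy(list(part.values()))
    return gain


def select_features(tagged_docs, labels, k):
    """Return the k notional words with the highest information gain."""
    docs = [set(notional_words(d)) for d in tagged_docs]
    vocab = set().union(*docs)
    # Ties are broken alphabetically so the ranking is deterministic.
    ranked = sorted(vocab, key=lambda t: (-information_gain(t, docs, labels), t))
    return ranked[:k]
```

Because function words are dropped before scoring, the candidate vocabulary that information gain has to rank is smaller, which is consistent with the efficiency benefit the abstract reports.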
Keywords :
"Text categorization","Speech","Grammar","Semantics","Accuracy","Classification algorithms","Tagging"
Publisher :
IEEE
Conference_Titel :
Software Engineering and Service Science (ICSESS), 2015 6th IEEE International Conference on
ISSN :
2327-0586
Print_ISBN :
978-1-4799-8352-0
Electronic_ISBN :
2327-0594
Type :
conf
DOI :
10.1109/ICSESS.2015.7339093
Filename :
7339093