Title :
A novel feature selection based on Tibetan grammar for Tibetan text classification
Author :
Tao Jiang;Hongzhi Yu
Author_Institution :
State Key Laboratory of National Languages Information Technology, Northwest University for Nationalities, Lanzhou, Gansu, P.R. China
Abstract :
Feature selection is a strategy that aims to make text classifiers more efficient and accurate. In this paper, we propose a novel feature selection method based on Tibetan grammar for Tibetan text classification. The Tibetan language expresses grammatical meaning through function words and word order, and function words account for a large proportion of the text. By analyzing Tibetan grammar and the distribution of parts of speech, we propose a feature selection method based on Tibetan notional words. The method tags the part of speech of each word in a Tibetan text, keeps the notional words as candidate text features, and then applies the information gain (IG) method to select features. The experimental results show that this method significantly improves classification efficiency and accuracy compared with traditional feature selection methods.
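The two steps described in the abstract (keep notional words, then rank them by information gain) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tag names in NOTIONAL_TAGS, the function names, and the toy English tokens standing in for POS-tagged Tibetan words are all hypothetical, and IG is computed here from document-level term presence.

```python
import math
from collections import Counter

# Hypothetical tag set: POS tags treated as notional (content) words.
# The paper's actual tagger and tag inventory are not given in the abstract.
NOTIONAL_TAGS = {"noun", "verb", "adj", "num"}


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """IG(t) = H(C) - P(t) H(C|t) - P(not t) H(C|not t), by document presence."""
    with_t, without_t = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        (with_t if term in tokens else without_t)[label] += 1
    n = len(docs)
    n_t = sum(with_t.values())
    h_c = entropy(list(Counter(labels).values()))
    h_t = entropy(list(with_t.values()))
    h_not = entropy(list(without_t.values()))
    return h_c - (n_t / n) * h_t - ((n - n_t) / n) * h_not


def select_features(tagged_docs, labels, k):
    """Drop function words by POS tag, then keep the k terms with highest IG."""
    docs = [{w for w, tag in doc if tag in NOTIONAL_TAGS} for doc in tagged_docs]
    vocab = set().union(*docs)
    # Sort by descending IG; break ties alphabetically for determinism.
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Toy data: (word, POS tag) pairs; "the" plays the role of a function word.
toy_docs = [
    [("goal", "noun"), ("the", "particle"), ("match", "noun")],
    [("goal", "noun"), ("the", "particle"), ("team", "noun")],
    [("vote", "noun"), ("the", "particle"), ("law", "noun")],
    [("vote", "noun"), ("the", "particle"), ("team", "noun")],
]
toy_labels = ["sports", "sports", "politics", "politics"]

print(select_features(toy_docs, toy_labels, 2))  # ['goal', 'vote']
```

Filtering by part of speech before scoring shrinks the candidate vocabulary, so IG is computed over fewer terms; in this sketch that is the only source of the efficiency gain.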
Keywords :
"Text categorization","Speech","Grammar","Semantics","Accuracy","Classification algorithms","Tagging"
Conference_Titel :
Software Engineering and Service Science (ICSESS), 2015 6th IEEE International Conference on
Print_ISBN :
978-1-4799-8352-0
Electronic_ISBN :
2327-0594
DOI :
10.1109/ICSESS.2015.7339093