Title :
A text classification model based on training sample selection and feature weight adjustment
Author :
Pang, Xuezeng ; Liao, Yixing
Author_Institution :
Dept. of Comput. Sci. & Technol., Zhejiang Univ., Hangzhou, China
Abstract :
A new text classification model based on training sample selection and feature weight adjustment is presented. First, a representativeness score is computed for each sample in order to separate noise samples from the original training samples. Then, a feature weight adjustment that takes both inter-class and intra-class distribution into account is applied to further improve classification performance. The model is evaluated on the Chinese text dataset provided by the Fudan Database Center. The experiments show that the proposed model can improve text classification performance to some extent while using fewer training samples and fewer feature dimensions.
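The two steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: the representativeness score is taken to be the cosine similarity between a sample and its own class centroid, and the weight adjustment is taken to be the product of an inter-class concentration ratio and an intra-class coverage ratio. The function names (select_samples, adjustment_factors) and the toy data are hypothetical and are not the authors' method.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Mean term-weight vector of a list of sparse vectors."""
    total = defaultdict(float)
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}


def select_samples(docs, labels, threshold):
    """Return indices of samples kept after noise filtering.

    ASSUMPTION: representativeness = cosine similarity to the centroid
    of the sample's own class; samples below `threshold` count as noise.
    """
    by_class = defaultdict(list)
    for d, y in zip(docs, labels):
        by_class[y].append(d)
    cents = {y: centroid(vs) for y, vs in by_class.items()}
    return [i for i, (d, y) in enumerate(zip(docs, labels))
            if cosine(d, cents[y]) >= threshold]


def adjustment_factors(docs, labels):
    """Return a per-term multiplier in [0, 1] for the base term weight.

    ASSUMPTION: for term t and class c,
      inter = df_c(t) / df(t)   (how concentrated t is in class c)
      intra = df_c(t) / n_c     (how widely t is spread inside class c)
    and the factor is the maximum of inter * intra over all classes.
    """
    n_c = defaultdict(int)                          # documents per class
    df_c = defaultdict(lambda: defaultdict(int))    # term -> class -> doc count
    for d, y in zip(docs, labels):
        n_c[y] += 1
        for t in d:
            df_c[t][y] += 1
    factors = {}
    for t, per_class in df_c.items():
        df = sum(per_class.values())
        factors[t] = max((k / df) * (k / n_c[y]) for y, k in per_class.items())
    return factors


# Toy term-frequency vectors; index 3 is a deliberately mislabeled sample.
docs = [
    {"stock": 2, "market": 1},
    {"stock": 1, "market": 1},
    {"stock": 1, "bank": 1},
    {"goal": 1},
    {"goal": 2, "match": 1},
    {"match": 2, "team": 1},
]
labels = ["finance", "finance", "finance", "finance", "sports", "sports"]

kept = select_samples(docs, labels, threshold=0.5)
factors = adjustment_factors(docs, labels)
```

With this toy data the mislabeled sample at index 3 falls below the threshold and is dropped, and a term concentrated in one class ("stock") receives a larger factor than a term shared across classes ("goal"). The threshold value and both ratios are placeholders that would need to be replaced by the definitions in the paper itself.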
Keywords :
database management systems; pattern classification; text analysis; Chinese text dataset; Fudan database center; feature weight adjustment; interclass distribution; intraclass distribution; text classification model; training sample selection; Computational efficiency; Computer science; Degradation; Finance; Frequency; Internet; Iterative methods; Paper technology; Spatial databases; Text categorization; representativeness score; text classification; training dataset selection;
Conference_Title :
Advanced Computer Control (ICACC), 2010 2nd International Conference on
Conference_Location :
Shenyang
Print_ISBN :
978-1-4244-5845-5
DOI :
10.1109/ICACC.2010.5486615