Title :
Study on topic tracking system based on KNN
Author :
Li, Shengdong ; Lv, Xueqiang ; Liu, Dong ; Shi, Shuicai
Author_Institution :
Chinese Inf. Process. Res. Center, Beijing Inf. Sci. & Technol. Univ., Beijing, China
Abstract :
Text classification is the key technology for topic tracking, and the vector space model (VSM) is one of the simplest and most effective topic representation models. Feature selection in VSM is an important data pre-processing step: it reduces the dimension of the vector space and improves the generalization ability of the algorithm. Feature selection algorithms therefore merit in-depth and extensive research. We develop a topic tracking system to study how the feature dimension and the number of neighbors K affect topic tracking. From this we derive how tracking performance varies with each parameter and determine their optimal values. Finally, TDT evaluation shows that the optimal tracking performance obtained by adjusting K is 7.246% higher than that obtained by adjusting the feature dimension.
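The pipeline the abstract describes (VSM representation, feature selection down to a chosen dimension, then a KNN decision) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' system: the function names, the whitespace tokenizer, raw term-frequency weights, cosine similarity, similarity-weighted voting, and the toy stories are all assumptions made here. Information gain is used for feature selection because it appears in the keywords; the abstract does not say exactly how it was applied.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return text.lower().split()

def entropy(pos, total):
    # Binary entropy of a set with `pos` on-topic items out of `total`.
    if total == 0 or pos == 0 or pos == total:
        return 0.0
    p = pos / total
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def information_gain(docs, labels, term):
    # Information gain of "term present / absent" with respect to the binary label.
    n = len(docs)
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    conditional = (len(present) / n) * entropy(sum(present), len(present)) \
                + (len(absent) / n) * entropy(sum(absent), len(absent))
    return entropy(sum(labels), n) - conditional

def select_features(docs, labels, dimension):
    # Keep the `dimension` terms with the highest information gain
    # (ties are broken alphabetically because the sort is stable).
    vocab = sorted({t for d in docs for t in d})
    ranked = sorted(vocab, key=lambda t: information_gain(docs, labels, t), reverse=True)
    return ranked[:dimension]

def vectorize(tokens, features):
    # VSM vector: raw term frequency for each selected feature.
    counts = Counter(tokens)
    return [counts[f] for f in features]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def track(story, train_texts, train_labels, k=3, dimension=50):
    # Returns True if `story` is judged on-topic by its k nearest training stories.
    docs = [tokenize(t) for t in train_texts]
    features = select_features(docs, train_labels, dimension)
    query = vectorize(tokenize(story), features)
    scored = [(cosine(query, vectorize(d, features)), y)
              for d, y in zip(docs, train_labels)]
    neighbors = sorted(scored, key=lambda p: p[0], reverse=True)[:k]
    # Assumption: similarity-weighted vote, so zero-similarity neighbors carry no weight.
    on_topic = sum(s for s, y in neighbors if y == 1)
    off_topic = sum(s for s, y in neighbors if y == 0)
    return on_topic > off_topic

# Invented toy data: 1 = on-topic (an earthquake story), 0 = off-topic.
train_texts = [
    "earthquake hits coastal city rescue teams deployed",
    "rescue teams search earthquake rubble survivors",
    "earthquake aftershock damages city bridge",
    "football club wins league title",
    "stock market closes higher on bank earnings",
    "new smartphone released with faster chip",
]
train_labels = [1, 1, 1, 0, 0, 0]

print(track("earthquake rescue continues in the city", train_texts, train_labels, k=3, dimension=5))
print(track("bank earnings lift the stock market", train_texts, train_labels, k=3, dimension=50))
```

Here `k` and `dimension` correspond to the two parameters whose effect the paper studies; sweeping each while holding the other fixed is one way such a comparison could be set up.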
Keywords :
optimisation; pattern classification; text analysis; KNN; VSM; data preprocessing; feature selection algorithm; optimal values; text classification; topic tracking system; vector space model; Classification algorithms; Signal processing; Signal processing algorithms; Support vector machine classification; Text categorization; Training; information gain; knn; tdt evaluation; topic tracking;
Conference_Title :
Signal Processing Systems (ICSPS), 2010 2nd International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4244-6892-8
Electronic_ISBN :
978-1-4244-6893-5
DOI :
10.1109/ICSPS.2010.5555204