Title :
An improved term weighting scheme for vector space model
Author :
Sun, Yue-heng ; He, Pi-Lian ; Chen, Zhi-Gang
Author_Institution :
School of Electronic Information Engineering, Tianjin University, China
Abstract :
Document representation is a fundamental issue in information retrieval (IR). However, the traditional vector space model (VSM) suffers from data sparseness in its document vectors, and it cannot distinguish how well indexing terms in different positions express the document content. This paper proposes an improved term weighting method that introduces the information gain of terms while taking the above factors into account. Theoretical analysis and experimental results show that the new scheme improves the performance of VSM in IR, yielding higher recall and precision.
Keywords :
indexing; information retrieval; data sparseness phenomena; document vector representation; indexing terms; term weighting method; vector space model; Electronic mail; Extraterrestrial phenomena; Frequency; Helium; Indexing; Information retrieval; Optical computing; Performance analysis; Sun; Weight measurement;
Conference_Title :
Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
Print_ISBN :
0-7803-8403-2
DOI :
10.1109/ICMLC.2004.1382048