DocumentCode
424114
Title
An improved term weighting scheme for vector space model
Author
Sun, Yue-heng ; He, Pi-Lian ; Chen, Zhi-Gang
Author_Institution
Sch. of Electron. Inf. Eng., Tianjin Univ., China
Volume
3
fYear
2004
fDate
26-29 Aug. 2004
Firstpage
1692
Abstract
Document representation is a fundamental issue in information retrieval (IR). However, the traditional vector space model (VSM) suffers from data sparseness in its document vectors and cannot adequately distinguish how well indexing terms in different positions express the document content. This paper proposes an improved term weighting method that introduces the information gain of terms while taking the above factors into account. Theoretical analysis and experimental results show that the new scheme improves the performance of VSM in IR, yielding higher recall and precision.
Keywords
indexing; information retrieval; data sparseness phenomena; document vector representation; indexing terms; term weighting method; vector space model; Electronic mail; Extraterrestrial phenomena; Frequency; Helium; Indexing; Information retrieval; Optical computing; Performance analysis; Sun; Weight measurement
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
Print_ISBN
0-7803-8403-2
Type
conf
DOI
10.1109/ICMLC.2004.1382048
Filename
1382048