• DocumentCode
    424114
  • Title

    An improved term weighting scheme for vector space model

  • Author

    Sun, Yue-heng ; He, Pi-Lian ; Chen, Zhi-Gang

  • Author_Institution
    Sch. of Electron. Inf. Eng., Tianjin Univ., China
  • Volume
    3
  • fYear
    2004
  • fDate
    26-29 Aug. 2004
  • Firstpage
    1692
  • Abstract
    Document representation is a fundamental issue in information retrieval (IR). However, the traditional vector space model (VSM) suffers from data sparseness in its document vectors and cannot adequately distinguish how well indexing terms in different positions express the document content. This paper proposes an improved term weighting method that introduces the information gain of terms while taking the above factors into account. Theoretical analysis and experimental results show that the new scheme improves the performance of VSM in IR, yielding higher recall and precision.
  • Keywords
    indexing; information retrieval; data sparseness phenomena; document vector representation; indexing terms; information retrieval; term weighting method; vector space model; Electronic mail; Extraterrestrial phenomena; Frequency; Helium; Indexing; Information retrieval; Optical computing; Performance analysis; Sun; Weight measurement;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
  • Print_ISBN
    0-7803-8403-2
  • Type

    conf

  • DOI
    10.1109/ICMLC.2004.1382048
  • Filename
    1382048