  • DocumentCode
    3448897
  • Title

    Research of English text classification methods based on semantic meaning

  • Author

    Lv, Lin ; Liu, Yu-Shu

  • Author_Institution
    Sch. of Comput. Sci. & Technol., Beijing Inst. of Technol.
  • fYear
    2005
  • fDate
    5-6 Dec. 2005
  • Firstpage
    689
  • Lastpage
    700
  • Abstract
    To overcome the limitations of traditional text classification approaches based on the bag-of-words representation, and to incorporate linguistic knowledge and a conceptual index into the text vector space representation, a method combining the WordNet thesaurus with the latent semantic indexing (LSI) model is presented. The combined method is applied to naive Bayes text classification and to simple vector distance text classification, and five groups of comparative experiments are carried out for each. The results show that the accuracy of both classification methods improves steadily as the semantic analysis becomes deeper, which indicates that semantic mining is important and necessary for text classification. A comparative analysis of related work is also given.
  • Keywords
    classification; indexing; natural languages; text analysis; thesauri; English text classification; WordNet thesaurus; latent semantic indexing; naive Bayes text classification; semantic meaning; text vector space representation; Data mining; Feature extraction; Indexing; Large scale integration; Space technology; Speech analysis; Tagging; Text categorization; Thesauri; Viterbi algorithm; LSI; Naïve Bayes; Semantic Meaning; Simple Vector Distance; WordNet
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information and Communications Technology, 2005. Enabling Technologies for the New Knowledge Society: ITI 3rd International Conference on
  • Conference_Location
    Cairo
  • Print_ISBN
    0-7803-9270-1
  • Type

    conf

  • DOI
    10.1109/ITICT.2005.1609660
  • Filename
    1609660