• DocumentCode
    3634594
  • Title
    A symmetric term weighting scheme for text categorization based on term occurrence probabilities
  • Author
    Zafer Erenel;Hakan Altinçay;Ekrem Varoğlu
  • Author_Institution
    Department of Computer Engineering, Eastern Mediterranean University, Famagusta, North Cyprus
  • fYear
    2009
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Term weighting schemes used in text categorization can be considered as functions of term occurrence probabilities in the positive and negative classes. In this paper, widely used weighting schemes are first evaluated from this perspective. Then, a novel feature weighting scheme based on term occurrence probabilities is proposed. Experiments conducted using an SVM classifier on the Reuters-21578 ModApte Top10 dataset show that the proposed method outperforms other well-known measures such as CHI, IG, OR and RF in terms of macro-F1 and micro-F1 scores.
  • Keywords
    "Text categorization","Weight measurement","Frequency measurement","Support vector machines","Radio frequency","Support vector machine classification","Gain measurement","Extraterrestrial measurements","Logic","Robustness"
  • Publisher
    IEEE
  • Conference_Titel
    Soft Computing, Computing with Words and Perceptions in System Analysis, Decision and Control, 2009. ICSCCW 2009. Fifth International Conference on
  • Print_ISBN
    978-1-4244-3429-9
  • Type
    conf
  • DOI
    10.1109/ICSCCW.2009.5379438
  • Filename
    5379438