• DocumentCode
    525685
  • Title

    Utilizing Category Relevancy Factor for text categorization

  • Author

    Maleki, Mina

  • Author_Institution
    Iran Telecommun. Res. Center, Tehran, Iran
  • fYear
    2010
  • fDate
    23-25 June 2010
  • Firstpage
    334
  • Lastpage
    339
  • Abstract
    Feature weighting is one of the main preprocessing steps in building a high-performance text classifier. Commonly used feature weighting methods, such as TF and IDF-based methods, consider only the distribution of a feature in the document(s) and ignore class information. In this paper, we present the TFCRF (Term Frequency and Category Relevancy Factor) method, in which the weight of a feature depends on its power to discriminate the classes from each other by using class information. The results show a significant improvement in the performance of the SVM algorithm when the TFCRF feature weighting method is used, compared with the other implemented standard feature weighting methods.
  • Keywords
    data mining; learning (artificial intelligence); pattern classification; support vector machines; text analysis; SVM algorithm; class information; feature weighting; term frequency and category relevancy factor; text categorization; text classifier; Computational modeling; Frequency; Information retrieval; Kernel; Standards development; Support vector machine classification; Support vector machines; Text categorization; Text mining; Tuning; Feature weighting; SVM; Text categorization; Text mining;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering and Data Mining (SEDM), 2010 2nd International Conference on
  • Conference_Location
    Chengdu
  • Print_ISBN
    978-1-4244-7324-3
  • Electronic_ISBN
    978-89-88678-22-0
  • Type

    conf

  • Filename
    5542899