• DocumentCode
    2735586
  • Title
    A Weight-based Feature Extraction Approach for Text Classification
  • Author
    Jiang, Jung-Yi ; Lee, Shie-Jue
  • Author_Institution
    Nat. Sun Yat-Sen Univ., Kaohsiung
  • fYear
    2007
  • fDate
    5-7 Sept. 2007
  • Firstpage
    164
  • Lastpage
    164
  • Abstract
    In this paper, we propose a weight-based feature extraction approach to reduce the number of features for text classification. The number of extracted features equals the number of document classes, and the feature values are obtained from the distributions of words over class partitions. Each word of the original word set contributes a weight to each extracted feature, and these weights form a transformation matrix. Using the transformation matrix, the original document set is converted to a new set with a smaller number of features. The proposed approach has two advantages: trial-and-error for determining the appropriate number of extracted features is avoided, and the computational demand is small, so the method runs fast. Experimental results on real-world data sets show that our method can perform better than other methods.
  • Keywords
    feature extraction; matrix algebra; text analysis; text classification; transformation matrix; trial-and-error; weight-based feature extraction approach; Classification algorithms; Clustering methods; Computational efficiency; Data mining; Data processing; Feature extraction; Gain measurement; Matrix converters; Performance gain; Text categorization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Innovative Computing, Information and Control, 2007. ICICIC '07. Second International Conference on
  • Conference_Location
    Kumamoto
  • Print_ISBN
    0-7695-2882-1
  • Type
    conf
  • DOI
    10.1109/ICICIC.2007.109
  • Filename
    4427809
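
The transformation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weight formula, so the code below assumes one plausible choice: the weight of word i for class k is the fraction of that word's total occurrences that fall in documents of class k. The function names `build_transformation_matrix` and `transform` and the toy data are hypothetical and are not taken from the paper.

```python
def build_transformation_matrix(docs, labels, num_classes):
    """Return a (num_words x num_classes) matrix of word-to-class weights.

    Assumption (not from the paper): weights[i][k] is the share of word i's
    total occurrences that appear in documents labelled with class k.
    """
    num_words = len(docs[0])
    counts = [[0.0] * num_classes for _ in range(num_words)]
    for doc, label in zip(docs, labels):
        for i, freq in enumerate(doc):
            counts[i][label] += freq
    weights = []
    for row in counts:
        total = sum(row)
        # A word that never occurs contributes zero weight to every class.
        weights.append([c / total if total else 0.0 for c in row])
    return weights


def transform(docs, weights):
    """Map each word-count vector to one extracted feature per class."""
    num_classes = len(weights[0])
    return [
        [sum(doc[i] * weights[i][k] for i in range(len(doc)))
         for k in range(num_classes)]
        for doc in docs
    ]


# Toy corpus: 4 documents over a 4-word vocabulary, 2 classes.
docs = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 1, 1, 2],
]
labels = [0, 0, 1, 1]

weights = build_transformation_matrix(docs, labels, num_classes=2)
reduced = transform(docs, weights)
print(reduced)  # each document now has 2 features instead of 4
```

In this sketch the number of extracted features is fixed by the number of classes, so there is no feature count to tune, and building the matrix takes a single pass over the word counts.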