Title :
An Extensive Empirical Study of Feature Selection for Text Categorization
Author :
Qiu, Li-Qing ; Zhao, Ru-Yi ; Zhou, Gang ; Yi, Sheng-Wei
Author_Institution :
State Key Lab. of Software Dev. Environ., Beihang Univ., Beijing
Abstract :
We present a novel feature selection (FS) approach for text categorization. It first constructs a local feature set for each category by selecting features according to three different schemes: document frequency (DF), term frequency (TF), and TFIDF. It then constructs a global feature set by applying the well-known CHI (chi-square) method to the local feature sets. We compare our method experimentally with the CHI method and summarize the results, which show that the DF-based variant of our method performs comparably to the CHI method.
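The two-stage idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact scoring or cut-off rules, so everything here is an assumption made for illustration: the function names (`local_feature_sets`, `chi_square`, `global_feature_set`), the per-category top-`k` cut-off, the use of the maximum chi-square value over categories as the global score, and the cap of `m` global features are all hypothetical, not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def local_feature_sets(docs, scheme="df", k=2):
    """Pick the top-k terms per category. docs is a list of (tokens, label)."""
    n_docs = len(docs)
    global_df = Counter()            # documents containing each term, all categories
    cat_df = defaultdict(Counter)    # documents containing each term, per category
    cat_tf = defaultdict(Counter)    # total term occurrences, per category
    for tokens, label in docs:
        global_df.update(set(tokens))
        cat_df[label].update(set(tokens))
        cat_tf[label].update(tokens)

    local = {}
    for label in cat_df:
        if scheme == "df":
            scores = dict(cat_df[label])
        elif scheme == "tf":
            scores = dict(cat_tf[label])
        elif scheme == "tfidf":
            # Assumed weighting: category-level TF times log(N / global DF).
            scores = {t: tf * math.log(n_docs / global_df[t])
                      for t, tf in cat_tf[label].items()}
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        local[label] = ranked[:k]
    return local


def chi_square(term, label, docs):
    """Standard chi-square statistic between a term and a category."""
    a = b = c = d = 0
    for tokens, doc_label in docs:
        has_term = term in tokens
        in_cat = doc_label == label
        if has_term and in_cat:
            a += 1      # term present, document in category
        elif has_term:
            b += 1      # term present, document in another category
        elif in_cat:
            c += 1      # term absent, document in category
        else:
            d += 1      # term absent, document in another category
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom


def global_feature_set(docs, scheme="df", k=2, m=3):
    """Rank the union of local candidates by max chi-square; keep the top m."""
    local = local_feature_sets(docs, scheme, k)
    labels = sorted(local)
    candidates = {t for terms in local.values() for t in terms}
    scores = {t: max(chi_square(t, lab, docs) for lab in labels)
              for t in candidates}
    return sorted(scores, key=lambda t: (-scores[t], t))[:m]
```

In this sketch, CHI is applied only to terms that survived the local stage, so a term that is frequent in every category can still enter the candidate pool but receives a low chi-square score in the global ranking.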
Keywords :
learning (artificial intelligence); pattern classification; text analysis; CHI methods; feature selection; local feature set; text categorization; Character generation; Computational efficiency; Frequency measurement; Gain measurement; Information analysis; Information science; Performance gain; Programming; Space technology
Conference_Title :
Computer and Information Science, 2008. ICIS 08. Seventh IEEE/ACIS International Conference on
Conference_Location :
Portland, OR
Print_ISBN :
978-0-7695-3131-1
DOI :
10.1109/ICIS.2008.49