Title :
Comparing the impacts of dimension reduction methods that use class labels on text classification
Author_Institution :
Department of Computer Engineering, Yildiz Technical University, Istanbul, Turkey
Abstract :
Classifying datasets whose samples have numerous features is costly in both time and space. To overcome this problem, dimensionality reduction techniques such as feature selection and feature extraction have been proposed in the literature. In this paper, we compare the impact on classification performance of the abstract feature extraction method and other popular dimensionality reduction techniques that use class labels. For evaluation, we use two standard text datasets with high-dimensional samples. We apply each selected method to these datasets and test the results on five classifiers with different design approaches. Results show that using the abstract feature extraction method for dimensionality reduction produces much better classification performance than the other selected methods.
Keywords :
character recognition; feature extraction; text analysis; abstract feature extraction method; class label; dataset classification; dimension reduction method; standard text dataset; text classification; Abstracts; Feature extraction; Kernel; Machine learning; Semantics; Support vector machines; Text categorization;
Conference_Title :
Signal Processing and Communications Applications Conference (SIU), 2012 20th
Conference_Location :
Mugla
Print_ISBN :
978-1-4673-0055-1
Electronic_ISBN :
978-1-4673-0054-4
DOI :
10.1109/SIU.2012.6204613