DocumentCode
3407768
Title
On the effect of data reduction on classification accuracy
Author
Ben Meskina, Syrine
Author_Institution
Fac. of Sci. of Tunis, Tunis-El Manar Univ., Tunis, Tunisia
fYear
2013
fDate
24-26 March 2013
Firstpage
1
Lastpage
7
Abstract
Data reduction is an important pre-processing step for both supervised and unsupervised machine learning problems. In the first part of this paper, we investigate the two existing strategies for data reduction: feature selection (FS) and dimensionality reduction (DR). In the second part, we study the impact of different data reduction methods on supervised machine learning in terms of classification accuracy and computational cost. On the one hand, we compare the subsets of attributes generated by filter and wrapper algorithms, as well as the new variables constructed by two variants of a DR method. On the other hand, we compare the classification results achieved on the initial data set, on the reduced data sets, and on successively reduced sizes of the considered data sets.
Keywords
data reduction; pattern classification; unsupervised learning; DR method; classification accuracy; data reduction method; dimensionality reduction; feature selection; supervised machine learning problem; unsupervised machine learning problem; wrapper algorithms; Accuracy; Algorithm design and analysis; Classification algorithms; Data mining; Filtering algorithms; Machine learning algorithms; Principal component analysis; Classification Accuracy; Dimensionality Reduction; Feature Selection; Supervised Machine Learning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Technology and e-Services (ICITeS), 2013 3rd International Conference on
Conference_Location
Sousse
Print_ISBN
978-1-4799-0131-9
Type
conf
DOI
10.1109/ICITeS.2013.6624071
Filename
6624071