DocumentCode :
3407768
Title :
On the effect of data reduction on classification accuracy
Author :
Ben Meskina, Syrine
Author_Institution :
Fac. of Sci. of Tunis, Tunis-El Manar Univ., Tunis, Tunisia
fYear :
2013
fDate :
24-26 March 2013
Firstpage :
1
Lastpage :
7
Abstract :
Data reduction is an important pre-processing step for both supervised and unsupervised machine learning problems. In the first part of this paper, we investigate the two existing strategies for data reduction: feature selection (FS) and dimensionality reduction (DR). In the second part, we study the impact of different data reduction methods on supervised machine learning in terms of classification accuracy and computational cost. On the one hand, we compare the subsets of attributes generated by filter and wrapper algorithms as well as the new variables constructed by two variants of a DR method. On the other hand, we compare the classification results achieved on the initial data set, on the reduced data sets, and on successively reduced versions of the considered data sets.
Keywords :
data reduction; pattern classification; unsupervised learning; DR method; classification accuracy; data reduction method; dimensionality reduction; feature selection; supervised machine learning problem; unsupervised machine learning problem; wrapper algorithms; Accuracy; Algorithm design and analysis; Classification algorithms; Data mining; Filtering algorithms; Machine learning algorithms; Principal component analysis; Classification Accuracy; Dimensionality Reduction; Feature Selection; Supervised Machine Learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Technology and e-Services (ICITeS), 2013 3rd International Conference on
Conference_Location :
Sousse
Print_ISBN :
978-1-4799-0131-9
Type :
conf
DOI :
10.1109/ICITeS.2013.6624071
Filename :
6624071