• DocumentCode
    3407768
  • Title
    On the effect of data reduction on classification accuracy
  • Author
    Ben Meskina, Syrine
  • Author_Institution
    Fac. of Sci. of Tunis, Tunis-El Manar Univ., Tunis, Tunisia
  • fYear
    2013
  • fDate
    24-26 March 2013
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    Data reduction is an important pre-processing step for both supervised and unsupervised machine learning problems. In the first part of this paper, we investigate the two existing strategies for data reduction: feature selection (FS) and dimensionality reduction (DR). In the second part, we study the impact of different data reduction methods on supervised machine learning in terms of classification accuracy and computational cost. On the one hand, we compare the subsets of attributes generated by filter and wrapper algorithms, as well as the new variables constructed by two variants of a DR method. On the other hand, we compare the classification achieved on the initial data set, on the reduced data sets, and on successively reduced sizes of the considered data sets.
  • Keywords
    data reduction; pattern classification; unsupervised learning; DR method; classification accuracy; data reduction method; dimensionality reduction; feature selection; supervised machine learning problem; unsupervised machine learning problem; wrapper algorithms; Accuracy; Algorithm design and analysis; Classification algorithms; Data mining; Filtering algorithms; Machine learning algorithms; Principal component analysis; Classification Accuracy; Dimensionality Reduction; Feature Selection; Supervised Machine Learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology and e-Services (ICITeS), 2013 3rd International Conference on
  • Conference_Location
    Sousse
  • Print_ISBN
    978-1-4799-0131-9
  • Type
    conf
  • DOI
    10.1109/ICITeS.2013.6624071
  • Filename
    6624071