DocumentCode :
3205819
Title :
Reducing the influence of normalization on data classification
Author :
Salama, Mostafa A. ; Hassanien, Aboul Ella ; Fahmy, Aly A.
Author_Institution :
Fac. of Comput. & Inf., British Univ. in Egypt, Cairo, Egypt
fYear :
2010
fDate :
8-10 Oct. 2010
Firstpage :
609
Lastpage :
613
Abstract :
Principal Component Analysis (PCA) has received considerable attention in recent years and is widely used as a preprocessing step before many data mining models. PCA assumes that the input is normally distributed, which does not hold in many real-life cases. On the other hand, normalizing the input can change the structure of the data and thereby affect the outcome of the multivariate analysis and calibration used in data mining. This paper examines the effect of normalization methods applied before conventional PCA. It argues that correlation PCA, which uses the correlation matrix in the PCA method, can avoid this requirement. It shows that correlation PCA leads to better classification performance when an appropriate number of components is selected. The results also show that the resulting classification performance is independent of the normality of the input.
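The contrast the abstract draws between conventional PCA and correlation PCA can be illustrated with a minimal sketch. It is not the authors' implementation: the helper name `pca_eigen`, the synthetic data, and the rescaling factors are all assumptions made for illustration. It only shows that the eigenvalues of the correlation matrix are unchanged when features are rescaled, while those of the covariance matrix are not.

```python
import numpy as np

def pca_eigen(X, use_correlation=False):
    """Eigen-decompose the covariance (default) or correlation matrix of X.

    Hypothetical helper: returns eigenvalues in descending order and the
    matching eigenvectors as columns.
    """
    M = np.corrcoef(X, rowvar=False) if use_correlation else np.cov(X, rowvar=False)
    vals, vecs = np.linalg.eigh(M)      # eigh: M is symmetric
    order = np.argsort(vals)[::-1]      # largest eigenvalue first
    return vals[order], vecs[:, order]

# Synthetic data (an assumption, not from the paper): 200 samples, 3 features,
# one of them deliberately non-normal (exponential).
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(size=200),
    rng.exponential(size=200),
    rng.normal(size=200),
])
X[:, 2] += 0.5 * X[:, 0]                # introduce some correlation

# Same data with each feature expressed in a different unit.
X_rescaled = X * np.array([1.0, 1000.0, 0.01])

cov_vals, _ = pca_eigen(X)
cov_vals_rescaled, _ = pca_eigen(X_rescaled)
corr_vals, _ = pca_eigen(X, use_correlation=True)
corr_vals_rescaled, _ = pca_eigen(X_rescaled, use_correlation=True)

# Share of variance carried by each component.
cov_ratio = cov_vals / cov_vals.sum()
cov_ratio_rescaled = cov_vals_rescaled / cov_vals_rescaled.sum()

print("covariance PCA, original :", cov_ratio)
print("covariance PCA, rescaled :", cov_ratio_rescaled)
print("correlation PCA, original:", corr_vals / corr_vals.sum())
print("correlation PCA, rescaled:", corr_vals_rescaled / corr_vals_rescaled.sum())
```

Under these assumptions, the covariance-based variance shares shift toward the feature with the largest unit after rescaling, whereas the correlation-based shares stay the same. Whether this carries over to classification performance is the claim the paper itself evaluates; the sketch does not test it.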
Keywords :
data handling; data mining; principal component analysis; correlation PCA; data classification; data mining; multivariate analysis; normalization method; principle component analysis; Accuracy; Computers; Correlation; Covariance matrix; Data mining; Feature extraction; Principal component analysis; Classification; Features; normalization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Information Systems and Industrial Management Applications (CISIM), 2010 International Conference on
Conference_Location :
Krakow
Print_ISBN :
978-1-4244-7817-0
Type :
conf
DOI :
10.1109/CISIM.2010.5643523
Filename :
5643523