DocumentCode :
2971208
Title :
Reducing High-Dimensional Data by Principal Component Analysis vs. Random Projection for Nearest Neighbor Classification
Author :
Deegalla, Sampath ; Boström, Henrik
Author_Institution :
Dept. of Comput. & Syst. Sci., Stockholm Univ., Kista
fYear :
2006
fDate :
Dec. 2006
Firstpage :
245
Lastpage :
250
Abstract :
The computational cost of nearest neighbor classification often prevents the method from being applied in practice to high-dimensional data, such as images and microarrays. One possible solution to this problem is to reduce the dimensionality of the data, ideally without losing predictive performance. Two dimensionality reduction methods, principal component analysis (PCA) and random projection (RP), are investigated for this purpose and compared with respect to the performance of the resulting nearest neighbor classifier on five image data sets and five microarray data sets. The experimental results demonstrate that PCA outperforms RP for all data sets used in this study. However, the experiments also show that PCA is more sensitive to the choice of the number of reduced dimensions. After reaching a peak, the accuracy degrades with the number of dimensions for PCA, while the accuracy for RP increases with the number of dimensions. The experiments also show that the use of PCA and RP may even outperform using the non-reduced feature set (in 9 and 6 cases out of 10, respectively), hence resulting in nearest neighbor classification that is not only more efficient but also more effective.
Keywords :
data mining; pattern classification; principal component analysis; PCA; high-dimensional data reduction; nearest neighbor classification; random projection; Computational complexity; Computational efficiency; Decision trees; Discrete cosine transforms; Image analysis; Learning systems; Nearest neighbor searches; Performance analysis; Supervised learning
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Applications, 2006. ICMLA '06. 5th International Conference on
Conference_Location :
Orlando, FL
Print_ISBN :
0-7695-2735-3
Type :
conf
DOI :
10.1109/ICMLA.2006.43
Filename :
4041499