Title :
Improving Fusion of Dimensionality Reduction Methods for Nearest Neighbor Classification
Author :
Deegalla, Sampath ; Bostrom, Henrik
Author_Institution :
Dept. of Comput. & Syst. Sci., Stockholm Univ., Kista, Sweden
Abstract :
In previous studies, dimensionality reduction has been investigated as a means of improving nearest neighbor classification of high-dimensional data, such as microarrays. It has been demonstrated that the fusion of dimensionality reduction methods, either by fusing the classifiers obtained from each set of reduced features or by fusing all reduced features, is better than using any single dimensionality reduction method. However, none of the fusion methods consistently outperforms the use of a single dimensionality reduction method. Therefore, a new way of fusing features and classifiers is proposed, which is based on searching for the optimal number of dimensions for each considered dimensionality reduction method. An empirical evaluation on microarray classification is presented, comparing classifier and feature fusion with and without the proposed method, in conjunction with three dimensionality reduction methods: principal component analysis (PCA), partial least squares (PLS), and information gain (IG). The new classifier fusion method outperforms the previous one in 4 out of 8 cases and is on par with the best single dimensionality reduction method. The novel feature fusion method is, however, outperformed by the previous method, which selects the same number of features from each dimensionality reduction method. Hence, it is concluded that optimizing the number of features separately for each dimensionality reduction method can only be recommended for classifier fusion.
Keywords :
data reduction; pattern classification; principal component analysis; dimensionality reduction methods; feature fusion method; high dimensional data; information gain; microarray classification; nearest neighbor classification; partial least squares; Application software; Cancer; Gene expression; High performance computing; Least squares methods; Machine learning; Medical treatment; Nearest neighbor searches; Optimization methods; classifier fusion; dimensionality reduction; feature fusion; microarrays
Conference_Title :
Machine Learning and Applications, 2009. ICMLA '09. International Conference on
Conference_Location :
Miami Beach, FL
Print_ISBN :
978-0-7695-3926-3
DOI :
10.1109/ICMLA.2009.95