Title :
Kernel PCA regression for missing data estimation in DNA microarray analysis
Author :
Shan, Ying ; Deng, Guang
Author_Institution :
Dept. of Electron. Eng., La Trobe Univ., Bundoora, VIC, Australia
Abstract :
DNA microarrays may contain missing expression data. Estimating the missing values is a necessary step in microarray analysis, because data mining procedures require complete expression data as their input. In this paper, we propose a missing data estimation algorithm, named KPCAimpute, based on kernel principal component analysis. We consider a family of heavy-tailed kernel functions that generalizes the Gaussian kernel. The performance of the proposed KPCAimpute algorithm is compared with two state-of-the-art linear regression methods: Bayesian principal component analysis imputation (BPCA) and local least squares imputation (LLSimpute). KPCAimpute outperforms LLSimpute as the percentage of missing values increases, and its performance is similar to that of BPCA imputation. It is therefore an effective and promising algorithm for estimating missing values in DNA microarray profiles.
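The abstract does not state which heavy-tailed kernel family the paper uses, so the following is only a minimal sketch under stated assumptions. It uses a Student-t style kernel as one hypothetical heavy-tailed family that approaches the Gaussian kernel as its parameter `nu` grows, and it shows the generic kernel PCA step (centering the kernel matrix and taking its leading eigenvectors). The function names, the parameters `sigma` and `nu`, and the choice of kernel are illustrative assumptions, not the authors' KPCAimpute implementation; the regression step that produces the imputed values is not shown.

```python
import numpy as np


def sq_dists(X):
    """Pairwise squared Euclidean distances between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.maximum(d2, 0.0)  # guard against tiny negative round-off


def gaussian_kernel(X, sigma=1.0):
    """Standard Gaussian (RBF) kernel matrix."""
    return np.exp(-sq_dists(X) / (2.0 * sigma ** 2))


def student_t_kernel(X, sigma=1.0, nu=3.0):
    """Hypothetical heavy-tailed kernel (assumption, not from the paper).

    Decays polynomially with distance and tends to the Gaussian kernel
    as nu -> infinity.
    """
    return (1.0 + sq_dists(X) / (nu * sigma ** 2)) ** (-(nu + 1.0) / 2.0)


def kernel_pca(K, n_components):
    """Generic kernel PCA on a precomputed kernel matrix K.

    Returns the training-sample scores on the leading components and
    the corresponding eigenvalues of the centered kernel matrix.
    """
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    # Center the data implicitly in feature space.
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)  # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    scores = vecs * np.sqrt(np.maximum(vals, 0.0))
    return scores, vals
```

In this sketch, rows of `X` would be complete expression profiles; for example, `kernel_pca(student_t_kernel(X, nu=3.0), n_components=2)` returns a two-column score matrix. Smaller `nu` gives distant profiles more weight than the Gaussian kernel would.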
Keywords :
Bayes methods; DNA; data mining; principal component analysis; regression analysis; Bayesian principal component analysis imputation; DNA microarray analysis; Gaussian kernel; KPCAimpute algorithm; data mining procedures; kernel principal component analysis regression; linear regression methods; local least squares imputation; missing data estimation; Bayesian methods; Clustering algorithms; DNA; Data analysis; Gene expression; Kernel; Least squares methods; Linear regression; Principal component analysis; Vectors;
Conference_Title :
Circuits and Systems, 2009. ISCAS 2009. IEEE International Symposium on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-3827-3
Electronic_ISBN :
978-1-4244-3828-0
DOI :
10.1109/ISCAS.2009.5118046