Title :
Neural Network with K-Means Clustering via PCA for Gene Expression Profile Analysis
Author :
Chen, Thomas C. ; Sanga, Sandeep ; Chou, Tina Y. ; Cristini, Vittorio ; Edgerton, Mary E.
Author_Institution :
Ind. Eng. Dept., Univ. of Houston, Houston, TX, USA
Date :
March 31 2009-April 2 2009
Abstract :
Gene expression microarray data are high-dimensional and contain a high level of noise. Most of these data involve multiple heterogeneous dynamic patterns that depend on the disease under study. In addition, errors may be introduced along the data collection path when multiple sites and methods are used. This paper proposes a combined data mining method, a neural network with k-means clustering via principal component analysis (PCA), to address these data complexity issues in gene expression profile mining. The proposed method was tested on gene expression profiles in lung adenocarcinoma, collected from multiple cancer research centers, for survival prediction and risk assessment. The results are analyzed, and further studies to improve the method are recommended.
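The abstract names three components (PCA, k-means clustering, and a neural network) but does not say how they are wired together. The following is a minimal sketch of one plausible arrangement, not the authors' implementation: PCA reduces the expression matrix, k-means groups the samples in the reduced space, and a small neural network is trained on the PCA scores plus a one-hot cluster indicator. Everything here is an assumption made for illustration: the use of scikit-learn, the function names `fit_pipeline` and `predict_risk`, the parameter values, the binary high/low risk label, and the synthetic data that stands in for real microarray measurements.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier


def fit_pipeline(X, y, n_components=5, n_clusters=2, seed=0):
    """Fit PCA, then k-means on the PCA scores, then a small neural network.

    Hypothetical arrangement: the network sees the PCA scores together with
    a one-hot encoding of each sample's k-means cluster.
    """
    pca = PCA(n_components=n_components, random_state=seed)
    scores = pca.fit_transform(X)

    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    cluster_ids = km.fit_predict(scores)

    # One-hot encode the cluster assignment and append it to the scores.
    features = np.hstack([scores, np.eye(n_clusters)[cluster_ids]])

    net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=seed)
    net.fit(features, y)
    return pca, km, net


def predict_risk(model, X):
    """Apply the fitted PCA, k-means, and network to new samples."""
    pca, km, net = model
    scores = pca.transform(X)
    onehot = np.eye(km.n_clusters)[km.predict(scores)]
    return net.predict(np.hstack([scores, onehot]))


# Synthetic stand-in data (assumption): 120 samples, 200 "genes",
# with a binary risk label that shifts the first 10 genes.
rng = np.random.default_rng(0)
n_samples, n_genes = 120, 200
y = rng.integers(0, 2, size=n_samples)
X = rng.normal(size=(n_samples, n_genes))
X[:, :10] += 3.0 * y[:, None]

model = fit_pipeline(X, y)
pred = predict_risk(model, X)
```

In practice the samples would be split into training and held-out sets before fitting, and survival prediction would need a time-to-event target rather than the binary label assumed here.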
Keywords :
data mining; medical computing; neural nets; pattern clustering; principal component analysis; data complexity; data mining method; gene expression microarray data; gene expression profile analysis; k-means clustering; lung adenocarcinoma; multiple heterogeneous dynamic patterns; neural network; principal component analysis; risk assessment; survival prediction; Cancer; Data mining; Diseases; Gene expression; Lungs; Multidimensional systems; Neural networks; Noise level; Principal component analysis; Testing; PCA; clustering analysis; gene expression; k-mean; lung cancer; neural network;
Conference_Title :
Computer Science and Information Engineering, 2009 WRI World Congress on
Conference_Location :
Los Angeles, CA
Print_ISBN :
978-0-7695-3507-4
DOI :
10.1109/CSIE.2009.945