DocumentCode :
1785225
Title :
Sparse gene expression data analysis based on truncated power
Author :
Ningmin Shen ; Jing Li ; Cheng Jin ; Peiyun Zhou
Author_Institution :
Coll. of Comput. Sci. & Technol., Nanjing Univ. of Aeronaut. & Astronaut., Nanjing, China
fYear :
2014
fDate :
2-5 Nov. 2014
Firstpage :
39
Lastpage :
44
Abstract :
Cluster analysis has become a popular method for analyzing gene expression data, where the resulting class labels can support accurate and rapid disease diagnosis. However, gene expression data typically have many attributes and few samples, which produces a large amount of redundant or noisy information and reduces the accuracy of clustering applied directly to the high-dimensional data. Principal Component Analysis (PCA) is a classical dimension-reduction method that transforms high-dimensional data into a low-dimensional space. Its shortcoming is weak interpretability, because the loadings are not sparse. In this paper, a sparse PCA method based on Truncated Power, which minimizes the cardinality of the loadings while maximizing the percentage of variance explained by the principal components (PCs), is applied to feature extraction for gene expression data; the sparse PCs are then fed into a K-means process for clustering. Finally, experimental results on three typical gene datasets verify that the sparse gene data can improve the efficiency and accuracy of clustering analysis.
Keywords :
bioinformatics; data analysis; feature extraction; genetics; pattern clustering; principal component analysis; K-means process; PCA; cluster analysis; dimension reduction; direct clustering; disease diagnosis; disturbed information; feature extraction method; gene datasets; high-dimension data transform; principal component analysis; redundant information; sparse gene expression data analysis; truncated power; Cancer; Colon; Correlation; Feature extraction; Gene expression; Loading; Principal component analysis; Gene expression data; Truncated Power; feature extraction; sparse principal component analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Bioinformatics and Biomedicine (BIBM), 2014 IEEE International Conference on
Conference_Location :
Belfast
Type :
conf
DOI :
10.1109/BIBM.2014.6999385
Filename :
6999385
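Illustrative sketch :
The pipeline summarized in the abstract (sparse loadings from a truncated power iteration, then K-means on the sparse PCs) might look roughly like the following. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the cardinality parameter `k`, the projection deflation used to obtain further components, and the plain Lloyd-style K-means are all illustrative choices that the record does not specify.

```python
import numpy as np


def truncated_power_component(cov, k, n_iter=200, tol=1e-8):
    """Approximate the leading loading of `cov` with at most k nonzeros."""
    p = cov.shape[0]
    # Assumed initialisation: unit vector on the highest-variance feature.
    x = np.zeros(p)
    x[np.argmax(np.diag(cov))] = 1.0
    for _ in range(n_iter):
        y = cov @ x                          # power step
        keep = np.argsort(np.abs(y))[-k:]    # k largest-magnitude entries
        x_new = np.zeros(p)
        x_new[keep] = y[keep]                # truncation step
        norm = np.linalg.norm(x_new)
        if norm == 0.0:
            break
        x_new /= norm                        # renormalise to unit length
        converged = np.linalg.norm(x_new - x) < tol
        x = x_new
        if converged:
            break
    return x


def sparse_pcs(X, n_components, k):
    """Return (loadings, scores) for rows-as-samples data X."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    p = cov.shape[0]
    loadings = np.zeros((p, n_components))
    for j in range(n_components):
        v = truncated_power_component(cov, k)
        loadings[:, j] = v
        # Assumed deflation: project the found direction out of cov.
        proj = np.eye(p) - np.outer(v, v)
        cov = proj @ cov @ proj
    return loadings, Xc @ loadings


def kmeans(points, n_clusters, n_iter=100, seed=0):
    """Plain Lloyd iterations; returns one integer label per row."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), n_clusters, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels
```

With a samples-by-genes matrix `X`, a call such as `loadings, scores = sparse_pcs(X, n_components=2, k=5)` followed by `labels = kmeans(scores, n_clusters=2)` would cluster the samples on two sparse PCs, each built from at most five genes. The values 2 and 5 are placeholders, not settings taken from the paper.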