DocumentCode :
1478784
Title :
Identifying Gene Pathways Associated with Cancer Characteristics via Sparse Statistical Methods
Author :
Kawano, Shuichi ; Shimamura, Teppei ; Niida, Atsushi ; Imoto, Seiya ; Yamaguchi, Rui ; Nagasaki, Masao ; Yoshida, Ryo ; Print, Cristin ; Miyano, Satoru
Author_Institution :
Dept. of Math. Sci., Osaka Prefecture Univ., Sakai, Japan
Volume :
9
Issue :
4
Year :
2012
Firstpage :
966
Lastpage :
972
Abstract :
We propose a statistical method for uncovering gene pathways that characterize cancer heterogeneity. To incorporate knowledge of the pathways into the model, we define a set of pathway activities from microarray gene expression data based on Sparse Probabilistic Principal Component Analysis (SPPCA). A pathway activity logistic regression model is then formulated for the cancer phenotype. To select pathway activities related to binary cancer phenotypes, we use the elastic net for parameter estimation and derive a model selection criterion for choosing the tuning parameters involved in the estimation. The proposed method can also reverse-engineer gene networks from the multiple identified pathways, which enables us to discover novel gene-gene associations related to the cancer phenotypes. We illustrate the whole process of the proposed method through an analysis of breast cancer gene expression data.
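The pipeline summarized in the abstract (pathway activities, then a sparse logistic regression on those activities) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: ordinary PCA stands in for SPPCA, the data are synthetic, the pathway definitions are hypothetical, the elastic-net fit uses a simple proximal-gradient loop, and the tuning parameters `lam` and `alpha` are fixed by hand instead of being chosen by the model selection criterion derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for expression data: 200 samples, 3 hypothetical
# pathways of 10 genes each, each pathway driven by one latent factor.
n, genes_per_pathway, n_pathways = 200, 10, 3
z = rng.normal(size=(n, n_pathways))
X = np.repeat(z, genes_per_pathway, axis=1)
X = X + 0.5 * rng.normal(size=X.shape)
pathways = [np.arange(k * genes_per_pathway, (k + 1) * genes_per_pathway)
            for k in range(n_pathways)]
# Binary phenotype that depends only on pathway 0 (by construction).
y = (z[:, 0] + 0.3 * rng.normal(size=n) > 0).astype(float)


def pathway_activity(X, gene_idx):
    """First principal-component score of one pathway's genes.
    Plain PCA is used here as a stand-in for SPPCA (assumption)."""
    sub = X[:, gene_idx]
    sub = sub - sub.mean(axis=0)
    U, S, _ = np.linalg.svd(sub, full_matrices=False)
    return U[:, 0] * S[0]


def soft_threshold(v, t):
    """Proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def elastic_net_logistic(A, y, lam=0.05, alpha=0.5, n_iter=2000):
    """Proximal gradient descent on
    mean log-loss + lam * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||_2^2).
    The intercept b is not penalized."""
    n, p = A.shape
    w, b = np.zeros(p), 0.0
    A_aug = np.hstack([A, np.ones((n, 1))])
    # Step size from a Lipschitz bound on the smooth part of the objective.
    lipschitz = np.linalg.norm(A_aug, 2) ** 2 / (4 * n) + lam * (1 - alpha)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        logits = np.clip(A @ w + b, -30, 30)
        p_hat = 1.0 / (1.0 + np.exp(-logits))
        grad_w = A.T @ (p_hat - y) / n + lam * (1 - alpha) * w
        grad_b = np.mean(p_hat - y)
        w = soft_threshold(w - step * grad_w, step * lam * alpha)
        b = b - step * grad_b
    return w, b


# One activity score per pathway, standardized before fitting.
A = np.column_stack([pathway_activity(X, idx) for idx in pathways])
A = (A - A.mean(axis=0)) / A.std(axis=0)

w, b = elastic_net_logistic(A, y)
acc = np.mean((A @ w + b > 0).astype(float) == y)
selected = np.flatnonzero(np.abs(w) > 1e-8)  # pathways with nonzero weight
print("coefficients:", w, "selected pathways:", selected, "accuracy:", acc)
```

Because the sign of a principal component is arbitrary, only the magnitude of each coefficient is meaningful when ranking pathways in this sketch.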
Keywords :
bioinformatics; biological techniques; biomedical engineering; cancer; data mining; genetics; medical computing; parameter estimation; principal component analysis; regression analysis; SPPCA; breast cancer gene expression data; cancer characteristics; cancer heterogeneity; cancer phenotype; elastic net; gene pathway identification; gene-gene associations; microarray gene expression data; model selection criterion; parameter estimation; pathway activity logistic regression model; principal component analysis; sparse probabilistic PCA; sparse statistical methods; Breast cancer; Gene expression; Loading; Logistics; Regression analysis; Supervised learning; Cancer heterogeneity; gene network; microarray; pathway activity; sparse supervised learning; Breast Neoplasms; Computational Biology; Databases, Genetic; Female; Gene Expression Profiling; Gene Regulatory Networks; Humans; Logistic Models; Oligonucleotide Array Sequence Analysis; Principal Component Analysis
Language :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2012.48
Filename :
6175008