DocumentCode
1478784
Title
Identifying Gene Pathways Associated with Cancer Characteristics via Sparse Statistical Methods
Author
Kawano, Shuichi ; Shimamura, Teppei ; Niida, Atsushi ; Imoto, Seiya ; Yamaguchi, Rui ; Nagasaki, Masao ; Yoshida, Ryo ; Print, Cristin ; Miyano, Satoru
Author_Institution
Dept. of Math. Sci., Osaka Prefecture Univ., Sakai, Japan
Volume
9
Issue
4
fYear
2012
Firstpage
966
Lastpage
972
Abstract
We propose a statistical method for uncovering gene pathways that characterize cancer heterogeneity. To incorporate pathway knowledge into the model, we define a set of pathway activities from microarray gene expression data using Sparse Probabilistic Principal Component Analysis (SPPCA). A pathway activity logistic regression model is then formulated for the cancer phenotype. To select pathway activities related to binary cancer phenotypes, we estimate the parameters with the elastic net and derive a model selection criterion for choosing the tuning parameters involved in the estimation. The proposed method can also reverse-engineer gene networks from the multiple identified pathways, which enables us to discover novel gene-gene associations related to the cancer phenotypes. We illustrate the whole process of the proposed method through an analysis of breast cancer gene expression data.
Keywords
bioinformatics; biological techniques; biomedical engineering; cancer; data mining; genetics; medical computing; parameter estimation; principal component analysis; regression analysis; SPPCA; breast cancer gene expression data; cancer characteristics; cancer heterogeneity; cancer phenotype; elastic net; gene pathway identification; gene-gene associations; microarray gene expression data; model selection criterion; parameter estimation; pathway activity logistic regression model; principal component analysis; sparse probabilistic PCA; sparse statistical methods; Breast cancer; Gene expression; Loading; Logistics; Regression analysis; Supervised learning; Cancer heterogeneity; gene network; microarray; pathway activity; sparse supervised learning; Breast Neoplasms; Computational Biology; Databases, Genetic; Female; Gene Expression Profiling; Gene Regulatory Networks; Humans; Logistic Models; Oligonucleotide Array Sequence Analysis; Principal Component Analysis
fLanguage
English
Journal_Title
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
1545-5963
Type
jour
DOI
10.1109/TCBB.2012.48
Filename
6175008