Title :
A statistical test for intrinsically multivariate predictive genes
Author :
Chen, Ting; Braga-Neto, Ulisses
Author_Institution :
Dept. of Electr. & Comput. Eng., Texas A&M Univ., College Station, TX, USA
Abstract :
Canalizing genes possess broad regulatory power over biological processes. In a previous publication, it was hypothesized that canalizing genes may be identified as those that possess a large number of intrinsically multivariate predictive (IMP) gene sets. An IMP set is a predictor set that predicts the target well, even though none of its proper subsets can do so. The IMP property is defined mathematically in terms of a score based on the binary Coefficient of Determination (CoD), but until now there has been no rigorous statistical procedure for testing whether an IMP score is nonzero. In this paper, we address this gap by assuming a stochastic regulation model and providing an Intersection-Union Test (IUT) based on likelihood-ratio tests for the individual model parameters. We derive exact analytical formulas for the rejection region and p-value of this test. We address the multiplicity of tests, which arises from the large number of candidate predictor sets and predictive logics, by means of FWER- and FDR-controlling approaches. The methodology is demonstrated on a real melanoma gene-expression data set, in which DUSP1, a canalizing gene over important biological processes in melanoma, displays a large number of significant IMP sets after correction by either of the multiple testing procedures.
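As a rough illustration of the IMP score mentioned in the abstract, the sketch below estimates the binary CoD from samples and subtracts the best CoD achieved by any proper subset of the predictors. It rests on assumptions not stated in this record: the CoD is taken as (e0 - e)/e0, where e0 is the error of the best constant predictor and e the error of the best predictor given the inputs, both estimated by plain resubstitution; the IMP score is taken as the full-set CoD minus the maximum CoD over proper subsets. The function names `cod` and `imp_score` are hypothetical, and the paper's likelihood-ratio-based IUT is not implemented here.

```python
from collections import Counter
from itertools import combinations


def cod(samples, predictors):
    """Resubstitution estimate of the binary CoD (assumed definition).

    samples: list of (x, y), where x is a tuple of 0/1 values and y is 0 or 1.
    predictors: tuple of indices into x used to predict y.
    """
    n = len(samples)
    ones = sum(y for _, y in samples)
    eps0 = min(ones, n - ones) / n  # error of the best constant predictor
    if eps0 == 0:
        return 0.0  # convention assumed here: a constant target has CoD 0
    counts = Counter()
    for x, y in samples:
        key = tuple(x[i] for i in predictors)
        counts[(key, y)] += 1
    keys = {k for k, _ in counts}
    # For each input pattern, the optimal predictor errs on the minority label.
    err = sum(min(counts[(k, 0)], counts[(k, 1)]) for k in keys) / n
    return (eps0 - err) / eps0


def imp_score(samples, predictors):
    """Full-set CoD minus the best CoD over proper subsets (assumed definition)."""
    full = cod(samples, predictors)
    best_subset = max(
        (cod(samples, sub)
         for r in range(1, len(predictors))
         for sub in combinations(predictors, r)),
        default=0.0,
    )
    return full - best_subset


# Toy data: the target is the XOR of two balanced binary predictors.
xor_samples = [((a, b), a ^ b) for a in (0, 1) for b in (0, 1)] * 5
print(imp_score(xor_samples, (0, 1)))  # prints 1.0
```

In the XOR toy case neither predictor alone carries any information about the target, while the pair predicts it perfectly, so the estimated score reaches its maximum. Deciding whether a score estimated from real data is significantly nonzero is the problem the paper's test addresses.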
Keywords :
biology computing; cancer; genetics; molecular biophysics; statistical analysis; stochastic processes; DUSP1; FDR-controlling approach; FWER-controlling approach; binary determination coefficient; biological process; canalizing genes; intersection-union test; intrinsically multivariate predictive genes; likelihood-ratio tests; melanoma gene-expression data set; p-value; predictive logics; predictor sets; rejection region; statistical test; stochastic regulation model; Coefficient of Determination; Hypothesis Test; Intrinsically Multivariate Prediction; Multiple Testing Procedures; Stochastic Logic;
Conference_Title :
2012 IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS)
Conference_Location :
Washington, DC
Print_ISBN :
978-1-4673-5234-5
DOI :
10.1109/GENSIPS.2012.6507751