Title :
An Information Theoretic Exploratory Method for Learning Patterns of Conditional Gene Coexpression from Microarray Data
Author :
Boscolo, Riccardo ; Liao, James C. ; Roychowdhury, Vwani P.
Author_Institution :
Univ. of California, Los Angeles
Abstract :
In this paper, we introduce an exploratory framework for learning patterns of conditional coexpression in gene expression data. The main idea behind the proposed approach consists of estimating how the information content shared by a set of M nodes in a network (where each node is associated with an expression profile) varies upon conditioning on a set of L conditioning variables (in the simplest case represented by a separate set of expression profiles). The method is nonparametric, and it is based on the concept of statistical coinformation, which, unlike conventional correlation-based techniques, is not restricted in scope to linear conditional dependency patterns. Moreover, such conditional coexpression relationships can potentially indicate regulatory interactions that do not manifest themselves when only pairwise relationships are considered. A moment-based approximation of the coinformation measure is derived that efficiently circumvents the problem of estimating high-dimensional multivariate probability density functions from the data, a task usually not viable due to the intrinsic sample size limitations that characterize expression-level measurements. By applying the proposed exploratory method, we analyzed a whole-genome microarray assay of the eukaryote Saccharomyces cerevisiae and were able to learn statistically significant patterns of conditional coexpression. A selection of such interactions that carry a meaningful biological interpretation is discussed.
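To make the notion of coinformation concrete, the following is a minimal sketch, not the authors' estimator. It evaluates the standard alternating-sum definition of coinformation over subset entropies, and it assumes jointly Gaussian variables so that each entropy follows from the sample covariance alone. That Gaussian assumption is a simplification introduced here for illustration: it captures only linear dependence, whereas the moment-based approximation described in the abstract is designed to avoid that restriction and is not reproduced here. The function names `gaussian_entropy` and `gaussian_coinformation` are hypothetical.

```python
from itertools import combinations

import numpy as np


def gaussian_entropy(cov):
    """Differential entropy (in nats) of a Gaussian with covariance `cov`."""
    cov = np.atleast_2d(cov)
    k = cov.shape[0]
    # slogdet is numerically safer than log(det(...)) for small determinants.
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (k * np.log(2.0 * np.pi * np.e) + logdet)


def gaussian_coinformation(data):
    """Coinformation of the columns of `data` (n_samples x n_vars).

    Uses the alternating sum over all nonempty subsets T of the variables:
        I(X_1; ...; X_n) = -sum_T (-1)^{|T|} H(X_T)
    with each H(X_T) computed under a Gaussian assumption (illustrative only).
    For two variables this reduces to ordinary mutual information.
    """
    cov = np.atleast_2d(np.cov(data, rowvar=False))
    n = cov.shape[0]
    total = 0.0
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            sub_cov = cov[np.ix_(subset, subset)]
            total -= (-1) ** size * gaussian_entropy(sub_cov)
    return total


def conditioning_effect(data_xy, data_z):
    """Change in information shared by the columns of `data_xy` upon
    conditioning on the column(s) of `data_z`: I(X;Y|Z) - I(X;Y).

    For a single conditioning variable this equals minus the three-way
    coinformation I(X;Y;Z).
    """
    joint = np.column_stack([data_xy, data_z])
    return -gaussian_coinformation(joint)
```

With three columns `x`, `y`, `z`, a positive value of `conditioning_effect(np.column_stack([x, y]), z)` would indicate that `x` and `y` share more information once `z` is known, which is the kind of conditional coexpression pattern the abstract refers to. The number of subsets grows as 2^n, so this sketch is only practical for small sets of variables.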
Keywords :
DNA; arrays; biological techniques; cellular biophysics; correlation methods; density functional theory; genetics; molecular biophysics; biological interpretation; conditional gene coexpression; correlation-based techniques; eukaryote Saccharomyces cerevisiae; expression-level measurements; genome microarray assay; high-dimensional multivariate probability density functions; information theoretic exploratory method; learning patterns; linear conditional dependency patterns; statistical coinformation; Co-information; Entropy; Gene expression data; Information theory; Statistical analysis; Algorithms; Artificial Intelligence; Computational Biology; Gene Expression Profiling; Gene Expression Regulation, Fungal; Internet; Oligonucleotide Array Sequence Analysis; Pattern Recognition, Automated; Saccharomyces cerevisiae Proteins; Software; Statistics, Nonparametric;
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2007.1056