Title :
A Transcriptional Approach to Gene Clustering
Author :
Tagkopoulos, Ilias
Author_Institution :
Department of Electrical Engineering, Princeton University, iliast@princeton.edu
Abstract :
We present an integrative method for clustering coregulated genes and elucidating their underlying regulatory mechanisms. We use multi-state partition functions and thermodynamic models to derive six distinct correlation classes that correspond to various protein-protein and protein-DNA interactions. We then introduce a biclustering algorithm that clusters genes based on the correlations exhibited in their expression profiles. We evaluate the functional enrichment and statistical significance of the resulting clusters using precision-recall curves. Our results show that classification performance can be optimized by selecting the corresponding correlation class. Additionally, there is a significant improvement over single-class biclustering when we use multi-class support vector machines with biclustering scores as features. Furthermore, analysis of the upstream regions of all genes in each cluster shows that the derived correlation classes capture the expression of genes with shared regulation. We identify over a hundred highly conserved sequences, among which twenty-one match well-known regulatory motifs. Further analysis of the identified conserved sequences not only explains the classification performance but also serves as an indicator of the regulatory correlation for various groups.
Keywords :
Clustering algorithms; DNA; Gene expression; Partitioning algorithms; Performance analysis; Protein engineering; Sequences; Support vector machine classification; Support vector machines; Thermodynamics;
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology, 2005. CIBCB '05. Proceedings of the 2005 IEEE Symposium on
Print_ISBN :
0-7803-9387-2
DOI :
10.1109/CIBCB.2005.1594921