DocumentCode :
875882
Title :
Combining Multisource Information Through Functional-Annotation-Based Weighting: Gene Function Prediction in Yeast
Author :
Ray, Shubhra Sankar ; Bandyopadhyay, Sanghamitra ; Pal, Sankar K.
Author_Institution :
Center for Soft Comput. Res., Indian Stat. Inst., Kolkata
Volume :
56
Issue :
2
fYear :
2009
Firstpage :
229
Lastpage :
236
Abstract :
Motivation: One of the important goals of biological investigation is to predict the function of unclassified genes. Although there is a rich literature on integrating multiple data sources for gene function prediction, little of it weights the data sources using functional annotations of classified genes. In this investigation, we propose a new scoring framework, called the biological score (BS), which incorporates data source weighting, for predicting the function of some of the unclassified yeast genes. Methods: The BS is computed by first evaluating the similarities between genes, arising from different data sources, in a common framework, and then integrating them as a weighted linear combination. The relative weight of each data source is determined adaptively from the yeast gene ontology (GO)-slim process annotations of classified genes, available from the Saccharomyces Genome Database (SGD). Genes are clustered by a method called K-BS, where, for each gene, a cluster comprising that gene and its K nearest neighbors is computed using the proposed score (BS). The performance of BS and K-BS is evaluated with gene annotations available from the Munich Information Center for Protein Sequences (MIPS). Results: Using K-BS, we predict the functional categories of 417 classified genes from 417 clusters with a positive predictive value of 0.98. The functional categories of 12 unclassified yeast genes are also predicted. Conclusion: Our experimental results indicate that considering multiple data sources and estimating their weights from annotations of classified genes can considerably enhance the performance of BS. Even a small proportion of annotated genes can improve the identification of true positive gene pairs using BS.
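The two steps named in the Methods part of the abstract (a weighted linear combination of per-source similarities, then one cluster per gene made of that gene and its K nearest neighbors) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `combine_scores` and `k_nearest_clusters`, the normalization of weights to sum to 1, and the toy matrices are assumptions made for illustration, and the weights are simply passed in rather than estimated from GO-slim annotations as the paper does.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def combine_scores(similarities: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Weighted linear combination of gene-gene similarity matrices.

    Assumption: every matrix is square, covers the same genes in the same
    order, and is already scaled to a common range. Weights are normalized
    to sum to 1 here; the paper's own weighting scheme is not reproduced.
    """
    if len(similarities) != len(weights):
        raise ValueError("need exactly one weight per data source")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(similarities[0])
    combined = [[0.0] * n for _ in range(n)]
    for sim, w in zip(similarities, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += (w / total) * sim[i][j]
    return combined


def k_nearest_clusters(score: Matrix, k: int) -> Dict[int, List[int]]:
    """For each gene, return the gene itself followed by its k highest-scoring neighbors."""
    clusters: Dict[int, List[int]] = {}
    for i, row in enumerate(score):
        others = [j for j in range(len(row)) if j != i]
        others.sort(key=lambda j: row[j], reverse=True)  # most similar first
        clusters[i] = [i] + others[:k]
    return clusters


# Toy data (hypothetical): three genes, two data sources.
expression_sim = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [0.1, 0.2, 1.0]]
sequence_sim = [[1.0, 0.5, 0.3], [0.5, 1.0, 0.8], [0.3, 0.8, 1.0]]

combined = combine_scores([expression_sim, sequence_sim], weights=[3.0, 1.0])
clusters = k_nearest_clusters(combined, k=1)
```

With these toy inputs, each gene gets a two-member cluster (itself plus its single best-scoring neighbor), which mirrors the "one cluster per gene" structure described for K-BS.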
Keywords :
bioinformatics; genetics; Saccharomyces Genome Database; combinatorial optimization; combining multisource information; functional-annotation-based weighting; gene expression; gene function prediction; phenotypic profile; protein sequence; transitive homology; yeast gene ontology-slim process annotations; Bayesian methods; Biology computing; Clustering algorithms; Databases; Fungi; Genomics; Iron; Nearest neighbor searches; Ontologies; Postal services; Proteins; Throughput; Cluster Analysis; Computational Biology; Databases, Genetic; Gene Expression Profiling; Genes, Fungal; Models, Genetic; Oligonucleotide Array Sequence Analysis; Protein Interaction Mapping; Reproducibility of Results; Saccharomyces cerevisiae; Saccharomyces cerevisiae Proteins; Sequence Analysis, Protein
fLanguage :
English
Journal_Title :
Biomedical Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9294
Type :
jour
DOI :
10.1109/TBME.2008.2005955
Filename :
4636709