Title :
A Preliminary Study of Correlation between Depth and Path Length of GO Nodes with Gene Sequence Similarity
Author_Institution :
Univ. of Houston-Clear Lake, Houston
Abstract :
We propose a new measure, SimPLD, for calculating the semantic similarity of terms in the Gene Ontology (GO). The measure is based on the depth of the least common ancestor (LCA) of two terms and the path length between them in the GO hierarchy. The similarity between two genes is then computed by applying this measure to the GO terms annotated to those genes: for a given gene pair, it is the average of the SimPLD values between the GO terms annotated to each of the two genes. We evaluated the method in a series of experiments on large groups of genes and proteins from two genome databases, the Saccharomyces Genome Database (SGD) and Drosophila melanogaster (FlyBase), and on one dataset of human-yeast protein pairs. The experimental results show that the method agrees well with BLAST sequence similarity, suggesting that SimPLD can be used as an automated tool for determining the similarity between genes and proteins.
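The two quantities the abstract names (LCA depth and path length) and the averaging step can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the toy hierarchy and its term names are hypothetical, the path length is taken as the path through the LCA, the average is taken over all cross pairs of terms, and `placeholder_score` is NOT the SimPLD formula, which the abstract does not state. It only stands in for some function of depth and path length.

```python
from collections import deque
from itertools import product

# Hypothetical toy hierarchy (child -> parents); these are not real GO terms.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "B1": ["B"],
}


def ancestor_distances(term):
    """Map each ancestor of `term` (itself included) to its shortest upward distance in edges."""
    dist = {term: 0}
    queue = deque([term])
    while queue:
        node = queue.popleft()
        for parent in PARENTS[node]:
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist


def depth(term):
    """Shortest distance in edges from `term` up to the root."""
    return ancestor_distances(term)["root"]


def lca_depth_and_path_length(t1, t2):
    """Return (depth of the deepest common ancestor, path length between t1 and t2 through it)."""
    d1, d2 = ancestor_distances(t1), ancestor_distances(t2)
    lca = max(set(d1) & set(d2), key=depth)
    return depth(lca), d1[lca] + d2[lca]


def placeholder_score(lca_depth, path_length):
    """Assumed stand-in for the term-level measure; not the published SimPLD formula."""
    total = lca_depth + path_length
    return lca_depth / total if total else 0.0


def gene_similarity(terms1, terms2, score):
    """Average the term-level score over all cross pairs of annotated terms (an assumption)."""
    pairs = list(product(terms1, terms2))
    return sum(score(*lca_depth_and_path_length(a, b)) for a, b in pairs) / len(pairs)
```

For example, `gene_similarity(["A1"], ["A2", "B1"], placeholder_score)` averages the scores of the pairs (A1, A2) and (A1, B1). Passing the scoring function as a parameter keeps the graph computations separate from whichever depth/path-length formula is actually used.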
Keywords :
biology computing; genetics; molecular biophysics; molecular configurations; ontologies (artificial intelligence); proteins; Blast sequence similarity; Drosophila melanogaster; GO nodes; Saccharomyces database; gene ontology; gene sequence similarity; genomes; human-yeast protein pairs; least common ancestor; node path; path length; proteins; semantic similarity; Bioinformatics; Biological control systems; Databases; Genomics; Lakes; Length measurement; Ontologies; Proteins; Testing; Time measurement; Gene Ontology; Least Common Ancestor;
Conference_Title :
Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on
Conference_Location :
Boston, MA
Print_ISBN :
978-1-4244-1509-0
DOI :
10.1109/BIBE.2007.4375665