Title :
Using path length measure for gene clustering based on similarity of annotation terms
Author :
Nagar, Anurag ; Al-Mubaid, Hisham
Author_Institution :
Univ. of Houston-Clear Lake, Houston, TX
Abstract :
The application of semantic similarity measures to gene data, using the Gene Ontology (GO) and gene annotation information, has become more widely used and accepted in bioinformatics in recent years. Applications range from measuring gene similarity to gene clustering. In this paper, we investigate a simple gene similarity measure that relies on the path length between the GO annotation terms of genes. The similarity values computed by the proposed measure for a set of genes are then used to cluster the genes. In the evaluation, we compare the proposed measure with two widely used information-theoretic similarity measures, Resnik and Lin, on three gene datasets. The experimental results and cluster analysis validate the effectiveness of the proposed path length measure.
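The abstract does not give the exact formula, so the following is only a minimal sketch of the general idea of a path-length-based similarity, not the authors' method. It assumes a hypothetical toy ontology (`TOY_PARENTS`, not real GO data), converts a shortest path length d into a similarity as 1 / (1 + d), and averages over all term pairs to score two genes; all three choices are illustrative assumptions.

```python
from collections import deque
from itertools import product

# Hypothetical toy ontology: child term -> list of parent terms.
# These identifiers are made up for illustration and are not real GO terms.
TOY_PARENTS = {
    "T1": ["ROOT"],
    "T2": ["ROOT"],
    "T3": ["T1"],
    "T4": ["T1"],
    "T5": ["T2"],
}


def build_undirected(parents):
    """Turn the child -> parents map into an undirected adjacency map."""
    adj = {}
    for child, parent_list in parents.items():
        for parent in parent_list:
            adj.setdefault(child, set()).add(parent)
            adj.setdefault(parent, set()).add(child)
    return adj


def path_length(adj, a, b):
    """Shortest number of edges between terms a and b (BFS), or None."""
    if a == b:
        return 0
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor == b:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def term_similarity(adj, a, b):
    """Assumed conversion: shorter path means higher similarity, in (0, 1]."""
    dist = path_length(adj, a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def gene_similarity(adj, terms_a, terms_b):
    """Assumed aggregation: mean term similarity over all term pairs."""
    pairs = list(product(terms_a, terms_b))
    if not pairs:
        return 0.0
    return sum(term_similarity(adj, a, b) for a, b in pairs) / len(pairs)


ADJ = build_undirected(TOY_PARENTS)
```

Under these assumptions, calling `gene_similarity(ADJ, ["T3"], ["T4"])` scores two genes annotated with sibling terms; the pairwise scores for a gene set could then be passed to any standard clustering algorithm.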
Keywords :
biology computing; data analysis; genetics; information theory; pattern clustering; annotation terms similarity; bioinformatics; gene annotation information; gene clustering; gene data; gene ontology; information-theoretic similarity measures; path length measure; semantic similarity measures; Bioinformatics; Biology computing; Biomedical measurements; Clustering algorithms; Clustering methods; Lakes; Length measurement; Ontologies; Proteins; Time measurement; Gene clustering; gene similarity
Conference_Title :
Computers and Communications, 2008. ISCC 2008. IEEE Symposium on
Conference_Location :
Marrakech
Print_ISBN :
978-1-4244-2702-4
Electronic_ISBN :
1530-1346
DOI :
10.1109/ISCC.2008.4625765