DocumentCode :
2764904
Title :
Ontology-based functional classification of genes: Evaluation with reference sets and overlap analysis
Author :
Benabderrahmane, Sidahmed ; Devignes, Marie Dominique ; Tabbone, Malika Smail ; Napoli, Amedeo ; Poch, Olivier
Author_Institution :
LORIA, Nancy-Univ., Vandoeuvre-lès-Nancy, France
fYear :
2011
fDate :
12-15 Nov. 2011
Firstpage :
201
Lastpage :
208
Abstract :
Functional classification involves grouping genes according to their molecular functions or the biological processes they participate in. This unsupervised classification task is essential for interpreting gene datasets produced by post-genomic experiments. As the functional annotation of genes is mostly based on the Gene Ontology (GO), many similarity measures using the GO have been described, but few of them have been used for clustering. In this paper, we use reference sets to evaluate the functional classification of genes obtained with our previously described IntelliGO semantic similarity measure. These sets consist of genes taken from human and yeast KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways and Pfam clans. Hierarchical clustering and heatmap visualization are used to illustrate the advantages of IntelliGO over several other measures. Because genes often belong to more than one reference set, the fuzzy C-means clustering algorithm is then applied to the datasets using IntelliGO. The F-score method is used to estimate the quality of clustering and the optimal number of clusters. The results are compared with those obtained from the state-of-the-art DAVID (Database for Annotation, Visualization and Integrated Discovery) functional classification method. Overlap analysis makes it possible to study the matching between clusters and reference sets, and leads us to propose a set-difference method for discovering missing information. The IntelliGO similarity measure, the clustering tool and the reference sets used for evaluation are available at: http://plateforme-mbi.loria.fr/intelligo.
Keywords :
biology computing; data visualisation; genetics; ontologies (artificial intelligence); pattern classification; pattern clustering; F-score method; IntelliGO semantic similarity measure; fuzzy C-means clustering algorithm; gene classification; gene functional annotation; gene ontology; heatmap visualization; hierarchical clustering; ontology-based functional classification; overlap analysis; set-difference method; unsupervised classification task; Algorithm design and analysis; Clustering algorithms; Databases; Heating; Humans; Semantics; Visualization; Gene Ontology; Semantic similarity measure; fuzzy clustering; gene functional classification; overlap analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Bioinformatics and Biomedicine Workshops (BIBMW), 2011 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4577-1612-6
Type :
conf
DOI :
10.1109/BIBMW.2011.6112375
Filename :
6112375