DocumentCode
2400022
Title
Clustering Ontology-enriched Graph Representation for Biomedical Documents based on Scale-Free Network Theory
Author
Yoo, Illhoi ; Hu, Xiaohua
Author_Institution
Coll. of Inf. Sci. & Technol., Drexel Univ., Philadelphia, PA
fYear
2006
fDate
Sept. 2006
Firstpage
851
Lastpage
858
Abstract
In this paper we introduce a novel document clustering approach that solves some major problems of traditional document clustering approaches. Instead of depending on the traditional vector space model, this approach represents documents as graphs using domain knowledge from an ontology, because graphs can represent the semantic relationships among the concepts in documents. Based on scale-free network theory, our approach generates a model for each document cluster from the ontology-enriched graph representation by identifying k high-density subgraphs that capture the core semantic relationship information about each document cluster. Using these k high-density subgraphs, each document is assigned to a proper document cluster. Our extensive experimental results on MEDLINE articles show that our approach outperforms two leading document clustering algorithms, BiSecting K-means and CLUTO's vcluster. Moreover, our approach provides a meaningful explanation for document clustering through the generated models. This explanation helps users to understand clustering results and documents as a whole.
Keywords
complex networks; document handling; medical administrative data processing; network theory (graphs); ontologies (artificial intelligence); pattern clustering; MEDLINE articles; biomedical documents; document clustering; domain knowledge; graph clustering; high density subgraphs; ontology-enriched graph representation; scale-free network theory; semantic relationships; Clustering algorithms; Educational institutions; Engineering profession; Information retrieval; Nearest neighbor searches; Neoplasms; Network theory (graphs); Ontologies; Text mining; Vocabulary; document clustering; graph clustering; ontology; scale-free network;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Systems, 2006 3rd International IEEE Conference on
Conference_Location
London
Print_ISBN
1-4244-01996-8
Electronic_ISBN
1-4244-01996-8
Type
conf
DOI
10.1109/IS.2006.348532
Filename
4155539