• DocumentCode
    2400022
  • Title

    Clustering Ontology-enriched Graph Representation for Biomedical Documents based on Scale-Free Network Theory

  • Author

    Yoo, Illhoi ; Hu, Xiaohua

  • Author_Institution
    Coll. of Inf. Sci. & Technol., Drexel Univ., Philadelphia, PA
  • fYear
    2006
  • fDate
    Sept. 2006
  • Firstpage
    851
  • Lastpage
    858
  • Abstract
    In this paper we introduce a novel document clustering approach that solves some major problems of traditional document clustering approaches. Instead of depending on the traditional vector space model, this approach represents documents as graphs using domain knowledge from an ontology, because graphs can represent the semantic relationships among the concepts in documents. Based on scale-free network theory, our approach generates a model for each document cluster from the ontology-enriched graph representation by identifying k high-density subgraphs that capture the core semantic relationship information about each document cluster. Using these k high-density subgraphs, each document is assigned to a proper document cluster. Our extensive experimental results on MEDLINE articles show that our approach outperforms two leading document clustering algorithms, BiSecting K-means and CLUTO's vcluster. Moreover, our approach provides a meaningful explanation for document clustering through the generated models. This explanation helps users understand the clustering results and the documents as a whole.
  • Keywords
    complex networks; document handling; medical administrative data processing; network theory (graphs); ontologies (artificial intelligence); pattern clustering; MEDLINE articles; biomedical documents; document clustering; domain knowledge; graph clustering; high density subgraphs; ontology-enriched graph representation; scale-free network theory; semantic relationships; Clustering algorithms; Educational institutions; Engineering profession; Information retrieval; Nearest neighbor searches; Neoplasms; Network theory (graphs); Ontologies; Text mining; Vocabulary; document clustering; graph clustering; ontology; scale-free network
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems, 2006 3rd International IEEE Conference on
  • Conference_Location
    London
  • Print_ISBN
    1-4244-01996-8
  • Electronic_ISBN
    1-4244-01996-8
  • Type

    conf

  • DOI
    10.1109/IS.2006.348532
  • Filename
    4155539