• DocumentCode
    430769
  • Title
    Finding related documents via communities in the citation graph
  • Author
    Wanjantuk, Panupong ; Keane, John A.
  • Author_Institution
    Dept. of Comput. Eng., Khon Kaen Univ., Thailand
  • Volume
    1
  • fYear
    2004
  • fDate
    26-29 Oct. 2004
  • Firstpage
    445
  • Abstract
    A body of scientific literature can be represented by a directed graph, commonly referred to as the citation graph. This paper describes an implementation of the random walk graph clustering algorithm to identify communities within the citation graph. Based only on the link structure of the citation graph, the random walk graph clustering algorithm is able to efficiently identify highly topically related communities. The ability to identify community structure in the citation graph has a practical application: communities in the citation graph represent related papers on a single topic. This approach is used to find papers related to a paper of interest by treating the other members of that paper's community as related papers. The performance of the random walk graph clustering algorithm was evaluated against CCIDF, the method for finding related papers used by ResearchIndex. The experimental results show that the random walk graph clustering algorithm performs better than CCIDF.
  • Keywords
    Web sites; citation analysis; data mining; directed graphs; information retrieval; pattern clustering; randomised algorithms; citation graph; community structure; directed graph; performance; random walk graph clustering algorithm; related documents; related papers; scientific literature; topically related communities; Clustering algorithms; Databases; Delay; Graph theory; Indexes; Indexing; Internet; Publishing; Software libraries
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications and Information Technology, 2004. ISCIT 2004. IEEE International Symposium on
  • Print_ISBN
    0-7803-8593-4
  • Type
    conf
  • DOI
    10.1109/ISCIT.2004.1412885
  • Filename
    1412885