• DocumentCode
    1913693
  • Title
    Extracting Domain-Relevant Term Using Wikipedia Based on Random Walk Model
  • Author
    Wu, Wenjuan ; Liu, Tao ; Hu, He ; Du, Xiaoyong
  • Author_Institution
    Key Labs. of Data Eng. & Knowledge Eng., China
  • fYear
    2012
  • fDate
    20-23 Sept. 2012
  • Firstpage
    68
  • Lastpage
    75
  • Abstract
    In this paper, we present a new approach for automatically identifying the concepts and entities relevant to a given domain, using the category and page structures of Wikipedia in a language-independent way. By applying a Markov random walk algorithm to the weighted Wikipedia link graph, our approach can identify large quantities of domain-relevant concepts and entities with very little human effort. Experimental results show that our method achieves high accuracy and acceptable efficiency in domain-relevant term extraction.
  • Keywords
    Markov processes; Web sites; graph theory; information retrieval; Markov random walk algorithm; domain-relevant concept automatic identification; domain-relevant entity automatic identification; domain-relevant term extraction; weighted Wikipedia link graph; Biological system modeling; Electronic publishing; Encyclopedias; Internet; Ontologies; Semantics; Domain-relevant Concepts; Link Graph; Markov Chain; Random Walk; Wikipedia;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    ChinaGrid Annual Conference (ChinaGrid), 2012 Seventh
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4673-2623-0
  • Electronic_ISBN
    978-0-7695-4816-6
  • Type
    conf
  • DOI
    10.1109/ChinaGrid.2012.20
  • Filename
    6337278
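
The abstract describes applying a Markov random walk to a weighted Wikipedia link graph but gives no algorithmic detail. The following is only a minimal sketch of one common variant (a random walk with restart from seed pages), not the authors' method: the function name `random_walk_scores`, the restart probability, the edge weights, and the toy graph are all assumptions made for illustration.

```python
def random_walk_scores(edges, seeds, restart=0.15, iterations=50):
    """Score nodes by a random walk with restart on a weighted directed graph.

    edges: dict mapping node -> {neighbor: weight} (hypothetical link weights)
    seeds: set of nodes assumed to be known domain-relevant pages
    """
    # Collect every node that appears as a source or a target.
    nodes = set(edges)
    for neighbors in edges.values():
        nodes.update(neighbors)

    # The walker restarts uniformly over the seed nodes.
    seed_mass = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(seed_mass)

    for _ in range(iterations):
        nxt = {n: restart * seed_mass[n] for n in nodes}
        for src in nodes:
            out = edges.get(src, {})
            total = sum(out.values())
            if total == 0:
                # Dangling node: send its mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * scores[src] * seed_mass[n]
                continue
            # Follow each outgoing link with probability proportional to its weight.
            for dst, weight in out.items():
                nxt[dst] += (1 - restart) * scores[src] * weight / total
        scores = nxt
    return scores


# Toy link graph with made-up weights; "Databases" is the assumed seed page.
toy_edges = {
    "Databases": {"SQL": 2.0, "Relational model": 1.0},
    "SQL": {"Databases": 1.0},
    "Relational model": {"Databases": 1.0, "Football": 0.1},
    "Football": {},
}
scores = random_walk_scores(toy_edges, seeds={"Databases"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this sketch, pages that are strongly linked from the seed accumulate more probability mass than weakly linked ones, so a threshold on `scores` could serve as a rough domain-relevance filter. How the paper actually weights links or selects seeds is not stated in this record.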