• DocumentCode
    2895866
  • Title
    Measuring Semantic Relatedness Using Wikipedia Revision Information in a Signed Network
  • Author
    Yang, Wen-Teng ; Kao, Hung-Yu
  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ. Tainan, Tainan, Taiwan
  • fYear
    2011
  • fDate
    11-13 Nov. 2011
  • Firstpage
    69
  • Lastpage
    74
  • Abstract
    Identifying the semantic relatedness of two words is an important task in information retrieval, natural language processing, and text mining. However, because a word can carry many meanings, the semantic relatedness of two words is still hard to evaluate precisely from limited corpora. Wikipedia is now a huge wiki-based encyclopedia on the Internet and has become a valuable resource for research. Wikipedia articles, written collaboratively by user editors, contain a high volume of reference links, URL identification for concepts, and a complete revision history. Moreover, each Wikipedia article represents an individual concept and also contains other concepts, in the form of hyperlinks to other articles embedded in its content. We therefore believe that the semantic relatedness between two words can be found through the semantic relatedness between two Wikipedia articles. We propose an Editor-Contribution-based Rank (ECR) algorithm that ranks the concepts in an article's content across all of its revisions and takes the ranked concepts as a vector representing the article. We classify four types of relationship, in which the acts of addition and deletion map to appropriate and inappropriate concepts. ECR ranks those concepts according to the mutual signed-reinforcement relationship between the concepts and the editors. The results show that our method yields a marked performance improvement, increasing the correlation coefficient by 4% to 23% over previous methods that calculate the relatedness between two articles.
  • Keywords
    Web sites; semantic Web; URL identification; Wikipedia revision information; editor contribution based rank algorithm; information retrieval; mutual signed reinforcement relationship; natural language processing; signed network; text mining; wiki based encyclopedia; Correlation; Electronic publishing; Encyclopedias; Internet; Semantics; Vectors; HITS; Semantic relatedness; Wikipedia;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Technologies and Applications of Artificial Intelligence (TAAI), 2011 International Conference on
  • Conference_Location
    Chung-Li
  • Print_ISBN
    978-1-4577-2174-8
  • Type
    conf
  • DOI
    10.1109/TAAI.2011.20
  • Filename
    6120722