  • DocumentCode
    2505644
  • Title
    Extracting Named Entities and Synonyms from Wikipedia
  • Author
    Bøhn, Christian ; Nørvåg, Kjetil
  • Author_Institution
    Dept. of Comput. Sci., Norwegian Univ. of Sci. & Technol., Trondheim, Norway
  • fYear
    2010
  • fDate
    20-23 April 2010
  • Firstpage
    1300
  • Lastpage
    1307
  • Abstract
    In many search domains, both content and searches are frequently tied to named entities such as a person or a company. An example of such a domain is a news archive. One challenge from an information retrieval point of view is that a single entity can be referred to in more than one way. In this paper we describe how to use Wikipedia contents to automatically generate a dictionary of named entities and their synonyms, where all synonyms refer to the same entity. This dictionary can subsequently be used to improve search quality, for example through query expansion. Through an experimental evaluation we show that our approach finds named entities and their synonyms with a high degree of accuracy.
  • Keywords
    knowledge acquisition; query processing; Wikipedia contents; information retrieval; named entity extraction; query expansion; search quality; Application software; Computer science; Data mining; Dictionaries; Electronic mail; Filters; Information retrieval; Search engines; Text recognition; Wikipedia
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Information Networking and Applications (AINA), 2010 24th IEEE International Conference on
  • Conference_Location
    Perth, WA
  • ISSN
    1550-445X
  • Print_ISBN
    978-1-4244-6695-5
  • Type
    conf
  • DOI
    10.1109/AINA.2010.50
  • Filename
    5474864