• DocumentCode
    3466432
  • Title

    Integrating Semantic Knowledge into Text Similarity and Information Retrieval

  • Author

    Müller, Christof ; Gurevych, Iryna ; Mühlhäuser, Max

  • Author_Institution
    Darmstadt Univ. of Technol., Darmstadt
  • fYear
    2007
  • fDate
    17-19 Sept. 2007
  • Firstpage
    257
  • Lastpage
    264
  • Abstract
    This paper studies the influence of lexical semantic knowledge upon two related tasks: ad-hoc information retrieval and text similarity. For this purpose, we compare the performance of two algorithms: (i) using semantic relatedness, and (ii) using a conventional extended Boolean model [12]. For the evaluation, we use two different test collections in the German language: (i) GIRT [5] for the information retrieval task, and (ii) a collection of descriptions of professions built to evaluate a system for electronic career guidance in the information retrieval and text similarity task. We found that integrating lexical semantic knowledge improves performance for both tasks. On the GIRT corpus, the performance is improved only for short queries. The performance on the collection of professional descriptions is improved, but crucially depends on the preprocessing of natural language essays employed as topics.
  • Keywords
    computational linguistics; information retrieval; natural languages; text analysis; Boolean model; GIRT corpus; German language; ad-hoc information retrieval; lexical semantic knowledge; natural language essays; semantic relatedness; text similarity; Electronic equipment testing; Engineering profession; Information retrieval; Natural languages; Pervasive computing; Strontium; System testing; Thesauri; Vocabulary; Writing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantic Computing, 2007. ICSC 2007. International Conference on
  • Conference_Location
    Irvine, CA
  • Print_ISBN
    978-0-7695-2997-4
  • Type

    conf

  • DOI
    10.1109/ICSC.2007.12
  • Filename
    4338357