DocumentCode
3466432
Title
Integrating Semantic Knowledge into Text Similarity and Information Retrieval
Author
Müller, Christof ; Gurevych, Iryna ; Mühlhäuser, Max
Author_Institution
Darmstadt Univ. of Technol., Darmstadt
fYear
2007
fDate
17-19 Sept. 2007
Firstpage
257
Lastpage
264
Abstract
This paper studies the influence of lexical semantic knowledge upon two related tasks: ad-hoc information retrieval and text similarity. For this purpose, we compare the performance of two algorithms: (i) using semantic relatedness, and (ii) using a conventional extended Boolean model [12]. For the evaluation, we use two different test collections in the German language: (i) GIRT [5] for the information retrieval task, and (ii) a collection of descriptions of professions built to evaluate a system for electronic career guidance in the information retrieval and text similarity task. We found that integrating lexical semantic knowledge improves performance for both tasks. On the GIRT corpus, the performance is improved only for short queries. The performance on the collection of professional descriptions is improved, but crucially depends on the preprocessing of natural language essays employed as topics.
Keywords
computational linguistics; information retrieval; natural languages; text analysis; Boolean model; GIRT corpus; German language; ad-hoc information retrieval; lexical semantic knowledge; natural language essays; semantic relatedness; text similarity; Electronic equipment testing; Engineering profession; Information retrieval; Natural languages; Pervasive computing; Strontium; System testing; Thesauri; Vocabulary; Writing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Semantic Computing, 2007. ICSC 2007. International Conference on
Conference_Location
Irvine, CA
Print_ISBN
978-0-7695-2997-4
Type
conf
DOI
10.1109/ICSC.2007.12
Filename
4338357