DocumentCode
3188902
Title
Textual similarities based on a distributional approach
Author
Besançon, Romaric ; Rajman, Martin ; Chappelier, Jean-Cédric
Author_Institution
Artificial Intelligence Lab., Fed. Inst. of Technol., Lausanne, Switzerland
fYear
1999
fDate
1999
Firstpage
180
Lastpage
184
Abstract
The design of efficient textual similarities is an important issue in textual data exploration. Textual similarities are central, for example, to structuring document collections (e.g. clustering) and to information retrieval (IR), which relies on computing textual similarities to measure how well documents match a query. The objective of this paper is to present and compare several textual similarity measures within the distributional semantics (DS) model for IR. This model extends the standard vector space model by also taking into account the co-frequencies between terms in a given reference corpus. These co-frequencies are considered to provide a distributional representation of the “semantics” of the terms. The resulting co-occurrence profiles are used to represent the documents as vectors. Practical retrieval experiments using DS-based similarity models were conducted in the framework of the AMARYLLIS evaluation campaign. The results obtained are presented; they indicate a significant improvement in performance compared with the standard approach.
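The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy reference corpus, the choice of counting co-occurrences per text unit, the frequency-weighted sum of profiles, and the use of cosine as the similarity measure are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cooccurrence_profiles(corpus):
    """Map each term to a Counter of the terms it co-occurs with.

    Assumption: two terms co-occur when they appear in the same unit
    (here, one tokenized text of the reference corpus).
    """
    profiles = {}
    for unit in corpus:
        terms = sorted(set(unit))
        for term in terms:
            profiles.setdefault(term, Counter())
        for a, b in combinations(terms, 2):
            profiles[a][b] += 1
            profiles[b][a] += 1
    return profiles


def ds_vector(tokens, profiles):
    """Represent a text as the frequency-weighted sum of its terms' profiles.

    Assumption: this weighting is illustrative, not taken from the paper.
    """
    vec = Counter()
    for term, freq in Counter(tokens).items():
        for other, cofreq in profiles.get(term, {}).items():
            vec[other] += freq * cofreq
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy reference corpus, already tokenized.
reference_corpus = [
    ["car", "engine", "road"],
    ["automobile", "engine", "road"],
    ["bank", "money", "loan"],
]
profiles = cooccurrence_profiles(reference_corpus)

query = ["car"]
doc_a = ["automobile"]
doc_b = ["bank"]

# Plain bag-of-words vectors share no term, so their cosine is 0.
bow_sim = cosine(Counter(query), Counter(doc_a))
# DS vectors overlap through shared co-occurring terms (engine, road).
ds_sim_a = cosine(ds_vector(query, profiles), ds_vector(doc_a, profiles))
ds_sim_b = cosine(ds_vector(query, profiles), ds_vector(doc_b, profiles))
```

In this toy setting the query and doc_a have no term in common, yet their DS vectors coincide because "car" and "automobile" co-occur with the same terms in the reference corpus; doc_b remains dissimilar.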
Keywords
information retrieval; pattern clustering; text analysis; AMARYLLIS evaluation campaign; DS model; IR; clustering; co-frequencies; distributional approach; distributional representation; distributional semantics model; document collection structuring; efficient textual similarity design; textual data exploration; vector space model; Artificial intelligence; Databases; Indexing; Laboratories; Measurement standards; Optical computing; Standards development
fLanguage
English
Publisher
IEEE
Conference_Titel
Database and Expert Systems Applications, 1999. Proceedings. Tenth International Workshop on
Conference_Location
Florence
Print_ISBN
0-7695-0281-4
Type
conf
DOI
10.1109/DEXA.1999.795163
Filename
795163