DocumentCode :
3426599
Title :
Relevance weighting of multi-term queries for Vector Space Model
Author :
Wang, Louis S.
Author_Institution :
Int. Sch. of Minnesota, Eden Prairie, MN
fYear :
2009
fDate :
March 30 2009-April 2 2009
Firstpage :
396
Lastpage :
402
Abstract :
The vector space model is one of the most common information retrieval (IR) methods for text document search. Similarity for query matching is commonly measured by the cosine of the angle, or the Euclidean distance, between the query vector and each document vector. Although the vector space model starts from a term-by-document matrix, that representation discards information about how the query terms relate to one another within a document. This paper presents a modified vector space model for measuring the similarity between a query and a document when responding to a multi-term query. Additional weight is assigned to the query keywords based on the adjacency of those terms in the document. Thus, when a document contains the query terms adjacent to one another, its vector will typically move closer to the query vector, indicating stronger relevance between the query and the document.
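The idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual weighting formula, so everything specific here is an assumption made for illustration: raw term counts as weights, whitespace tokenization, an additive `boost` parameter, and the helper names `adjacency_count` and `boosted_vector`.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def term_vector(tokens, vocab):
    """Raw term-count vector over a fixed vocabulary order."""
    counts = Counter(tokens)
    return [float(counts[term]) for term in vocab]


def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def adjacency_count(doc_tokens, query_tokens):
    """Count places where two consecutive query terms appear side by side,
    in the same order, in the document."""
    query_pairs = set(zip(query_tokens, query_tokens[1:]))
    return sum(1 for pair in zip(doc_tokens, doc_tokens[1:]) if pair in query_pairs)


def boosted_vector(doc_tokens, query_tokens, vocab, boost=1.0):
    """Term-count vector with extra weight on query terms that occur in the
    document, scaled by the adjacency count.
    ASSUMPTION: the additive `boost * adjacency` rule is illustrative only;
    it is not taken from the paper."""
    base = term_vector(doc_tokens, vocab)
    extra = boost * adjacency_count(doc_tokens, query_tokens)
    query_terms = set(query_tokens)
    return [
        weight + extra if (term in query_terms and weight > 0) else weight
        for term, weight in zip(vocab, base)
    ]


query = tokenize("vector space")
doc_a = tokenize("the vector space model ranks text")        # terms adjacent
doc_b = tokenize("space probes carry a vector of sensors")   # terms apart
vocab = sorted(set(query) | set(doc_a) | set(doc_b))
query_vec = term_vector(query, vocab)

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    plain = cosine(query_vec, term_vector(doc, vocab))
    boosted = cosine(query_vec, boosted_vector(doc, query, vocab))
    print(name, round(plain, 3), round(boosted, 3))
```

In this toy setup the document that contains the query terms side by side has its cosine similarity raised by the boost, while the document that contains the same terms apart is left unchanged.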
Keywords :
information retrieval; matrix algebra; text analysis; vectors; Euclidean distance; document vector; multiterm queries; query matching; query vector; term-by-document matrix; text document search; vector space model; Databases; Extraterrestrial measurements; Impedance; Indexing; Information systems; Logic; Mathematical model; Web sites
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence and Data Mining, 2009. CIDM '09. IEEE Symposium on
Conference_Location :
Nashville, TN
Print_ISBN :
978-1-4244-2765-9
Type :
conf
DOI :
10.1109/CIDM.2009.4938677
Filename :
4938677