DocumentCode :
2430180
Title :
Long-term relevance feedback using simple PCA and linear transformation
Author :
Tai, Xiaoying ; Ren, Fuji ; Kita, Kenji
Author_Institution :
Fac. of Eng., Tokushima Univ., Japan
fYear :
2002
fDate :
2-6 Sept. 2002
Firstpage :
261
Lastpage :
265
Abstract :
This paper proposes a new method for improving the information retrieval performance of the vector space model (VSM), in part by preserving user-supplied relevance information in the system over the long term. The proposed method incorporates user relevance feedback information and original document similarity information into a retrieval model that is built using a sequence of linear transformations. High-dimensional, sparse vectors are mapped into a low-dimensional vector space, namely the space representing the latent semantic meanings of words, by using SPCA (simple principal component analysis). An experimental information retrieval system based on the proposed method has been built, and experiments have been carried out on the Medline and Cranfield collections. Compared with the LSI (latent semantic indexing) model, average precision improves by 6.80% (Medline) and 67.46% (Cranfield) on the two training data sets, and by 4.71% (Medline) and 8.12% (Cranfield) on the test data. The experimental results show that the proposed method achieves better retrieval performance and makes it possible to preserve user-supplied relevance information in the system over the long term for later use.
Keywords :
indexing; principal component analysis; relevance feedback; Cranfield collection; Medline collection; average precision; document similarity information; high-dimensional vectors; information retrieval performance; latent semantic indexing model; latent semantic meanings; linear transformation; long-term relevance feedback; low-dimensional vector space; simple principal component analysis; sparse vectors; training data sets; user-supplied relevance information; vector space model; Feedback; Functional analysis; Indexing; Information retrieval; Large scale integration; Multidimensional systems; Principal component analysis; Testing; Training data; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Database and Expert Systems Applications, 2002. Proceedings. 13th International Workshop on
ISSN :
1529-4188
Print_ISBN :
0-7695-1668-8
Type :
conf
DOI :
10.1109/DEXA.2002.1045909
Filename :
1045909