Title :
Index structures for information filtering under the vector space model
Author :
Yan, Tak W. ; Garcia-Molina, Hector
Author_Institution :
Dept. of Comput. Sci., Stanford Univ., CA, USA
Abstract :
The authors study which data structures and algorithms can efficiently perform large-scale information filtering under the vector space model, a retrieval model established as effective. They apply the standard inverted index to index user profiles. They also devise an alternative to the standard inverted index in which, instead of indexing every term in a profile, only the significant terms are selected for indexing. They evaluate the performance of these indexing methods and show that they require orders of magnitude fewer I/Os to process a document than when no index is used. They also show that the proposed alternative performs better in terms of I/O and CPU processing time in many cases.
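As a rough illustration of the two ideas named in the abstract, the following Python sketch builds an inverted index over user profiles and a reduced index over only the heavier profile terms. It is not the authors' algorithm: the function names, the sample profiles, the cosine threshold of 0.5, and in particular the rule for deciding which terms count as "significant" are assumptions made for this example. The rule used here leaves out the lightest terms of a unit-length profile as long as the Euclidean norm of the omitted part stays below the threshold, so that (by the Cauchy-Schwarz inequality) a unit-length document sharing none of the indexed terms cannot reach the threshold.

```python
import math
from collections import defaultdict


def normalize(vec):
    """Scale a term -> weight dict to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_full_index(profiles):
    """Standard inverted index: every profile term gets a posting."""
    index = defaultdict(list)
    for pid, vec in profiles.items():
        for term, weight in vec.items():
            index[term].append((pid, weight))
    return index


def filter_full(index, doc, threshold):
    """Accumulate dot products by walking the postings of each document term."""
    scores = defaultdict(float)
    for term, dw in doc.items():
        for pid, pw in index.get(term, ()):
            scores[pid] += dw * pw
    return {pid for pid, s in scores.items() if s >= threshold}


def build_selective_index(profiles, threshold):
    """Index only the heavier terms of each profile (assumed selection rule).

    Non-negative weights and unit-length profiles are assumed. Terms are
    skipped, lightest first, while the squared norm of the skipped part
    stays below threshold squared; all remaining terms are indexed.
    """
    index = defaultdict(list)
    for pid, vec in profiles.items():
        omitted_sq = 0.0
        for term, weight in sorted(vec.items(), key=lambda kv: kv[1]):
            if omitted_sq + weight * weight < threshold * threshold:
                omitted_sq += weight * weight
            else:
                index[term].append(pid)
    return index


def filter_selective(index, profiles, doc, threshold):
    """Use the reduced index to find candidates, then score them in full."""
    candidates = set()
    for term in doc:
        candidates.update(index.get(term, ()))
    matched = set()
    for pid in candidates:
        score = sum(w * doc.get(t, 0.0) for t, w in profiles[pid].items())
        if score >= threshold:
            matched.add(pid)
    return matched


# Hypothetical sample data, chosen only to exercise the functions above.
profiles = {
    "p1": normalize({"database": 3, "index": 2, "filter": 1}),
    "p2": normalize({"music": 2, "jazz": 2, "index": 0.5}),
}
doc = normalize({"index": 2, "database": 1, "query": 1})
threshold = 0.5

full_index = build_full_index(profiles)
selective_index = build_selective_index(profiles, threshold)
print(filter_full(full_index, doc, threshold))                      # {'p1'}
print(filter_selective(selective_index, profiles, doc, threshold))  # {'p1'}
```

In this sample the reduced index holds four postings instead of six, and the document never touches profile p2 at all, which hints at how a smaller index could reduce work per document. Whether that saving holds in practice depends on the profile and document distributions, which the sketch does not model.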
Keywords :
data structures; database management systems; database theory; information retrieval; index user profiles; information filtering; retrieval model; standard inverted index; vector space model; Computer science; Data structures; Databases; Filtering algorithms; Information filtering; Information filters; Information retrieval; Information systems; Large-scale systems; Network servers;
Conference_Title :
Data Engineering, 1994. Proceedings. 10th International Conference
Conference_Location :
Houston, TX
Print_ISBN :
0-8186-5402-3
DOI :
10.1109/ICDE.1994.283049