DocumentCode :
2640211
Title :
Tree indexing for efficient search of similar documents
Author :
Chen, Chung-Min ; Liu, Duen-Ren
Author_Institution :
Telcordia Technol., Morristown, NJ, USA
fYear :
2000
fDate :
2000
Firstpage :
210
Lastpage :
211
Abstract :
Linear algebra-based techniques have long been used to correlate similar documents. They map the documents to a multidimensional vector space, in which each document is represented by a vector. Searching for related documents then translates into searching for nearest neighbors in the vector space. We propose an indexing structure, called the cosine R-tree, which indexes a multidimensional vector space and provides efficient nearest neighbor search. Our preliminary results show that it gives better performance than a brute-force linear scan strategy.
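To make the baseline in the abstract concrete, the following is a minimal sketch of the brute-force linear scan that the cosine R-tree is compared against. It is not the authors' code and does not implement the cosine R-tree itself; the function names, the dictionary layout, and the use of cosine similarity as the ranking measure are assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a zero vector is treated as unrelated to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def linear_scan_nearest(query, documents, k=1):
    """Return the ids of the k documents most similar to the query.

    `documents` is a hypothetical mapping from document id to vector.
    Every vector is compared with the query, so the cost grows linearly
    with the number of documents.
    """
    scored = [
        (cosine_similarity(query, vec), doc_id)
        for doc_id, vec in documents.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


documents = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.0, 1.0, 0.0],
    "c": [1.0, 1.0, 0.0],
}
nearest = linear_scan_nearest([2.0, 0.1, 0.0], documents, k=2)
```

An index structure such as the one proposed aims to answer the same query without visiting every vector.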
Keywords :
database theory; indexing; information retrieval; search problems; cosine R-tree; efficient search; linear algebra-based techniques; linear scan; multidimensional vector space; nearest neighbor search; nearest neighbors; related documents; similar documents; tree indexing; vector space approach; Euclidean distance; Indexing; Information management; Information retrieval; Multidimensional systems; Nearest neighbor searches; Space technology; Vectors
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Software and Applications Conference, 2000. COMPSAC 2000. The 24th Annual International
Conference_Location :
Taipei
ISSN :
0730-3157
Print_ISBN :
0-7695-0792-1
Type :
conf
DOI :
10.1109/CMPSAC.2000.884720
Filename :
884720