DocumentCode
3627244
Title
Vector model improvement using suffix trees
Author
Jan Martinovic;Tomas Novosad;Vaclav Snasel
Author_Institution
Faculty of Electrical Engineering and Computer Science, VŠB - Technical University of Ostrava, The Czech Republic
Volume
1
fYear
2007
Firstpage
180
Lastpage
187
Abstract
There are many ways to search for documents in document collections. These methods rely on Boolean, vector, probabilistic, and other models to represent documents, queries, and the rules and procedures that determine the correspondence between user requests and documents. Each of these models has several restrictions, which prevent a user from finding all relevant documents: the system returns many irrelevant documents, and some relevant documents are missed entirely. This article proposes a new method that uses suffix trees to improve vector queries. The method treats a document as a set of phrases (sentences), not just as a set of words. A sentence has a specific semantic meaning because its words are ordered. This is an advantage over treating a document as a bag of words.
Keywords
"Computer science","Information retrieval","Clustering methods","Couplings","Indexing","Internet"
Publisher
ieee
Conference_Titel
Digital Information Management, 2007. ICDIM '07. 2nd International Conference on
Print_ISBN
978-1-4244-1475-8
Type
conf
DOI
10.1109/ICDIM.2007.4444220
Filename
4444220