• DocumentCode
    3627244
  • Title

    Vector model improvement using suffix trees

  • Author

    Jan Martinovic;Tomas Novosad;Vaclav Snasel

  • Author_Institution
    Faculty of Electrical Engineering and Computer Science, VŠB - Technical University of Ostrava, Czech Republic
  • Volume
    1
  • fYear
    2007
  • Firstpage
    180
  • Lastpage
    187
  • Abstract
    There are many ways to search for documents in document collections. These methods rely on Boolean, vector, probabilistic, and other models to represent documents, queries, and the rules and procedures that determine the correspondence between user requests and documents. Each of these models has several restrictions, which prevent a user from finding all relevant documents: the system returns many irrelevant documents, and some relevant documents are missing altogether. This article proposes a new method that uses suffix trees to improve vector queries. The method treats a document as a set of phrases (sentences), not just as a set of words. A sentence has a specific semantic meaning because its words are ordered. This is an advantage over treating a document as a bag of words.
  • Keywords
    "Computer science","Information retrieval","Clustering methods","Couplings","Indexing","Internet"
  • Publisher
    ieee
  • Conference_Titel
    Digital Information Management, 2007. ICDIM '07. 2nd International Conference on
  • Print_ISBN
    978-1-4244-1475-8
  • Type

    conf

  • DOI
    10.1109/ICDIM.2007.4444220
  • Filename
    4444220