• DocumentCode
    2731501
  • Title

    SPRITE: A Learning-Based Text Retrieval System in DHT Networks

  • Author

    Yingguang Li; H. V. Jagadish; Kian-Lee Tan

  • Author_Institution
    National Univ. of Singapore, Singapore
  • fYear
    2007
  • fDate
    15-20 April 2007
  • Firstpage
    1106
  • Lastpage
    1115
  • Abstract
    In this paper, we propose SPRITE (selective progressive index tuning by examples), a scalable system for text retrieval in a structured P2P network. Under SPRITE, each peer is responsible for a certain number of terms. For each document, however, SPRITE learns from past queries to select only a small set of representative terms for indexing, and these terms are progressively refined with subsequent queries. We implemented the proposed strategy and compared its retrieval effectiveness, in terms of both precision and recall, against a static scheme (without learning) and a centralized system (the ideal case). Our experimental results show that SPRITE is nearly as effective as the centralized system and considerably outperforms the static scheme.
  • Keywords
    indexing; information retrieval; peer-to-peer computing; text analysis; DHT network; SPRITE; learning-based text retrieval system; progressive index tuning by examples; queries; retrieval effectiveness; structured P2P network; Bandwidth; Costs; Floods; Indexing; Large-scale systems; Peer to peer computing; Probes; Routing; Sprites (computer)
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
  • Conference_Location
    Istanbul
  • Print_ISBN
    1-4244-0802-4
  • Type

    conf

  • DOI
    10.1109/ICDE.2007.368969
  • Filename
    4221759