• DocumentCode
    1550726
  • Title

    Using Evolutive Summary Counters for Efficient Cooperative Caching in Search Engines

  • Author

    Dominguez-Sal, David ; Aguilar-Saborit, Josep ; Surdeanu, Mihai ; Larriba-Pey, Josep Lluis

  • Author_Institution
    DAMA, UPC (Univ. Politec. de Catalunya), Barcelona, Spain
  • Volume
    23
  • Issue
    4
  • fYear
    2012
  • fDate
    4/1/2012
  • Firstpage
    776
  • Lastpage
    784
  • Abstract
    We propose and analyze a distributed cooperative caching strategy based on the Evolutive Summary Counters (ESC), a new data structure that stores an approximate record of the data accesses in each computing node of a search engine. The ESC capture the frequency of accesses to the elements of a data collection, and the evolution of the access patterns for each node in a network of computers. The ESC can be efficiently summarized into what we call ESC-summaries to obtain approximate statistics of the document entries accessed by each computing node. We use the ESC-summaries to introduce two algorithms that manage our distributed caching strategy: one for the distribution of the cache contents, ESC-placement, and another for the search of documents in the distributed cache, ESC-search. While the former improves the hit rate of the system and keeps a large ratio of data accesses local, the latter reduces the network traffic by restricting the number of nodes queried to find a document. We show that our cooperative caching approach outperforms state-of-the-art models in hit rate, throughput, and location recall for multiple scenarios, i.e., different query distributions and systems with varying degrees of complexity.
  • Keywords
    approximation theory; cache storage; distributed processing; evolutionary computation; search engines; statistics; ESC-search; ESC-summaries; approximate statistics; complexity degree; data access; distributed cooperative caching strategy; evolutive summary counters; query distribution; search engines; Cooperative caching; Data structures; Peer to peer computing; Proposals; Radiation detectors; Search engines; Distributed systems; count filter; distributed caching; resource intensive applications;
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/TPDS.2011.162
  • Filename
    5871600
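
The abstract describes per-node counters that approximate how often each document is accessed and how that pattern changes over time, and a search step that queries only the nodes likely to hold a document. The record gives no implementation details, so the following is only a minimal sketch of that general idea under stated assumptions: it uses a generic count-min-style filter with periodic halving of the counters, which is not necessarily how the ESC or ESC-search work. The names `DecayingCountFilter`, `record_access`, `estimate`, `age`, and `best_node` are hypothetical and do not come from the paper.

```python
import hashlib


class DecayingCountFilter:
    """Approximate per-node access counter.

    Hypothetical illustration only: a count-min-style filter with aging,
    NOT the ESC structure defined in the paper.
    """

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        # One row of counters per hash function.
        self.rows = [[0] * width for _ in range(depth)]

    def _slots(self, doc_id):
        # Derive one counter position per row from a salted SHA-256 digest.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{doc_id}".encode("utf-8")).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def record_access(self, doc_id):
        for row, col in self._slots(doc_id):
            self.rows[row][col] += 1

    def estimate(self, doc_id):
        # The minimum over rows never underestimates the counted accesses
        # (collisions can only inflate a counter).
        return min(self.rows[row][col] for row, col in self._slots(doc_id))

    def age(self):
        # Halve every counter so older accesses weigh less than recent ones.
        for row in self.rows:
            for col in range(self.width):
                row[col] //= 2


def best_node(filters, doc_id):
    """Return the node whose filter reports the highest estimate for doc_id."""
    return max(filters, key=lambda node: filters[node].estimate(doc_id))
```

In this sketch, each node would keep one filter, call `record_access` on every document request, and call `age` at a fixed interval. A requester holding copies of the other nodes' filters could then use `best_node` to contact a single likely holder instead of broadcasting the lookup.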