DocumentCode :
1550726
Title :
Using Evolutive Summary Counters for Efficient Cooperative Caching in Search Engines
Author :
Dominguez-Sal, David ; Aguilar-Saborit, Josep ; Surdeanu, Mihai ; Larriba-Pey, Josep Lluis
Author_Institution :
DAMA, UPC (Univ. Politec. de Catalunya), Barcelona, Spain
Volume :
23
Issue :
4
fYear :
2012
fDate :
4/1/2012 12:00:00 AM
Firstpage :
776
Lastpage :
784
Abstract :
We propose and analyze a distributed cooperative caching strategy based on the Evolutive Summary Counters (ESC), a new data structure that stores an approximate record of the data accesses in each computing node of a search engine. The ESC capture the frequency of accesses to the elements of a data collection and the evolution of the access patterns for each node in a network of computers. The ESC can be efficiently summarized into what we call ESC-summaries to obtain approximate statistics of the document entries accessed by each computing node. We use the ESC-summaries to introduce two algorithms that manage our distributed caching strategy: one for the distribution of the cache contents, ESC-placement, and another for the search of documents in the distributed cache, ESC-search. While the former improves the hit rate of the system and keeps a large ratio of data accesses local, the latter reduces the network traffic by restricting the number of nodes queried to find a document. We show that our cooperative caching approach outperforms state-of-the-art models in hit rate, throughput, and location recall for multiple scenarios, i.e., different query distributions and systems with varying degrees of complexity.
Keywords :
approximation theory; cache storage; distributed processing; evolutionary computation; search engines; statistics; ESC-search; ESC-summaries; approximate statistics; complexity degree; data access; distributed cooperative caching strategy; evolutive summary counters; query distribution; search engines; Cooperative caching; Data structures; Peer to peer computing; Proposals; Radiation detectors; Search engines; Distributed systems; count filter; distributed caching; resource intensive applications;
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/TPDS.2011.162
Filename :
5871600