DocumentCode :
475874
Title :
A Hierarchical Cache Scheme for the Large-scale Web Search Engine
Author :
Lim, Sungchae ; Ahn, Joonseon
Author_Institution :
Dongduk Women's Univ., Seoul
fYear :
2008
fDate :
6-8 Aug. 2008
Firstpage :
925
Lastpage :
930
Abstract :
Over the past decade, much research has been done to solve technical challenges regarding the Web search engine, such as crawling Web documents, high performance indexes, and ranking systems using hyperlink analysis. However, implementation details of its query processing system are rarely dealt with in the literature. In this paper we present a distributed architecture for the query processing system and its hierarchical cache scheme. Our paper is based on the development experience of a commercial Web search engine designed to answer 5 million user queries against over 6.5 million Web pages per day. Using the hierarchical cache scheme, we keep a portion of query results in multi-level caches so that excessive I/O or CPU time is not used for query processing. With that scheme, it is possible to reduce around 70% of the server costs.
Keywords :
Internet; cache storage; query processing; search engines; Web document crawling; distributed architecture; hierarchical cache scheme; hyperlink analysis; large-scale Web search engine; query processing system; ranking system; Costs; Internet; Large-scale systems; Performance analysis; Performance evaluation; Query processing; Search engines; Uniform resource locators; Web search; Web server; large-scale cache; search engine;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, 2008. SNPD '08. Ninth ACIS International Conference on
Conference_Location :
Phuket
Print_ISBN :
978-0-7695-3263-9
Type :
conf
DOI :
10.1109/SNPD.2008.107
Filename :
4617487