Abstract:
In the dynamic ocean of web data, with over 200 million websites, web search engines are the primary means of accessing content. As the data volume is on the order of petabytes, current search engines are very large centralized systems built on replicated clusters, easily indexing more than 100 billion web pages. At the same time, there are more than two billion Internet users, and hundreds of millions of queries are issued each day. In the near future, centralized systems are likely to become less effective against such a combined data and query load, suggesting the need for fully distributed search engines. Such engines must maintain high-quality answers, fast response times, high query throughput, high availability, and scalability, in spite of network latency and geographically scattered data. In this talk we present the main challenges behind the design of a distributed web retrieval system and our research on all the components of a search engine: crawling, indexing, and query processing.