DocumentCode :
2687565
Title :
Improving the Efficiency of a Genre-Aware Approach to Focused Crawling Based on Link Context
Author :
Mangaravite, Vitor ; Assis, G.T. ; Ferreira, Anderson A.
Author_Institution :
Comput. Sci. Dept., Fed. Univ. of Ouro Preto, Ouro Preto, Brazil
fYear :
2012
fDate :
25-27 Oct. 2012
Firstpage :
17
Lastpage :
23
Abstract :
Focused crawlers attempt to crawl web pages that are relevant to a specific topic or user interest. Although these crawlers have proven effective, their efficiency still needs to improve. Focused crawlers usually use a frontier of non-visited URLs to decide which web pages to visit and to gather the relevant ones. In this work, we define and evaluate a queueing policy for non-visited URLs, based on link context, to improve the efficiency of a genre-aware focused crawler. Our experimental evaluation shows, in some situations, an improvement of around 100% in efficiency.
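The abstract does not spell out the queueing policy itself, so the following is only an illustrative sketch of the general idea: a frontier of non-visited URLs kept as a priority queue, ordered by a score computed from each link's context (for example its anchor text). The class name `Frontier`, the `topic_terms` parameter, and the term-overlap scoring are assumptions made for illustration and are not taken from the paper.

```python
import heapq
import itertools


class Frontier:
    """Priority queue of non-visited URLs, best link-context score first."""

    def __init__(self, topic_terms):
        # Hypothetical topic description: a plain set of lowercase terms.
        self.topic_terms = {t.lower() for t in topic_terms}
        self._heap = []
        self._seen = set()
        # Tie-breaker so URLs with equal scores come out in insertion order.
        self._counter = itertools.count()

    def score(self, link_context):
        """Fraction of topic terms found in the link context (assumed scoring)."""
        if not self.topic_terms:
            return 0.0
        words = set(link_context.lower().split())
        return len(words & self.topic_terms) / len(self.topic_terms)

    def push(self, url, link_context):
        """Enqueue a URL once; later duplicates are ignored."""
        if url in self._seen:
            return
        self._seen.add(url)
        # heapq is a min-heap, so the score is negated to pop the best first.
        entry = (-self.score(link_context), next(self._counter), url)
        heapq.heappush(self._heap, entry)

    def pop(self):
        """Return the URL whose link context scored highest."""
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

In this sketch a crawler would call `push(url, anchor_text)` for every link extracted from a fetched page and `pop()` to choose the next page to visit, so links whose context looks promising are fetched before the rest.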
Keywords :
Web sites; document handling; program compilers; Web pages; genre-aware focused crawler; link context-based focused crawling; link context-based nonvisited URL; queueing policy evaluation; topic interest; user interest; Computer science; Computers; Context; Crawlers; Databases; Search engines; Web pages; Document Genre Exploitation; Focused Crawling; Link Context;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Congress (LA-WEB), 2012 Eighth Latin American
Conference_Location :
Cartagena de Indias
Print_ISBN :
978-1-4673-4473-9
Type :
conf
DOI :
10.1109/LA-WEB.2012.24
Filename :
6392133