DocumentCode :
1868498
Title :
Exploiting Tags and Social Profiles to Improve Focused Crawling
Author :
Zhang, Zhiyong ; Nasraoui, Olfa ; van Zwol, Roelof
Volume :
1
fYear :
2009
fDate :
15-18 Sept. 2009
Firstpage :
136
Lastpage :
139
Abstract :
Recent years have transformed the Web from a Web of content to a Web of applications and social content. Thus, it has become crucial to tap into this social aspect of the Web whenever possible, in addition to its content, particularly for focused crawling. In this paper, we present a novel profile-based focused crawling system for the increasingly popular social media-sharing websites, assuming neither privileged access to the internal private databases of such websites nor the existence of APIs for extracting social data. Our experiments demonstrate the robustness of our profile-based focused crawler, as well as a significant improvement in harvest ratio compared to breadth-first and OPIC crawlers, when crawling the Flickr website for two different topics.
Keywords :
Application software; Conferences; Crawlers; Data mining; Focusing; Intelligent agent; Learning systems; Multimedia databases; Web pages; Web sites; cotagging; focused crawler; page classification; profile;
fLanguage :
English
Publisher :
iet
Conference_Title :
Web Intelligence and Intelligent Agent Technologies, 2009. WI-IAT '09. IEEE/WIC/ACM International Joint Conferences on
Conference_Location :
Milan, Italy
Print_ISBN :
978-0-7695-3801-3
Electronic_ISBN :
978-1-4244-5331-3
Type :
conf
DOI :
10.1109/WI-IAT.2009.27
Filename :
5286082