DocumentCode :
2839192
Title :
Availability and Accuracy of Distributed Web Crawlers: A Model-Based Evaluation
Author :
Nasri, Mitra ; Shariati, Saeed ; Sharifi, Mohsen
Author_Institution :
Comput. Eng. Dept., Iran Univ. of Sci. & Technol., Tehran
fYear :
2008
fDate :
8-10 Sept. 2008
Firstpage :
453
Lastpage :
458
Abstract :
Distributed Web crawlers are now used extensively for Web mining, but their accuracy, dependability and other operational measures have not been fully studied. They are costly and require careful selection of configuration parameters, so it is important to be able to estimate the performance, dependability and accuracy of a Web crawler. This paper presents a model-based evaluation of the accuracy and availability of a distributed Web crawler whose architecture is based on UbiCrawler. The crawler is modelled with stochastic activity networks. Accuracy and availability of the Web crawler are formally defined, and the effects of environmental failure rates on crawling nodes and on the availability of the whole system are discussed.
Keywords :
Internet; data mining; software performance evaluation; UbiCrawler; Web crawler architecture; Web mining; distributed Web crawlers; environmental failure rate; model-based evaluation; stochastic activity networks; Availability; Computational modeling; Computer simulation; Crawlers; Distributed computing; Maintenance; Safety; Service oriented architecture; Stochastic processes; Web mining; Accuracy; Availability; Distributed Web Crawlers; Mobius; Modeling; SAN; Stochastic Activity Networks; Web Crawlers;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Modeling and Simulation, 2008. EMS '08. Second UKSIM European Symposium on
Conference_Location :
Liverpool
Print_ISBN :
978-0-7695-3325-4
Electronic_ISBN :
978-0-7695-3325-4
Type :
conf
DOI :
10.1109/EMS.2008.55
Filename :
4625316