Title :
A Task Scheduling Strategy Based on Weighted Round-Robin for Distributed Crawler
Author :
Dajie Ge ; ZhiJun Ding
Author_Institution :
Dept. of Comput. Sci. & Technol., Tongji Univ., Shanghai, China
Abstract :
With the rapid development of the Internet, stand-alone crawlers can no longer find and gather information at the required scale, and crawlers are gradually becoming distributed. This paper proposes a task scheduling strategy based on weighted Round-Robin for small-scale distributed crawlers, and formulates weights for the current node based on its crawling efficiency so that load is balanced across the nodes. An error recovery mechanism and a node table give the crawling nodes flexible scalability and fault tolerance. Finally, experiments demonstrate the good load-balancing performance of the system.
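As an illustration of the scheduling idea summarized in the abstract, the following is a minimal sketch of weighted round-robin URL assignment. The abstract does not give the paper's weight formula, so everything here is an assumption: the weight is taken as each node's measured crawl rate relative to the slowest node, the selection rule is the widely used "smooth" weighted round-robin variant, and the names `Node`, `weights_from_rates`, `pick_node`, and `assign` are hypothetical. The error recovery mechanism and node table are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    weight: int       # assumed: derived from the node's measured crawl rate
    current: int = 0  # running counter used by smooth weighted round-robin


def weights_from_rates(rates):
    """Map pages-per-second measurements to integer weights.

    Assumed formula (not from the paper): rate relative to the slowest node.
    All rates must be positive.
    """
    slowest = min(rates.values())
    return {name: max(1, round(rate / slowest)) for name, rate in rates.items()}


def pick_node(nodes):
    """Smooth weighted round-robin: choose the node with the largest counter."""
    total = sum(n.weight for n in nodes)
    for n in nodes:
        n.current += n.weight
    chosen = max(nodes, key=lambda n: n.current)
    chosen.current -= total
    return chosen


def assign(urls, nodes):
    """Assign each URL to a node; returns {node name: [urls]}."""
    plan = {n.name: [] for n in nodes}
    for url in urls:
        plan[pick_node(nodes).name].append(url)
    return plan


# Example: three crawling nodes with different measured throughput.
rates = {"a": 30.0, "b": 20.0, "c": 10.0}
weights = weights_from_rates(rates)
nodes = [Node(name, w) for name, w in weights.items()]
urls = [f"http://example.com/page{i}" for i in range(12)]
plan = assign(urls, nodes)
```

Under these assumptions, a node that crawls three times faster than the slowest node receives three times as many URLs per scheduling cycle, and the smooth variant interleaves assignments rather than sending a burst of tasks to one node.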
Keywords :
distributed processing; resource allocation; scheduling; task analysis; distributed crawler; load balancing; rapid development; task scheduling strategy; weighted round-robin; Algorithm design and analysis; Crawlers; Schedules; Scheduling; Scheduling algorithms; Uniform resource locators; crawlers; distributed; scheduling; weighted Round-Robin;
Conference_Title :
Utility and Cloud Computing (UCC), 2014 IEEE/ACM 7th International Conference on
Conference_Location :
London
DOI :
10.1109/UCC.2014.138