Title :
The Design and Implement of High Efficient Incremental Microblogging Crawler
Author :
Dayong Shen ; Hui Wang ; Jianping Cao ; Pei Li ; Zhihong Jiang
Author_Institution :
Res. Center of Comput. Experiments & Parallel Syst. Technol., Nat. Univ. of Defense Technol., Changsha, China
Abstract :
With the rapid development of microblog technology, many research issues concerning microblogs have attracted growing attention. Fetching data from microblogs is the groundwork for this research. In this paper we take Sina microblog (also called Weibo) as the crawling target, and design and implement a highly efficient incremental microblog crawler based on the classic multi-producer, multi-consumer model. Experimental results demonstrate that the crawler can collect real-time microblog information efficiently and precisely.
Keywords :
Internet; Web sites; search engines; Sina microblog; data fetching; high efficient incremental microblogging crawler; microblog technology; multiconsumers model; multiproducer model; Bandwidth; Crawlers; Data mining; Engines; Schedules; Security; USA Councils; Incremental Crawling; Sina Microblog; Webpage Extraction
Conference_Title :
Multimedia Information Networking and Security (MINES), 2012 Fourth International Conference on
Conference_Location :
Nanjing
Print_ISBN :
978-1-4673-3093-0
DOI :
10.1109/MINES.2012.253