• DocumentCode
    2872050
  • Title

    The Design and Implement of High Efficient Incremental Microblogging Crawler

  • Author

    Dayong Shen ; Hui Wang ; Jianping Cao ; Pei Li ; Zhihong Jiang

  • Author_Institution
    Res. Center of Comput. Experiments & Parallel Syst. Technol., Nat. Univ. of Defense Technol., Changsha, China
  • fYear
    2012
  • fDate
    2-4 Nov. 2012
  • Firstpage
    537
  • Lastpage
    540
  • Abstract
    With the rapid development of microblog technology, many research problems concerning microblogs have attracted growing attention. Fetching data from microblogs is the groundwork for this research. In this paper we take Sina microblog (also called Weibo) as the target site, and design and implement a highly efficient incremental microblog crawler based on the classic multi-producer, multi-consumer model. Experimental results demonstrate that the crawler can collect real-time microblog information efficiently and precisely.
  • Keywords
    Internet; Web sites; search engines; Sina microblog; data fetching; high efficient incremental microblogging crawler; microblog technology; multiconsumers model; multiproducer model; Bandwidth; Crawlers; Data mining; Engines; Schedules; Security; USA Councils; Incremental Crawling; Sina Microblog; Webpage Extraction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia Information Networking and Security (MINES), 2012 Fourth International Conference on
  • Conference_Location
    Nanjing
  • Print_ISBN
    978-1-4673-3093-0
  • Type

    conf

  • DOI
    10.1109/MINES.2012.253
  • Filename
    6405612