• DocumentCode
    160279
  • Title
    Sina microblog big data grabbing and analysis based on Multi-strategy model
  • Author
    Xiao Sun ; Jia-qi Ye ; Chen-yi Tang

  • Author_Institution
    Sch. of Comput. & Inf., Hefei Univ. of Technol., Hefei, China
  • fYear
    2014
  • fDate
    11-13 July 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    As an important medium for social interaction and information dissemination on the Internet, microblogs carry emotional states and important opinions. Processing microblog data is a big data problem, and its prerequisite is obtaining a large amount of microblog data. Because of commercial interests and security considerations, access to the data is becoming increasingly difficult, and the official Sina open API does not support large-scale data mining. In this paper, we attempt to design a platform, based mainly on a multi-strategy access mechanism and existing resources, that collects data stably from Sina microblog. The results demonstrate that combining the API with a web crawler allows efficient data mining. In this way, we confirm that custom solutions can be built for straightforward applications of hot-word searching and analysis.
  • Keywords
    Internet; Web sites; data analysis; data mining; information dissemination; API; Sina microblog big data grabbing; Web crawler; large data mining; microblog data analysis; social interactions; Analytical models; Big data; Crawlers; Fans; Market research; Support vector machines; Training; Big Data Grabbing; Multi-strategy Model; Sentiment Analysis; Sina Microblog
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computing, Communication and Networking Technologies (ICCCNT), 2014 International Conference on
  • Conference_Location
    Hefei
  • Print_ISBN
    978-1-4799-2695-4
  • Type
    conf
  • DOI
    10.1109/ICCCNT.2014.6962998
  • Filename
    6962998