Title :
Sina microblog big data grabbing and analysis based on Multi-strategy model
Author :
Xiao Sun ; Jia-qi Ye ; Chen-yi Tang
Author_Institution :
Sch. of Comput. & Inf., Hefei Univ. of Technol., Hefei, China
Abstract :
As an important medium for social interaction and information dissemination on the Internet, microblogs carry emotional states and important opinions. Processing microblog data is a big data problem, and its prerequisite is obtaining a large amount of microblog data. Because of commercial interests and security considerations, access to this data is becoming increasingly difficult, and the official Sina open API does not support large-scale data mining. In this paper, we attempt to design a platform, based mainly on a multi-strategy access mechanism and existing resources, that collects data stably from Sina microblog. The results demonstrate that combining the API with a web crawler allows efficient data mining. In this way, we confirm that custom solutions make it possible to build straightforward applications for hot-word search and analysis.
Keywords :
Internet; Web sites; data analysis; data mining; information dissemination; API; Sina microblog big data grabbing; Web crawler; large data mining; microblog data analysis; social interactions; Analytical models; Big data; Crawlers; Fans; Market research; Support vector machines; Training; Big Data Grabbing; Multi-strategy Model; Sentiment Analysis; Sina Microblog
Conference_Title :
Computing, Communication and Networking Technologies (ICCCNT), 2014 International Conference on
Conference_Location :
Hefei
Print_ISBN :
978-1-4799-2695-4
DOI :
10.1109/ICCCNT.2014.6962998