DocumentCode
160279
Title
Sina microblog big data grabbing and analysis based on Multi-strategy model
Author
Xiao Sun ; Jia-qi Ye ; Chen-yi Tang
Author_Institution
Sch. of Comput. & Inf., Hefei Univ. of Technol., Hefei, China
fYear
2014
fDate
11-13 July 2014
Firstpage
1
Lastpage
6
Abstract
As an important medium for social interaction and information dissemination over the Internet, microblogs carry users' emotional states and important opinions. Processing microblog data is a big data problem, and its prerequisite is obtaining a large volume of microblog data. Because of commercial interests and security considerations, access to this data is becoming increasingly difficult, and Sina's official open API doesn't support large-scale data mining. In this paper, we attempt to design a platform, based mainly on a multi-strategy access mechanism and existing resources, that collects data stably from Sina microblog. The results demonstrate that combining the API with a web crawler allows efficient data mining. We thereby confirm that custom solutions make it possible to build straightforward applications for hot-word search and analysis.
Keywords
Internet; Web sites; data analysis; data mining; information dissemination; API; Internet; Sina microblog big data grabbing; Web crawler; information dissemination; large data mining; microblog data analysis; social interactions; Analytical models; Big data; Crawlers; Fans; Market research; Support vector machines; Training; Big Data Grabbing; Multi-strategy Model; Sentiment Analysis; Sina Microblog;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computing, Communication and Networking Technologies (ICCCNT), 2014 International Conference on
Conference_Location
Hefei
Print_ISBN
978-1-4799-2695-4
Type
conf
DOI
10.1109/ICCCNT.2014.6962998
Filename
6962998