• DocumentCode
    262288
  • Title
    A Paralleled Big Data Algorithm with MapReduce Framework for Mining Twitter Data
  • Author
    Li Bing ; Chan, Keith C. C.
  • Author_Institution
    Dept. of Comput., Hong Kong Polytech. Univ., Kowloon, China
  • fYear
    2014
  • fDate
    3-5 Dec. 2014
  • Firstpage
    121
  • Lastpage
    128
  • Abstract
    Some recent studies have suggested that public opinions expressed in social media may be correlated with various social issues. To find out what can actually be discovered in social media data, we need data mining. Data mining approaches that can handle massive amounts of data have recently been referred to as big data algorithms. In this paper, we propose a big data algorithm for mining Twitter data. Furthermore, to ensure scalability, the MapReduce framework is adopted to parallelize the proposed algorithm. The experiments demonstrate the potential of the proposed algorithm. Computationally, the speed of execution can be shown to increase significantly despite increases in data set size. In fact, the acceleration ratio increases as the size of the data set increases and as the number of DataNodes increases.
  • Keywords
    Big Data; data mining; parallel algorithms; social networking (online); DataNodes; MapReduce framework; Twitter data mining; acceleration ratio; big handling; data set size; paralleled big data algorithm; public opinions; social media data; Accuracy; Big data; Data mining; Media; Pragmatics; Twitter; Vectors; MapReduce; Twitter; big data algorithm; data mining; social media;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data and Cloud Computing (BdCloud), 2014 IEEE Fourth International Conference on
  • Conference_Location
    Sydney, NSW
  • Type
    conf
  • DOI
    10.1109/BDCloud.2014.26
  • Filename
    7034776