• DocumentCode
    3767828
  • Title

    Big data emerging technologies: A Case Study with analyzing Twitter data using Apache Hive

  • Author

    Aditya Bhardwaj; Vanraj; Ankit Kumar; Yogendra Narayan; Pawan Kumar

  • Author_Institution
    Computer Science & Engineering Department, National Institute of Technical Teachers Training and Research, Chandigarh, India
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    These are days of growth and innovation for a better future. Companies now increasingly recognize the need for Big Data when making decisions about complex problems. Big Data refers to collections of large datasets containing massive amounts of data, with sizes in the range of petabytes or zettabytes, or with a high rate of growth and a complexity that make them difficult to process and analyze using conventional database technologies. Big Data is generated from various sources, such as social networking sites like Facebook and Twitter, and the generated data can be structured, semi-structured, or unstructured. To extract valuable information from this huge amount of data, organizations need new tools and techniques in order to derive business benefits and gain a competitive advantage in the market. This paper presents a comprehensive study of major emerging Big Data technologies, highlighting their important features and how they work, together with a comparative study of them. It also presents a performance analysis of an Apache Hive query executed over Twitter tweets, measuring the MapReduce CPU time spent and the total time taken to finish the job.
  • Keywords
    "Big data","Twitter","File systems","Computer architecture","Google","Servers","Writing"
  • Publisher
    IEEE
  • Conference_Titel
    Recent Advances in Engineering & Computational Sciences (RAECS), 2015 2nd International Conference on
  • Type

    conf

  • DOI
    10.1109/RAECS.2015.7453400
  • Filename
    7453400