• DocumentCode
    3780376
  • Title
    Computational scalability with Apache Flume and Mahout for large scale round the clock analysis of sensor network data
  • Author
    P.B. Makeshwar;A. Kalra;N.S. Rajput;K.P. Singh
  • Author_Institution
    Department of Electronics Engineering, Indian Institute of Technology (BHU), Varanasi, India
  • fYear
    2015
  • Firstpage
    306
  • Lastpage
    311
  • Abstract
    This paper considers a typical scenario in which gas sensor array responses from a WAN-deployed sensor network are received hourly, 24×7. From every sensor node, we retrieve static as well as dynamic responses from 16 sensing elements, generating a .csv file of 9 MB. With 1000 sensor nodes, the data received at the Hadoop cluster at our data centre would be about 9 GB, and more still if a larger number of nodes, a larger geographical area and/or a higher node density is considered. Hence, (i) to receive and store such a large volume of data from a sensor network and (ii) to analyse the received data, we explored the suitability of Apache Flume and Apache Mahout for delivering high-performance computational scalability on the Hadoop Distributed File System. An implementation methodology for realizing such a scalable system is presented, using a sensor network for air pollution observation over a large geographical area as an example.
  • Keywords
    "Ethanol","Wide area networks","Economics","Pollution measurement","Engines"
  • Publisher
    IEEE
  • Conference_Titel
    Recent Advances in Electronics & Computer Engineering (RAECE), 2015 National Conference on
  • Type
    conf
  • DOI
    10.1109/RAECE.2015.7510212
  • Filename
    7510212