Title :
A Time Based Analysis of Data Processing on Hadoop Cluster
Author :
Pal, Amrit ; Agrawal, Sanjay
Author_Institution :
Dept. of Comput. Eng. & Applic., Nat. Inst. of Tech. Teachers' Training & Res. Bhopal, Bhopal, India
Abstract :
Data that grows too large to be managed by a traditional database management system is termed Big Data, and managing data at this scale is difficult. Hadoop is a technological answer to Big Data. Storage of the data and retrieval of information from it are handled by the Hadoop Distributed File System and the MapReduce programming model. MapReduce provides effective benchmarks for retrieving information from Big Data. In this paper we present experimental work carried out on a Hadoop cluster. We analyze the time the cluster requires to process data as the number of nodes in the cluster increases, starting with a single node and adding one node at a time. Three types of time are analyzed: real time, user time, and system time.
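The three times named in the abstract can be illustrated with a minimal sketch, assuming they are meant in the sense reported by the Unix `time` utility: real (wall-clock) time, user CPU time, and system CPU time of a launched command. The helper name `time_command` and the placeholder workload are hypothetical and not taken from the paper; the sketch relies on Unix behavior of `os.times()` and measures only the locally launched process, not work done on remote cluster nodes.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv as a child process and return (real, user, system) in seconds.

    Hypothetical helper for illustration. On Unix, os.times() accumulates the
    CPU time of child processes once they have been waited for.
    """
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=True)  # blocks until the child exits
    real = time.perf_counter() - start
    after = os.times()
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


if __name__ == "__main__":
    # Placeholder workload (an assumption); a job submission command
    # would be substituted here when timing work on a cluster.
    cmd = [sys.executable, "-c", "sum(i * i for i in range(10**6))"]
    real, user, system = time_command(cmd)
    print(f"real {real:.3f}s  user {user:.3f}s  sys {system:.3f}s")
```

Repeating such a measurement for each cluster size (one node, two nodes, and so on) would give one (real, user, system) triple per configuration, which is the kind of data the analysis described above compares.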
Keywords :
Big Data; information retrieval; storage management; Hadoop cluster; Hadoop distributed file system; MapReduce programming model; data processing; data storage; real time; system time; time based analysis; user time; Distributed databases; File systems; Google; Real-time systems; Sorting; Data Node; Job Tracker; MapReduce; Name Node; Task Tracker;
Conference_Title :
Computational Intelligence and Communication Networks (CICN), 2014 International Conference on
Print_ISBN :
978-1-4799-6928-9
DOI :
10.1109/CICN.2014.136