Title :
A Performance Analysis of MapReduce Task with Large Number of Files Dataset in Big Data Using Hadoop
Author :
Pal, Arnab ; Jain, Kunal ; Agrawal, Pulin ; Agrawal, Sanjay
Author_Institution :
Dept. of Comput. Eng. & Applic., Nat. Inst. of Tech. Teachers' Training & Res., Bhopal, India
Abstract :
Big Data refers to volumes of data that cannot be managed by traditional data management systems, and Hadoop is a technological answer to it. The Hadoop Distributed File System (HDFS) and the MapReduce programming model are used for storing and retrieving big data: files of terabyte scale can be stored on HDFS and analyzed with MapReduce. This paper introduces Hadoop HDFS and MapReduce for storing a large number of files and retrieving information from them. We present experimental work in which varying numbers of files are supplied as input to a Hadoop system and its performance is analyzed. We study the number of bytes written and read by the file system and by the MapReduce framework, and we analyze the behavior of the map and reduce tasks, together with the bytes they write and read, as the number of input files increases.
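The paper itself does not list code, so the following is only a minimal sketch of the kind of MapReduce job the abstract describes, written against the standard org.apache.hadoop.mapreduce API. The class names (FileWordCount, TokenMapper, SumReducer) and the two command-line arguments (input directory, output directory) are illustrative assumptions, not the authors' implementation; the counters Hadoop reports after a run (bytes read and written, numbers of map and reduce tasks) are the kind of figures such a study examines.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Hypothetical word-count job: every file placed under the input
    // directory becomes at least one map task, so adding files to the
    // dataset directly changes how many map tasks run and how many
    // bytes the framework reads and writes.
    public class FileWordCount {

        public static class TokenMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable offset, Text line, Context context)
                    throws IOException, InterruptedException {
                // Emit (word, 1) for each token in the current line.
                StringTokenizer tokens = new StringTokenizer(line.toString());
                while (tokens.hasMoreTokens()) {
                    word.set(tokens.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        public static class SumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> counts, Context context)
                    throws IOException, InterruptedException {
                // Sum the partial counts produced by all map tasks.
                int sum = 0;
                for (IntWritable count : counts) {
                    sum += count.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "file word count");
            job.setJarByClass(FileWordCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // directory holding the many input files
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not already exist on HDFS
            // After completion, the job's counters (bytes read/written,
            // launched map and reduce tasks) are printed to the console.
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

A typical invocation would be "hadoop jar filewordcount.jar FileWordCount /input_dir /output_dir", after which the console output includes the File System Counters and Job Counters that an analysis of this kind would record.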
Keywords :
Big Data; distributed databases; information retrieval; storage management; Hadoop HDFS; Hadoop distributed file system; MapReduce programming model; big data retrieval; big data storage; data management system; files dataset; information retrieval; performance analysis; Computers; Distributed databases; File systems; Google; Programming; Training; Data Node; HDFS; Hadoop; Job Tracker; MapReduce; Name Node; Secondary Name Node; Task Tracker; Teragen; Terasort; Teravalidate;
Conference_Title :
Communication Systems and Network Technologies (CSNT), 2014 Fourth International Conference on
Conference_Location :
Bhopal
Print_ISBN :
978-1-4799-3069-2
DOI :
10.1109/CSNT.2014.124