Title :
Data analyzing using Map-Join-Reduce in cloud storage
Author :
Bhardwaj, R. ; Mishra, N. ; Kumar, R.
Author_Institution :
Dept. of Comput. Sci. & Eng., Sharda Univ., Noida, India
Abstract :
Data analysis and maintenance in cloud computing is a challenging task that involves processing large volumes of data on large clusters. In recent years the MapReduce model has shown great value in processing huge amounts of data on very large clusters. The MapReduce paradigm consists of two phases, the mapper and the reducer. The mapper applies filtering criteria and the reducer performs the aggregation task, but MapReduce supports only a homogeneous data set, meaning that the mapper function applies the same filtering logic to each tuple in the data set. Consequently, these techniques do not perform well for complex data analysis that may require joining multiple data sets. To address these problems, the CloudView framework was proposed for storing, processing, and analyzing massive machine data collected from cloud environments, using a Case Based Reasoning (CBR) approach for fault prediction. In this paper, an Enhanced CloudView (ECV) framework is proposed for processing, maintaining, and analyzing massive machine data. CloudView is built on the MapReduce model, whereas the ECV framework uses the Map-Join-Reduce model. This model performs the map-join-reduce task in two successive MapReduce jobs. First, it applies the filtering logic to all the data sets in parallel, joins the resulting tuples, and reduces them for aggregation; finally, it combines all partial aggregation results to produce the final result. The additional join step enables fast processing of heterogeneous data sets through the join-reduce function, which improves the efficiency and scalability of the system.
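The two-job flow described in the abstract can be illustrated with a minimal in-memory sketch. This is not the authors' ECV implementation: the datasets, the filter predicates, the per-rack count, and every function name below are hypothetical, chosen only to show filtering each data set with its own logic, joining the filtered tuples, reducing them into partial aggregates, and then combining those partials.

```python
from collections import defaultdict

# Hypothetical toy datasets (assumptions, not from the paper).
readings = [(1, 70), (2, 95), (1, 88), (3, 91), (2, 40)]  # (machine_id, temperature)
machines = [(1, "rack-a"), (2, "rack-b"), (3, "rack-a"), (4, "rack-c")]  # (machine_id, rack)


def map_filter(dataset, predicate):
    """Map phase: keep the (key, value) pairs whose value passes the filter."""
    return [(key, value) for key, value in dataset if predicate(value)]


def join(left, right):
    """Join phase: pair values from both filtered datasets that share a key."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(key, lv, rv) for key, lv in left for rv in index.get(key, [])]


def partial_reduce(joined_partition):
    """Reduce phase of the first job: count joined tuples per rack in one partition."""
    counts = defaultdict(int)
    for _key, _temperature, rack in joined_partition:
        counts[rack] += 1
    return dict(counts)


def combine(partials):
    """Second job: merge all partial aggregates into the final result."""
    total = defaultdict(int)
    for partial in partials:
        for rack, count in partial.items():
            total[rack] += count
    return dict(total)


def map_join_reduce(readings, machines, partitions=2):
    # Each dataset gets its own filter, unlike a single homogeneous mapper.
    hot = map_filter(readings, lambda temperature: temperature > 80)
    known = map_filter(machines, lambda rack: rack != "rack-c")
    joined = join(hot, known)
    # Simulate parallel reducers by splitting the joined tuples into partitions.
    chunks = [joined[i::partitions] for i in range(partitions)]
    partials = [partial_reduce(chunk) for chunk in chunks]
    return combine(partials)


result = map_join_reduce(readings, machines)
```

Because the partial counts are merged by addition, the final result does not depend on how many partitions the joined tuples are split into.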
Keywords :
case-based reasoning; cloud computing; data analysis; storage management; CBR; ECV framework; aggregation task; cloud environment; cloud storage; data maintenance; data processing; data storage; enhanced CloudView framework; fault prediction; filtering criteria; map-join-reduce model; mapper; massive machine data; reducer; Barium; Bismuth; Conferences; Grid computing; Three-dimensional displays; cloudview; extended cloudview; map join reduce; map reduce
Conference_Title :
Parallel, Distributed and Grid Computing (PDGC), 2014 International Conference on
Conference_Location :
Solan
Print_ISBN :
978-1-4799-7682-9
DOI :
10.1109/PDGC.2014.7030773