DocumentCode
3769835
Title
Architecture of efficient word processing using Hadoop MapReduce for big data applications
Author
Bichitra Mandal;Srinivas Sethi;Ramesh Kumar Sahoo
Author_Institution
Dept. of CSEA, IGIT Sarang India
fYear
2015
Firstpage
1
Lastpage
6
Abstract
Understanding the characteristics of MapReduce workloads in a Hadoop cluster is key to making optimal and efficient configuration decisions and improving system efficiency. MapReduce is a popular parallel processing framework for large-scale data analytics that has become an effective method for processing massive data on clusters of computers. In the last decade, the number of customers and services and the amount of information have increased rapidly, creating a big data analysis problem for service systems. Keeping up with the increasing volume of datasets requires an efficient analytical capability to process and analyze data in two phases: mapping and reducing. Between the mapping and reducing phases, MapReduce requires a shuffling step to globally exchange the intermediate data generated by the mapping. This paper proposes a novel shuffling strategy that enables efficient data movement and reduction in MapReduce shuffling, using the number of consecutive words and their counts in the word processor. To improve the scalability and efficiency of the word processor in a big data environment, counting of repeated consecutive words with shuffling is implemented on Hadoop. It can be implemented on a widely adopted distributed computing platform and also for large documents in a single word processor, using the MapReduce parallel processing paradigm.
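The map, shuffle, and reduce phases named in the abstract can be illustrated with a minimal in-memory sketch. This is not the paper's shuffling strategy and it does not run on Hadoop: it assumes "consecutive words" means adjacent word pairs within a line, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `count_consecutive_words`) are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Assumption: a "consecutive words" key is a pair of adjacent words in a line.
Pair = Tuple[str, str]


def map_phase(line: str) -> Iterator[Tuple[Pair, int]]:
    """Emit ((word, next_word), 1) for every adjacent word pair in the line."""
    words = line.lower().split()
    for first, second in zip(words, words[1:]):
        yield (first, second), 1


def shuffle_phase(records: Iterable[Tuple[Pair, int]]) -> Dict[Pair, List[int]]:
    """Group intermediate values by key, as a plain MapReduce shuffle would."""
    grouped: Dict[Pair, List[int]] = defaultdict(list)
    for key, value in records:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[Pair, List[int]]) -> Dict[Pair, int]:
    """Sum the grouped values to get one count per word pair."""
    return {key: sum(values) for key, values in grouped.items()}


def count_consecutive_words(lines: Iterable[str]) -> Dict[Pair, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = (record for line in lines for record in map_phase(line))
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    sample = ["big data big data", "big data tools"]
    for pair, count in sorted(count_consecutive_words(sample).items()):
        print(pair, count)
```

In a real Hadoop job the shuffle is performed by the framework between mapper and reducer tasks; it is written out explicitly here only to make the exchange of intermediate data visible.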
Keywords
"Big data","File systems","Text processing","Monitoring","Videos","Databases","Electronic mail"
Publisher
ieee
Conference_Titel
Man and Machine Interfacing (MAMI), 2015 International Conference on
Type
conf
DOI
10.1109/MAMI.2015.7456612
Filename
7456612