DocumentCode :
1758545
Title :
An Advanced MapReduce: Cloud MapReduce, Enhancements and Applications
Author :
Dahiphale, Devendra ; Karve, Rutvik ; Vasilakos, Athanasios V. ; Huan Liu ; Zhiwei Yu ; Chhajer, Amit ; Jianmin Wang ; Chaokun Wang
Author_Institution :
Pune Inst. of Comput. Technol., Pune, India
Volume :
11
Issue :
1
fYear :
2014
fDate :
March 2014
Firstpage :
101
Lastpage :
115
Abstract :
Cloud computing has recently attracted great attention because it provides configurable computing resources. MapReduce (MR) is a popular framework for data-intensive distributed computing of batch jobs. MapReduce suffers from the following drawbacks: 1. It processes the Map and Reduce phases sequentially. 2. Being cluster based, its scalability is relatively limited. 3. It does not support flexible pricing. 4. It does not support stream data processing. We describe Cloud MapReduce (CMR), which overcomes these limitations. Our results show that CMR is more efficient and runs faster than other implementations of the MR framework. In addition, we show how CMR can be further enhanced to: 1. Support stream data processing in addition to batch data by parallelizing the Map and Reduce phases through a pipelining model. 2. Support flexible pricing using Amazon Cloud's spot instances and deal with massive machine terminations caused by spot price fluctuations. 3. Improve throughput and speed up processing over traditional MR by more than 30% for large data sets. 4. Provide added flexibility and scalability by leveraging features of the cloud computing model. Click-stream analysis, real-time multimedia processing, time-sensitive analysis, and other stream processing applications can also be supported.
Keywords :
cloud computing; financial data processing; multimedia systems; parallel programming; pipeline processing; pricing; Amazon Cloud's spot instances; CMR; Cloud MapReduce; Map phase parallelization; Reduce phase parallelization; batch data; batch jobs; click-stream analysis; cloud computing model; configurable computing resources; data-intensive distributed computing; flexible pricing; massive machine terminations; pipelining model; real-time multimedia processing; speed-up processing; spot price fluctuations; stream data processing; throughput processing; time-sensitive analysis; Clouds; Computer architecture; Data models; Fault tolerance; Message systems; Pipeline processing; Web services; Cloud computing; MapReduce; pipelining; spot market; stream processing
fLanguage :
English
Journal_Title :
Network and Service Management, IEEE Transactions on
Publisher :
IEEE
ISSN :
1932-4537
Type :
jour
DOI :
10.1109/TNSM.2014.031714.130407
Filename :
6805345