Regional Information Center for Science and Technology (RICEST) - IncMR: Incremental Data Processing Based on MapReduce

DocumentCode :

2785461

Title :

IncMR: Incremental Data Processing Based on MapReduce

Author :

Yan, Cairong ; Yang, Xin ; Yu, Ze ; Li, Min ; Li, Xiaolin

Author_Institution :

Dept. of Comput. Sci. & Technol., Donghua Univ., Shanghai, China

fYear :

2012

fDate :

24-29 June 2012

Firstpage :

534

Lastpage :

541

Abstract :

The MapReduce programming model is widely used for large-scale, one-time, data-intensive distributed computing, but it lacks the flexibility and efficiency needed to process small incremental data. This paper proposes the IncMR framework for incrementally processing new data added to a large data set; it takes the state as implicit input and combines it with the new data. Map tasks are created only for the new splits rather than for all splits, while reduce tasks fetch their inputs, including the state and the intermediate results of the new map tasks, from designated nodes or local nodes. Data locality is treated as one of the main optimization criteria for job scheduling. IncMR is implemented on Hadoop, is compatible with the original MapReduce interfaces, and is transparent to users. Experiments show that non-iterative algorithms running in the MapReduce framework can be migrated to IncMR directly, without any modification, to obtain efficient incremental and continuous processing. IncMR is competitive and, in all studied cases, runs faster than reprocessing the entire data set.
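
The idea in the abstract (map only the new splits, then let reduce combine the saved state with the new intermediate results) can be illustrated with a minimal single-process sketch. This is not the IncMR implementation and does not use the Hadoop API; word count is chosen only as a simple non-iterative example, and the names `map_split`, `reduce_with_state`, `incremental_run`, and `full_run` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def map_split(split: str) -> List[Tuple[str, int]]:
    """Map phase for one input split: emit (word, 1) pairs."""
    return [(word, 1) for word in split.split()]


def reduce_with_state(
    state: Dict[str, int], intermediate: Iterable[Tuple[str, int]]
) -> Dict[str, int]:
    """Reduce phase: merge the saved state (implicit input) with new map output."""
    merged = Counter(state)
    for key, value in intermediate:
        merged[key] += value
    return dict(merged)


def incremental_run(state: Dict[str, int], new_splits: List[str]) -> Dict[str, int]:
    """Run map tasks on the new splits only, then reduce against the state."""
    intermediate = [pair for split in new_splits for pair in map_split(split)]
    return reduce_with_state(state, intermediate)


def full_run(all_splits: List[str]) -> Dict[str, int]:
    """Baseline: process every split from scratch (empty state)."""
    return incremental_run({}, all_splits)


old_splits = ["map reduce map", "hadoop reduce"]
new_splits = ["map hadoop hadoop"]

state = full_run(old_splits)                  # result of the earlier job
updated = incremental_run(state, new_splits)  # touches only the new split

# The incremental result matches reprocessing the entire data set.
assert updated == full_run(old_splits + new_splits)
```

In this sketch the merge is correct because addition is associative and commutative, so the old totals can stand in for the old splits. That is a property of the word-count example, not a statement about which algorithms the paper covers.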

Keywords :

data mining; distributed programming; scheduling; Hadoop; IncMR framework; Map tasks; MapReduce framework; MapReduce interfaces; MapReduce programming model; continuous processing; data locality; incremental data processing; job scheduling; local nodes; noniterative algorithms; one-time data-intensive distributed computing; Algorithm design and analysis; Computational modeling; Data models; Data processing; Distributed databases; Parallel processing; Programming; Compatible; Data locality; Incremental data processing; MapReduce; State;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Cloud Computing (CLOUD), 2012 IEEE 5th International Conference on

Conference_Location :

Honolulu, HI

ISSN :

2159-6182

Print_ISBN :

978-1-4673-2892-0

Type :

conf

DOI :

10.1109/CLOUD.2012.67

Filename :

6253548

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2785461