Regional Information Center for Science and Technology - Memory or Time: Performance Evaluation for Iterative Operation on Hadoop and Spark

DocumentCode :

688215

Title :

Memory or Time: Performance Evaluation for Iterative Operation on Hadoop and Spark

Author :

Lei Gu ; Huan Li

Author_Institution :

State Key Lab. of Software Dev. Environ., Beihang Univ., Beijing, China

Year :

2013

Date :

13-15 Nov. 2013

Firstpage :

721

Lastpage :

727

Abstract :

Hadoop is a very popular general-purpose framework for many different classes of data-intensive applications. However, it is poorly suited to iterative operations because of the cost of reloading data from disk at each iteration. Spark, an emerging framework designed with a global cache mechanism, can achieve better response times because data is accessed in memory across the distributed machines of the cluster throughout the entire iterative process. Although the time performance of Spark relative to Hadoop has been evaluated, memory consumption, another system performance criterion, has not been analyzed in depth in the literature. In this work, we conducted extensive experiments on iterative operations to compare the time and memory costs of Hadoop and Spark. We found that although Spark is generally faster than Hadoop in iterative operations, it pays for this with higher memory consumption. Moreover, its speed advantage weakens when memory is insufficient to store newly created intermediate results.
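
The trade-off described in the abstract can be illustrated with a toy model. The sketch below is not the authors' benchmark and uses neither Hadoop nor Spark; it only assumes that one strategy re-reads its input on every iteration while the other reads it once and keeps it resident. The names DiskDataset, step, run_reload and run_cached are hypothetical, and the case where memory is too small to hold intermediate results is not modeled.

```python
# Toy model only: hypothetical names, no Hadoop or Spark involved.

class DiskDataset:
    """Stands in for input stored on disk; counts how often it is read."""

    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def load(self):
        self.reads += 1
        return list(self._values)  # a fresh in-memory copy per read


def step(data, x):
    """One iteration of a simple update: move x halfway toward the mean."""
    mean = sum(data) / len(data)
    return x + 0.5 * (mean - x)


def run_reload(dataset, iterations):
    """Reload-per-iteration strategy: the input is read again each time."""
    x = 0.0
    for _ in range(iterations):
        data = dataset.load()
        x = step(data, x)
    return x


def run_cached(dataset, iterations):
    """Cache strategy: read once, keep the data in memory for all iterations."""
    cached = dataset.load()
    x = 0.0
    for _ in range(iterations):
        x = step(cached, x)
    # The second value is the number of items held resident for the whole run.
    return x, len(cached)
```

Both functions compute the same result; what differs is the number of reads (one per iteration versus one in total) and the amount of data that stays in memory for the full run, which is the cost the paper measures.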

Keywords :

cache storage; distributed processing; iterative methods; Hadoop framework; Spark framework; cache mechanism; data reloading; data-intensive applications; distributed machines; iterative operation; memory consumption; performance evaluation; Central Processing Unit; Computers; Distributed databases; Generators; Iterative methods; Sparks; Twitter; Hadoop; Iterative Operation; Spark; System Performance;

Language :

English

Publisher :

IEEE

Conference_Title :

2013 IEEE 10th International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC)

Conference_Location :

Zhangjiajie

Type :

conf

DOI :

10.1109/HPCC.and.EUC.2013.106

Filename :

6831988

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=688215