Title :
SupMR: Circumventing Disk and Memory Bandwidth Bottlenecks for Scale-up MapReduce
Author :
Sevilla, Michael ; Nassi, Ike ; Ioannidou, Kleoni ; Brandt, Scott ; Maltzahn, Carlos
Author_Institution :
Comput. Sci. Dept., Univ. of California, Santa Cruz, Santa Cruz, CA, USA
Abstract :
Reading input from primary storage (i.e., the ingest phase) and aggregating results (i.e., the merge phase) are important pre- and post-processing steps in large batch computations. Unfortunately, today's data sets are so large that the ingest and merge job phases are now performance bottlenecks. In this paper, we mitigate the ingest and merge bottlenecks by leveraging the scale-up MapReduce model. We introduce an ingest chunk pipeline and a merge optimization that increase CPU utilization (50-100%) and deliver job phase speedups (1.16×-3.13×) for the ingest and merge phases. Our techniques are based on well-known algorithms and scale-out MapReduce optimizations, but applying them to a scale-up computation framework to mitigate the ingest and merge bottlenecks is novel.
Keywords :
data handling; parallel processing; disk bottlenecks; ingest chunk pipeline; memory bandwidth bottlenecks; merge bottlenecks; merge optimization; scale-up MapReduce model; Aggregates; Computational modeling; Containers; Instruction sets; Merging; Pipelines; Runtime; applications; architectures; distributed applications; distributed systems; performance measurements
Conference_Title :
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-4117-9
DOI :
10.1109/IPDPSW.2014.168