• DocumentCode
    167598
  • Title

    SupMR: Circumventing Disk and Memory Bandwidth Bottlenecks for Scale-up MapReduce

  • Author

    Sevilla, Michael ; Nassi, Ike ; Ioannidou, Kleoni ; Brandt, Scott ; Maltzahn, Carlos

  • Author_Institution
    Comput. Sci. Dept., Univ. of California, Santa Cruz, Santa Cruz, CA, USA
  • fYear
    2014
  • fDate
    19-23 May 2014
  • Firstpage
    1505
  • Lastpage
    1514
  • Abstract
    Reading input from primary storage (i.e. the ingest phase) and aggregating results (i.e. the merge phase) are important pre- and post-processing steps in large batch computations. Unfortunately, today's data sets are so large that the ingest and merge job phases are now performance bottlenecks. In this paper, we mitigate the ingest and merge bottlenecks by leveraging the scale-up MapReduce model. We introduce an ingest chunk pipeline and a merge optimization that increase CPU utilization (50-100%) and job phase speedups (1.16× to 3.13×) for the ingest and merge phases. Our techniques are based on well-known algorithms and scale-out MapReduce optimizations, but applying them to a scale-up computation framework to mitigate the ingest and merge bottlenecks is novel.
  • Keywords
    data handling; parallel processing; disk bottlenecks; ingest chunk pipeline; memory bandwidth bottlenecks; merge bottlenecks; merge optimization; scale-up MapReduce model; Aggregates; Computational modeling; Containers; Instruction sets; Merging; Pipelines; Runtime; applications; architectures; distributed applications; distributed systems; performance measurements
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
  • Conference_Location
    Phoenix, AZ
  • Print_ISBN
    978-1-4799-4117-9
  • Type

    conf

  • DOI
    10.1109/IPDPSW.2014.168
  • Filename
    6969554