• DocumentCode
    668160
  • Title
    K MapReduce: A scalable tool for data-processing and search/ensemble applications on large-scale supercomputers
  • Author
    Matsuda, Manabu ; Maruyama, Naoya ; Takizawa, Shun
  • Author_Institution
    RIKEN AICS, Kobe, Japan
  • fYear
    2013
  • fDate
    23-27 Sept. 2013
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    K MapReduce (KMR) is a high-performance MapReduce system for the MPI environment, targeting large-scale supercomputers such as the K computer. Its objectives are to ease the programming of data-processing applications and to achieve efficiency by exploiting the large amount of memory available on large-scale supercomputers. In KMR, the shuffling operation exchanges key-value pairs in a scalable way, using collective communication algorithms that exploit the K computer's interconnect. Mapping and reducing operations are multi-threaded to achieve even greater efficiency on modern multi-core machines. Sorting, which is used extensively inside the shuffling and reducing operations, is optimized by using fixed-length packed keys instead of variable-length raw keys. Besides the MapReduce operations, KMR provides routines for collective file reading with affinity-aware optimizations. This paper presents the results of experimental performance studies of KMR on the K computer. Affinity-aware file loading improves performance by about 42% over a non-optimized implementation. We also show how KMR can be used to program real-world scientific applications such as meta-genome search and replica-exchange molecular dynamics.
  • Keywords
    multiprocessing systems; parallel machines; parallel programming; search problems; sorting; K MapReduce; KMR; MPI environment; affinity-aware optimization; data-processing; ensemble application; fixed-length packed keys; large-scale supercomputer; meta-genome search; multicore machines; reducing operation; replica-exchange molecular dynamics; scalable tool; search application; shuffling operation; variable-length raw keys; Ions; Loading; Programming; Supercomputers
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing (CLUSTER), 2013 IEEE International Conference on
  • Conference_Location
    Indianapolis, IN
  • Type
    conf
  • DOI
    10.1109/CLUSTER.2013.6702663
  • Filename
    6702663