• DocumentCode
    166663
  • Title
    Toward the efficient use of multiple explicitly managed memory subsystems
  • Author
    Pena, A.J. ; Balaji, Pavan
  • Author_Institution
    Math. & Comput. Sci. Div., Argonne Nat. Lab., Argonne, IL, USA
  • fYear
    2014
  • fDate
    22-26 Sept. 2014
  • Firstpage
    123
  • Lastpage
    131
  • Abstract
    The increasing number of memory technologies offering different features, such as optimized access patterns or capacity/speed ratios, leads us to advocate for future HPC compute nodes equipped with heterogeneous memory subsystems. The aim is to further alleviate the ever-increasing gap between computation and memory access speeds by taking advantage of the benefits these memory technologies provide. Compute nodes equipped with memory technologies such as scratchpad memory, on-chip 3D-stacked memory, or NVRAM-based memory are already a reality. Careful use of the different memory subsystems is mandatory in order to exploit the potential of such supercomputers. While most multiple-memory models concentrate on extending the depth of the memory hierarchy by incorporating more levels of hardware-managed memories, we advocate for compute nodes equipped with heterogeneous software-managed memory subsystems. Although the exact approach to efficiently exploiting them is still uncertain, a software ecosystem is clearly required to assist in efficient data distribution. We address this problem at the memory object granularity. In this paper, we use an object-differentiated profiling tool that we have developed on top of the Valgrind instrumentation framework to assess the most suitable memory subsystem for the different memory objects of two miniapplications from the Mantevo codesign project. Our results, which consider two different memory configurations as use cases, reveal the potential benefits of carefully placing the different memory objects of an application among the different memory subsystems.
  • Keywords
    object-oriented methods; parallel processing; storage management; HPC compute nodes; NVRAM-based memory; Valgrind instrumentation framework; access patterns; capacity-speed ratio; data distribution; heterogeneous memory subsystems; heterogeneous software-managed memory subsystems; high performance computing; memory configuration; memory hierarchy; memory technologies; nonvolatile random access memory; object-differentiated profiling tool; software ecosystem; Computational modeling; Hardware; Instruments; Memory management; Nonvolatile memory; Random access memory; System-on-chip;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing (CLUSTER), 2014 IEEE International Conference on
  • Conference_Location
    Madrid
  • Type
    conf
  • DOI
    10.1109/CLUSTER.2014.6968756
  • Filename
    6968756