• DocumentCode
    117266
  • Title
    HAMLeT: Hardware accelerated memory layout transform within 3D-stacked DRAM

  • Author
    Akin, Berkin ; Hoe, James C. ; Franchetti, Franz
  • Author_Institution
    Electr. & Comput. Eng. Dept., Carnegie Mellon Univ., Pittsburgh, PA, USA
  • fYear
    2014
  • fDate
    9-11 Sept. 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Memory layout transformations via data reorganization are very common operations that occur as part of the computation or as a performance optimization in data-intensive applications. These operations require inefficient memory access patterns and roundtrip data movement through the memory hierarchy, failing to utilize the performance and energy-efficiency potential of the memory subsystem. This paper proposes a high-bandwidth and energy-efficient hardware accelerated memory layout transform (HAMLeT) system integrated within a 3D-stacked DRAM. HAMLeT uses low-overhead hardware that exploits the existing infrastructure in the logic layer of 3D-stacked DRAMs and does not require any changes to the DRAM layers, yet it can fully exploit the locality and parallelism within the stack by implementing efficient layout transform algorithms. We analyze matrix layout transform operations (such as matrix transpose, matrix blocking, and 3D matrix rotation) and demonstrate that HAMLeT can achieve close to peak system utilization, offering up to an order of magnitude performance improvement compared to CPU and GPU memory subsystems that do not employ HAMLeT.
  • Keywords
    DRAM chips; 3D stacked DRAM layers; CPU memory subsystems; GPU memory subsystems; HAMLeT system; data intensive applications; data reorganization; energy efficiency potentials; hardware accelerated memory layout transform; layout transform algorithms; logic layer; magnitude performance improvement; matrix layout transform operations; memory hierarchy; memory layout transformations; parallelism; peak system utilization; performance optimization; roundtrip data movement; Bandwidth; Hardware; Layout; Parallel processing; Random access memory; Through-silicon vias; Transforms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Extreme Computing Conference (HPEC), 2014 IEEE
  • Conference_Location
    Waltham, MA
  • Print_ISBN
    978-1-4799-6232-7
  • Type
    conf
  • DOI
    10.1109/HPEC.2014.7040954
  • Filename
    7040954