• DocumentCode
    611061
  • Title

    Partially Separated Page Tables for Efficient Operating System Assisted Hierarchical Memory Management on Heterogeneous Architectures

  • Author

    Gerofi, B. ; Shimada, Akio ; Hori, A. ; Ishikawa, Yutaka

  • Author_Institution
    RIKEN Adv. Inst. for Comput. Sci., Kobe, Japan
  • fYear
    2013
  • fDate
    13-16 May 2013
  • Firstpage
    360
  • Lastpage
    368
  • Abstract
    Heterogeneous architectures, where a multicore processor is accompanied by a large number of simpler but more power-efficient CPU cores optimized for parallel workloads, have recently received a lot of attention. At present, these co-processors, such as the Intel Xeon Phi product family, come with limited on-board memory, which requires manually partitioning computational problems into pieces that fit into the device's RAM, as well as efficiently overlapping computation and communication. In this paper we propose an application-transparent, operating system (OS) assisted hierarchical memory management system, where the OS orchestrates data movement between the host and the device and updates the process virtual memory address space accordingly. We identify the main scalability issues of frequent address space changes, such as the increasing price of TLB invalidations with the growing number of CPU cores, and propose partially separated page tables with address-range CPU masks to overcome the problem. With partially separated page tables, each core maintains its own set of mappings of the computation area, enabling the OS to perform address space updates in a scalable manner and to involve a particular CPU core in TLB invalidation only if it is absolutely necessary. Furthermore, we propose dedicated data movement cores in order to efficiently overlap computation and communication. We provide experimental results on stencil computation, a common HPC kernel, and show that OS-assisted memory management has the potential for scalable transparent data movement.
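The idea of per-core mappings combined with address-range CPU masks can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel implementation described in the paper: the names `AddressRange`, `PartiallySeparatedTables`, `map_page`, and `unmap_page` are hypothetical, page tables are modeled as plain dictionaries, and the mask is assumed to be a bit field with one bit per core. It shows how tracking which cores have touched a range lets an unmap operation target only those cores for TLB invalidation.

```python
from dataclasses import dataclass


@dataclass
class AddressRange:
    """Hypothetical descriptor for one region of the computation area."""
    start: int          # first virtual address in the range
    end: int            # one past the last virtual address (exclusive)
    cpu_mask: int = 0   # bit i set => core i holds a mapping in this range


class PartiallySeparatedTables:
    """Toy model: one private mapping table per core (assumption)."""

    def __init__(self, num_cores, ranges):
        self.num_cores = num_cores
        self.ranges = ranges
        # tables[core] maps a virtual page address to a physical frame number
        self.tables = [dict() for _ in range(num_cores)]

    def _range_for(self, vaddr):
        for r in self.ranges:
            if r.start <= vaddr < r.end:
                return r
        raise KeyError(f"address {vaddr:#x} is outside the computation area")

    def map_page(self, core, vaddr, frame):
        """Install a mapping in one core's table and record the core in the mask."""
        r = self._range_for(vaddr)
        self.tables[core][vaddr] = frame
        r.cpu_mask |= 1 << core

    def unmap_page(self, vaddr):
        """Remove the page everywhere and return the cores that need a TLB
        invalidation. Cores whose mask bit is clear are never inspected."""
        r = self._range_for(vaddr)
        targets = []
        for core in range(self.num_cores):
            if not r.cpu_mask & (1 << core):
                continue  # this core never mapped anything in the range
            table = self.tables[core]
            if vaddr in table:
                del table[vaddr]
                targets.append(core)
            # Clear the mask bit once the core has no pages left in the range.
            if not any(r.start <= v < r.end for v in table):
                r.cpu_mask &= ~(1 << core)
        return targets
```

In this model, if cores 0 and 2 map the same page and core 1 maps a different page in the same range, unmapping the first page yields invalidation targets for cores 0 and 2 only, and core 1 is left undisturbed.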
  • Keywords
    multiprocessing systems; operating systems (computers); storage management; TLB invalidations; address-range CPU masks; data movement; heterogeneous architectures; multicore processor; operating system assisted hierarchical memory management; parallel workloads; partitioning computational problems; power-efficient CPU cores; process virtual memory address space; separated page tables; Instruction sets; Kernel; Memory management; Multicore processing; Random access memory; coprocessor; manycore; memory management; operating systems; page tables;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on
  • Conference_Location
    Delft
  • Print_ISBN
    978-1-4673-6465-2
  • Type

    conf

  • DOI
    10.1109/CCGrid.2013.59
  • Filename
    6546113