  • DocumentCode
    2532766
  • Title
    Run-Time Partitioning of Hybrid Distributed Shared Memory on Multi-core Network-on-Chips
  • Author
    Chen, Xiaowen ; Lu, Zhonghai ; Jantsch, Axel ; Chen, Shuming
  • Author_Institution
    Inst. of Microelectron. & Microprocessor, Nat. Univ. of Defense Technol., Changsha, China
  • fYear
    2010
  • fDate
    18-20 Dec. 2010
  • Firstpage
    39
  • Lastpage
    46
  • Abstract
    On multi-core Network-on-Chips (NoCs), memories are preferably distributed, and supporting Distributed Shared Memory (DSM) is essential for reusing the large body of legacy code and for ease of programming. However, the DSM organization incurs the inherent overhead of translating virtual memory addresses into physical memory addresses, which degrades performance. We observe that, in parallel applications, different data have different properties (private or shared). For private data accesses, it is unnecessary to perform virtual-to-physical address translation. Even for the same datum, its property may change across phases of program execution. Therefore, this paper focuses on decreasing the overhead of virtual-to-physical address translation, and hence improving system performance, by introducing a hybrid DSM organization and supporting run-time partitioning according to the data property. The hybrid DSM organization supports fast physical memory accesses for private data and maintains a single global virtual memory space for shared data. Based on the data property of parallel applications, run-time partitioning supports changing the hybrid DSM organization during program execution. It ensures fast physical memory addressing for private data and conventional virtual memory addressing for shared data, improving the performance of the entire system by reducing virtual-to-physical address translation overhead as much as possible. We formulate the run-time partitioning of the hybrid DSM organization in order to analyze its performance. A real DSM-based multi-core NoC platform is also constructed. Experimental results on real applications show that the hybrid DSM organization with run-time partitioning has a performance advantage over its conventional DSM counterpart. The percentage of performance improvement depends on the problem size, the way data are partitioned, the computation/communication ratio of the parallel applications, the network size of the system, etc. In our experiments, the maximal improvement is 34.42% and the minimal improvement is 3.68%.
  • Keywords
    distributed shared memory systems; network-on-chip; data property; hybrid DSM organization; hybrid distributed shared memory; legacy code; multicore network-on-chips; run-time partitioning; virtual-to-physical address translations; Delay; Memory management; Organizations; Programming; Registers; System performance; System-on-a-chip; Hybrid Distributed Shared Memory (DSM); Multi-core; Network-on-Chips (NoCs); Run-time Partitioning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Architectures, Algorithms and Programming (PAAP), 2010 Third International Symposium on
  • Conference_Location
    Dalian
  • Print_ISBN
    978-1-4244-9482-8
  • Type
    conf
  • DOI
    10.1109/PAAP.2010.21
  • Filename
    5715060