• DocumentCode
    2268137
  • Title

    Understanding How Non-uniform Distribution of Memory Accesses on Cache Sets Affects the System Performance of Chip Multiprocessors

  • Author

    Jia, Xiaomin ; Jiang, Jiang ; Ni, Xiaoqiang ; Zhao, Tianlei ; Qi, Shubo ; Fu, Guitao ; Zhang, Minxuan

  • Author_Institution
    Nat. Lab. for Parallel & Distrib. Process., Nat. Univ. of Defense Technol., Changsha, China
  • fYear
    2011
  • fDate
    26-28 May 2011
  • Firstpage
    266
  • Lastpage
    272
  • Abstract
    Non-uniform distribution of memory accesses across cache sets has been recognized as one source of inefficiency in cache architectures on single-core platforms. Several schemes target this problem to boost performance. As chip multiprocessors (CMPs) become the mainstream processor design choice, how the non-uniform distribution of memory accesses across cache sets affects cache management in CMPs is an open question. We address this question by presenting several cache management schemes for CMP platforms that aim to balance the memory access distribution across cache sets in shared or private caches. We show that on CMP platforms with multiprogrammed workloads: (a) for shared caches, the non-uniform memory access distribution across cache sets is biased by the fact that multiple applications run concurrently and share the cache capacity; the scheme we put forward to exploit the non-uniformity on shared caches proves to be of little to no benefit, or even leads to degradation; (b) for caches organized as private caches, direct adaptation of a scheme that targets this kind of non-uniformity outperforms the baseline private cache design by 2% on average; (c) however, for a private-cache-based cache management scheme that we propose, further effort to exploit this kind of non-uniformity (on top of our proposed scheme) also proves to be of little to no benefit. We therefore conclude that on CMP platforms with multiprogrammed workloads, the non-uniform distribution of memory accesses across cache sets is partially circumvented by the interactions between multiple applications. Efforts to exploit the non-uniformity for further benefit may be in vain on CMPs.
  • Keywords
    cache storage; memory architecture; multiprocessing systems; CMP platform; cache architecture; cache capacity sharing; chip multiprocessor; mainstream processor design; multiprogrammed workload; nonuniform distribution; uniform memory access distribution; Arrays; Benchmark testing; Hidden Markov models; Indexes; Memory management; Protocols; Throughput; chip multiprocessors (CMPs); non-first level cache; non-uniform distribution of memory accesses on cache sets;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing with Applications Workshops (ISPAW), 2011 Ninth IEEE International Symposium on
  • Conference_Location
    Busan
  • Print_ISBN
    978-1-4577-0524-3
  • Electronic_ISBN
    978-0-7695-4429-8
  • Type

    conf

  • DOI
    10.1109/ISPAW.2011.25
  • Filename
    5951986