• DocumentCode
    1578567
  • Title
    Multicore Cache Simulations Using Heterogeneous Computing on General Purpose and Graphics Processors
  • Author
    Keramidas, Georgios; Strikos, Nikolaos; Kaxiras, Stefanos
  • Author_Institution
    Ind. Syst. Inst., Patras, Greece
  • fYear
    2011
  • Firstpage
    270
  • Lastpage
    273
  • Abstract
    Traditional trace-driven memory system simulation is a very time-consuming process, and the advent of multicores exacerbates the problem. We propose a framework for accelerating trace-driven multicore cache simulations by exploiting the capabilities of modern many-core GPUs. A straightforward step in this direction is to rely on the inherent parallelism in cache simulations: communicating cache sets can be simulated independently of, and concurrently with, other sets. Based on this, we map collections of communicating cache sets (each belonging to a different target cache) onto the same GPU block so that the simulated coherence traffic is local traffic in the GPU. However, this is not enough, because activity is highly imbalanced across cache sets: some sets receive a flurry of activity while others do not. Our solution is to load-balance the simulated sets (based on activity) onto the computing element (host CPU or GPU) that can manage them most efficiently. We propose a heterogeneous computing approach in which the host CPU simulates the few but most active sets, while the GPU is responsible for the many more but less active sets. Our experimental findings using the SPLASH-2 suite demonstrate that our cache simulator, based on CPU-GPU cooperation, achieves on average a 5.88x speedup over alternative implementations running on the CPU, and that the speedups scale well with the size of the simulated system.
  • Keywords
    cache storage; coprocessors; multiprocessing systems; resource allocation; SPLASH-2 suite; cache simulator; coherence traffic simulation; communicating cache sets; general purpose processor; graphics processor; heterogeneous computing approach; host-CPU; load balancing; many core GPU; trace-driven memory system simulation; trace-driven multicore cache simulation; Benchmark testing; Computational modeling; Graphics processing unit; Instruction sets; Integrated circuit modeling; Multicore processing; GPUs; Multicores; Trace-driven simulation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital System Design (DSD), 2011 14th Euromicro Conference on
  • Conference_Location
    Oulu
  • Print_ISBN
    978-1-4577-1048-3
  • Type
    conf
  • DOI
    10.1109/DSD.2011.38
  • Filename
    6037420
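
The partitioning idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cache geometry (`LINE_BYTES`, `NUM_SETS`, `WAYS`), the trace format of `(core, address)` pairs, and the function names `partition_sets` and `simulate_set` are all assumptions made for illustration, and coherence traffic between caches is deliberately left out. The sketch only shows that accesses mapping to different set indices can be simulated independently, and that sets can be ranked by activity so a few hot sets go to one worker (the host CPU in the abstract) and the many cold sets to another (the GPU).

```python
from collections import Counter, OrderedDict

# Assumed cache geometry, chosen only for illustration.
LINE_BYTES = 64   # bytes per cache line
NUM_SETS = 8      # sets per simulated cache
WAYS = 2          # associativity


def set_index(addr):
    """Map a byte address to its cache set index."""
    return (addr // LINE_BYTES) % NUM_SETS


def partition_sets(trace, hot_count):
    """Rank sets by access count; return (hot_sets, cold_sets).

    The hot sets would go to the host CPU and the cold sets to the GPU
    in the scheme the abstract describes.
    """
    activity = Counter(set_index(addr) for _, addr in trace)
    ranked = sorted(range(NUM_SETS), key=lambda s: (-activity[s], s))
    return ranked[:hot_count], ranked[hot_count:]


def simulate_set(trace, target_set):
    """Count LRU misses for one set index across all per-core caches.

    Only accesses that map to target_set are touched, so each set index
    can be handled by a separate worker. Coherence is not modeled here.
    """
    per_core = {}  # core id -> OrderedDict of resident tags in LRU order
    misses = 0
    for core, addr in trace:
        if set_index(addr) != target_set:
            continue
        tag = addr // (LINE_BYTES * NUM_SETS)
        lru = per_core.setdefault(core, OrderedDict())
        if tag in lru:
            lru.move_to_end(tag)         # hit: mark as most recently used
        else:
            misses += 1
            lru[tag] = True
            if len(lru) > WAYS:
                lru.popitem(last=False)  # evict least recently used tag
    return misses
```

With a trace such as `[(0, 0), (1, 0), (0, 64), (0, 0), (0, 512), (0, 1024), (0, 0)]`, calling `partition_sets(trace, 1)` places set 0 in the hot group, and `simulate_set` can then be run per set index without any shared state between calls.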