Title :
Distributed cooperative shared last-level caching in tiled multiprocessor system on chip
Author :
Damodaran, Preethi P. ; Wallentowitz, Stefan ; Herkersdorf, Andreas
Author_Institution :
Lehrstuhl für Integrierte Syst., Tech. Univ. München, Munich, Germany
Abstract :
In a shared-memory-based tiled many-core system-on-chip architecture, memory accesses present a huge performance bottleneck in terms of both access latency and bandwidth requirements. The best-practice approach to address this issue is to provide a multi-level cache hierarchy and a suitable cache-coherency mechanism. This paper presents a method to increase memory access performance in distributed-directory-coherency-protocol-based tiled many-core systems. The proposed method introduces an alternate design for the system-wide shared last-level caches (LLC) placed between the memory and the node private caches (NPC). The proposed system-wide shared LLC layer is distributed over the entire network and interacts with the home directories of specific cache lines. Results from simulating SPEC2000 benchmark applications executed on a SystemC model of the proposed design show a minimum performance improvement of 20-25% compared to a model without the shared cache layer, at the expense of an additional 2% of the total cache memory space (NPC + LLC memory). In addition, the proposed design shows a minimum 7-15% and an average 14-15% improvement in performance in comparison to a centralized system-wide shared LLC of equivalent size and a dynamic mapped distributed LLC of equivalent size, respectively.
Keywords :
cache storage; distributed shared memory systems; memory protocols; system-on-chip; NPC; SPEC2000 benchmark applications; SystemC model; access latency; cache lines; cache-coherency mechanism; centralized system-wide shared LLC; distributed cooperative shared last-level caching; distributed-directory-coherency-protocol; dynamic mapped distributed LLC; home directories; memory accesses; multilevel cache hierarchy; node private caches; shared cache layer; shared-memory; system-wide shared last-level caches; tiled many-core system-on-chip architecture; tiled multiprocessor system-on-chip; total cache memory space; Analytical models; Coherence; Computational modeling; Data models; Delays; Mathematical model; Tiles;
Conference_Title :
Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014
Conference_Location :
Dresden
DOI :
10.7873/DATE.2014.096