Title :
Parallel cachelets
Author :
Limaye, Deepak ; Rakvic, Ryan ; Shen, John P.
Author_Institution :
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
Date :
2001
Abstract :
A low-latency, high-bandwidth level-1 data cache is crucial for achieving high performance in future superscalar microprocessors. The parallel cachelets (PC) scheme proposed in this paper provides bandwidth close to that of a multi-ported cache with implementation efficiency close to that of a multi-banked cache. In the PC scheme, the traditional level-1 data cache is replaced by a set of parallel, single-ported, independent caches, or cachelets. As in the interleaved multi-banked design, the cachelets can be accessed concurrently to provide bandwidth. However, instead of being mapped to banks in an interleaved fashion based on address bits, data elements are dynamically assigned to cachelets based on the pattern of concurrent accesses, so many bank conflicts can be eliminated. Furthermore, the PC scheme exhibits implicit set associativity, which allows it to outperform a direct-mapped multi-ported cache for some benchmarks. The PC scheme outperforms the multi-banked scheme by an average of 6% (IPC) across a set of SPEC95 benchmarks and comes very close to matching the performance of the multi-ported scheme. When cache access latency is taken into account, the PC scheme even outperforms the multi-ported scheme by 6.4%.
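The contrast between address-interleaved banking and access-pattern-based assignment can be illustrated with a toy model. The sketch below is not the paper's mechanism: the abstract does not specify the assignment policy, so the "first touch goes to the least busy cachelet in that cycle" rule, the function names, and the 32-byte line size are all assumptions made for illustration. The model also ignores capacity, misses, and replacement, and only counts accesses that collide on a single-ported bank or cachelet within one cycle.

```python
from collections import Counter


def interleaved_conflicts(groups, num_banks, line_bytes=32):
    """Count same-cycle collisions when the bank is chosen by address bits.

    `groups` is a list of lists; each inner list holds the byte addresses
    issued in one cycle. Each bank is assumed to be single-ported.
    """
    conflicts = 0
    for group in groups:
        per_bank = Counter((addr // line_bytes) % num_banks for addr in group)
        # Every access beyond the first to the same bank must wait.
        conflicts += sum(n - 1 for n in per_bank.values())
    return conflicts


def dynamic_conflicts(groups, num_cachelets, line_bytes=32):
    """Count same-cycle collisions under a hypothetical dynamic assignment.

    Assumed policy (not taken from the paper): a line seen for the first
    time is placed in the cachelet with the fewest accesses so far in the
    current cycle, and stays there afterwards.
    """
    home = {}  # cache line number -> cachelet index
    conflicts = 0
    for group in groups:
        used = Counter()
        for addr in group:
            line = addr // line_bytes
            if line not in home:
                home[line] = min(range(num_cachelets),
                                 key=lambda c: (used[c], c))
            used[home[line]] += 1
        conflicts += sum(n - 1 for n in used.values())
    return conflicts


# Addresses 0 and 128 fall in the same bank when 4 banks are interleaved
# on 32-byte lines, as do 32 and 160.
groups = [[0, 128], [0, 128], [32, 160]]
print(interleaved_conflicts(groups, 4))  # 3
print(dynamic_conflicts(groups, 4))      # 0
```

In this example every cycle issues two addresses that differ by a multiple of the interleaving stride, so the address-mapped design collides each time, while the assumed dynamic policy separates the pair on first touch and keeps them apart on later accesses.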
Keywords :
cache storage; parallel memories; performance evaluation; SPEC95 benchmarks; bank conflicts; cache access latency; concurrent access; concurrent access pattern; data element dynamic assignment; implementation efficiency; implicit set associativity; inter-processor communication; low-latency high-bandwidth level-1 data cache; multi-banked cache; multi-ported cache; parallel cachelets; parallel single-ported independent caches; superscalar microprocessor performance; Bandwidth; Concurrent computing; Costs; Data engineering; Delay; Hardware; High performance computing; Microprocessors; Performance gain; System performance;
Conference_Title :
Computer Design, 2001. ICCD 2001. Proceedings. 2001 International Conference on
Conference_Location :
Austin, TX
Print_ISBN :
0-7695-1200-3
DOI :
10.1109/ICCD.2001.955041