DocumentCode :
1380306
Title :
Design and evaluation of a switch cache architecture for CC-NUMA multiprocessors
Author :
Iyer, Ravishankar R. ; Bhuyan, Laxmi N.
Author_Institution :
Intel Corp., Beaverton, OR, USA
Volume :
49
Issue :
8
fYear :
2000
fDate :
8/1/2000
Firstpage :
779
Lastpage :
797
Abstract :
Cache coherent nonuniform memory access (CC-NUMA) multiprocessors provide a scalable design for shared memory. However, they continue to suffer from large remote memory access latencies due to comparatively slow memory technology and large data transfer latencies in the interconnection network. In this paper, we propose a novel hardware caching technique, called switch cache, to improve the remote memory access performance of CC-NUMA multiprocessors. The main idea is to implement small, fast caches in the crossbar switches of the interconnect medium to capture and store shared data as they flow from the memory module to the requesting processor. The stored data then serve subsequent requests, reducing the need for remote memory accesses. The implementation of a cache in a crossbar switch needs to be efficient and robust, yet flexible enough to accommodate changes in the caching protocol. The design and implementation details of a CAche Embedded Switch ARchitecture, CAESAR, using wormhole routing with virtual channels are presented. We explore the design space of switch caches by modeling CAESAR in a detailed execution-driven simulator and analyze the performance benefits. Our results show that the CAESAR switch cache can reduce remote memory accesses by up to 45 percent for some applications. By serving remote read requests at various stages in the interconnect, we observe improvements in execution time as high as 20 percent for these applications. We conclude that switch caches provide a cost-effective solution for designing high-performance CC-NUMA multiprocessors.
Keywords :
multiprocessor interconnection networks; network routing; performance evaluation; protocols; shared memory systems; CAESAR; CC-NUMA multiprocessors; cache coherent nonuniform memory access multiprocessors; caching protocol; crossbar switches; interconnection network; memory access latencies; switch cache architecture; virtual channels; wormhole routing; Access protocols; Analytical models; Delay; Hardware; Multiprocessor interconnection networks; Performance analysis; Robustness; Routing; Space exploration; Switches;
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/12.868025
Filename :
868025