DocumentCode
2540699
Title
Improving CC-NUMA performance using Instruction-based Prediction
Author
Kaxiras, Stefanos ; Goodman, James R.
Author_Institution
Lucent Technol., Bell Labs., Murray Hill, NJ, USA
fYear
1999
fDate
9-13 Jan 1999
Firstpage
161
Lastpage
170
Abstract
We propose Instruction-based Prediction as a means to optimize directory-based cache-coherent NUMA shared memory. Instruction-based prediction is based on observing the behavior of load and store instructions in relation to coherent events and predicting their future behavior. Although this technique is well established in the uniprocessor world, it has not been widely applied for optimizing transparent shared memory. Typically, in this environment, prediction is based on data block access history (address-based prediction) in the form of adaptive cache coherence protocols. The advantage of instruction-based prediction is that it requires few hardware resources, in the form of small prediction structures per node, to match (or exceed) the performance of address-based prediction. To show the potential of instruction-based prediction we propose and evaluate three different optimizations: (i) a migratory sharing optimization, (ii) a wide sharing optimization, and (iii) a producer-consumer optimization based on speculative execution. With execution-driven simulation and a set of nine benchmarks we show that (i) for the first two optimizations, instruction-based prediction, using few predictor entries per node, outpaces address-based schemes, and (ii) for the producer-consumer optimization, which uses speculative execution, low misspeculation rates show promise for performance improvements.
Keywords
cache storage; instruction sets; parallel architectures; parallel programming; shared memory systems; CC-NUMA performance; Instruction-based Prediction; adaptive cache coherence protocols; address based prediction; address based schemes; coherent events; data block access history; directory based cache coherent NUMA shared memory; execution driven simulation; hardware resources; migratory sharing optimization; mis speculation rates; producer consumer optimization; small prediction structures; speculative execution; transparent shared memory; wide sharing optimization; Access protocols; Computer architecture; Design optimization; Hardware; History; Prefetching;
fLanguage
English
Publisher
IEEE
Conference_Titel
High-Performance Computer Architecture, 1999. Proceedings. Fifth International Symposium On
Conference_Location
Orlando, FL
Print_ISBN
0-7695-0004-8
Type
conf
DOI
10.1109/HPCA.1999.744359
Filename
744359