Title :
Improving CC-NUMA performance using Instruction-based Prediction
Author :
Kaxiras, Stefanos ; Goodman, James R.
Author_Institution :
Lucent Technol., Bell Labs., Murray Hill, NJ, USA
Abstract :
We propose instruction-based prediction as a means to optimize directory-based, cache-coherent NUMA shared memory. Instruction-based prediction observes the behavior of load and store instructions in relation to coherent events and predicts their future behavior. Although this technique is well established in the uniprocessor world, it has not been widely applied to optimizing transparent shared memory. Typically, in this environment, prediction is based on data block access history (address-based prediction) in the form of adaptive cache coherence protocols. The advantage of instruction-based prediction is that it requires few hardware resources, in the form of small prediction structures per node, to match (or exceed) the performance of address-based prediction. To show the potential of instruction-based prediction, we propose and evaluate three optimizations: (i) a migratory sharing optimization, (ii) a wide sharing optimization, and (iii) a producer-consumer optimization based on speculative execution. With execution-driven simulation and a set of nine benchmarks, we show that (i) for the first two optimizations, instruction-based prediction, using few predictor entries per node, outpaces address-based schemes, and (ii) for the producer-consumer optimization, which uses speculative execution, low misspeculation rates show promise for performance improvements.
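The idea of keying prediction state on the instruction rather than on the data block can be illustrated with a small sketch. The abstract does not describe the paper's actual predictor organization, so everything below is an assumption made for illustration: the class name `InstructionPredictor`, the saturating-counter scheme, the LRU replacement, and the default sizes are all hypothetical. The sketch only shows why a handful of entries per node may suffice: one entry covers every data block that a given load or store instruction touches.

```python
from collections import OrderedDict


class InstructionPredictor:
    """Hypothetical per-node predictor keyed by instruction address (PC).

    Not the paper's design: a saturating counter per instruction records
    how often that instruction's accesses turned out to be migratory.
    """

    def __init__(self, entries=4, threshold=2, max_count=3):
        self.entries = entries        # assumed small table size per node
        self.threshold = threshold    # counter value needed to predict True
        self.max_count = max_count    # saturation limit of the counter
        self.table = OrderedDict()    # pc -> counter, kept in LRU order

    def predict(self, pc):
        """Return True if the instruction at `pc` is predicted migratory."""
        return self.table.get(pc, 0) >= self.threshold

    def update(self, pc, was_migratory):
        """Train the entry for `pc` with the observed outcome."""
        count = self.table.pop(pc, 0)
        if was_migratory:
            count = min(count + 1, self.max_count)
        else:
            count = max(count - 1, 0)
        self.table[pc] = count              # reinsert as most recently used
        if len(self.table) > self.entries:
            self.table.popitem(last=False)  # evict least recently used entry


# Usage: train on two migratory observations of one load, then query it.
predictor = InstructionPredictor(entries=2)
LOAD_PC = 0x400  # hypothetical instruction address
for _ in range(2):
    predictor.update(LOAD_PC, was_migratory=True)
print(predictor.predict(LOAD_PC))
```

An address-based scheme would instead index the table by data block address, so its state would grow with the number of shared blocks rather than with the number of static load and store instructions.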
Keywords :
cache storage; instruction sets; parallel architectures; parallel programming; shared memory systems; CC-NUMA performance; Instruction-based Prediction; adaptive cache coherence protocols; address based prediction; address based schemes; coherent events; data block access history; directory based cache coherent NUMA shared memory; execution driven simulation; hardware resources; migratory sharing optimization; mis speculation rates; producer consumer optimization; small prediction structures; speculative execution; transparent shared memory; wide sharing optimization; Access protocols; Computer architecture; Design optimization; Hardware; History; Prefetching;
Conference_Title :
High-Performance Computer Architecture, 1999. Proceedings. Fifth International Symposium On
Conference_Location :
Orlando, FL
Print_ISBN :
0-7695-0004-8
DOI :
10.1109/HPCA.1999.744359