DocumentCode
2572924
Title
The use of prediction for accelerating upgrade misses in cc-NUMA multiprocessors
Author
Acacio, Manuel E. ; Gonzalez, Jose ; García, José M. ; Duato, José
Author_Institution
Univ. de Murcia, Spain
fYear
2002
fDate
2002
Firstpage
155
Lastpage
164
Abstract
This work focuses on accelerating upgrade misses in cc-NUMA multiprocessors. These misses are caused by store instructions for which a read-only copy of the line is found in the L2 cache. An upgrade miss requires a message from the missing node to the directory, a directory lookup to find the set of sharers, invalidation messages sent to the sharers, and responses to the invalidations sent back. The penalty paid by these misses is therefore not negligible, especially since they account for a high percentage of the total miss rate. We propose the use of prediction to give cc-NUMA multiprocessors more efficient support for upgrade misses by invalidating sharers directly from the missing node. Our proposal comprises an effective prediction scheme that achieves high hit rates and a coherence protocol extended to support the use of prediction. Our work is motivated by two key observations: first, upgrade misses exhibit repetitive behavior and, second, the total number of sharers being invalidated is small (one, in some cases). Using execution-driven simulations, we show that prediction can significantly accelerate upgrade misses (latency reductions of more than 40% in some cases). These improvements translate into application speed-ups of up to 14%. Finally, these results can be obtained with a predictor whose total size is less than 48 KB in every node.
Keywords
cache storage; delays; memory protocols; shared memory systems; L2 cache; cc-NUMA multiprocessors; coherence protocol; direct invalidation; execution-driven simulations; latency reductions; prediction; repetitive behavior; sharers; upgrade miss acceleration; Acceleration; Access protocols; Coherence; Delay; Hardware; Out of order; Predictive models; Proposals;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Architectures and Compilation Techniques, 2002. Proceedings. 2002 International Conference on
ISSN
1089-795X
Print_ISBN
0-7695-1620-3
Type
conf
DOI
10.1109/PACT.2002.1106014
Filename
1106014