Title :
Eliminating squashes through learning cross-thread violations in speculative parallelization for multiprocessors
Author :
Cintra, Marcelo ; Torrellas, Josep
Author_Institution :
Div. of Informatics, Edinburgh Univ., UK
Abstract :
With speculative thread-level parallelization, codes that cannot be fully compiler-analyzed are aggressively executed in parallel. If the hardware detects a cross-thread dependence violation, it squashes offending threads and resumes execution. Unfortunately, frequent squashing cripples performance. This paper proposes a new framework of hardware mechanisms to eliminate most squashes due to data dependences in multiprocessors. The framework works by learning and predicting violations, and applying delayed disambiguation, value prediction, and stall and release. The framework is suited for directory-based multiprocessors that track memory accesses at the system level with the coarse granularity of memory lines. Simulations of a 16-processor machine show that the framework is very effective. By adding our framework to a speculative CC-NUMA with 64-byte memory lines, we speed up applications by an average of 4.3 times. Moreover, the resulting system is even 23% faster than a machine that tracks memory accesses at the fine granularity of words, a sophisticated system that is not compatible with mainstream cache coherence protocols.
Keywords :
multiprocessing systems; protocols; synchronisation; 64-byte memory lines; CC-NUMA; coarse granularity; cross-thread violations; hardware mechanisms; multiprocessors; speculative parallelization; speculative thread-level parallelization; value prediction; Access protocols; Automatic control; Computer science; Data mining; Delay; Hardware; Informatics; Parallel processing; Resumes; Yarn;
Conference_Title :
High-Performance Computer Architecture, 2002. Proceedings. Eighth International Symposium on
Print_ISBN :
0-7695-1525-8
DOI :
10.1109/HPCA.2002.995697