DocumentCode :
3522475
Title :
BOLT: Energy-efficient Out-of-Order Latency-Tolerant execution
Author :
Hilton, Andrew ; Roth, Amir
Author_Institution :
Dept. of Comput. & Inf. Sci., Univ. of Pennsylvania, Philadelphia, PA, USA
fYear :
2010
fDate :
9-14 Jan. 2010
Firstpage :
1
Lastpage :
12
Abstract :
LT (latency tolerant) execution is an attractive candidate technique for future out-of-order cores. LT defers the forward slices of LLC (last-level cache) misses to a slice buffer and re-executes them when the misses return. An LT core increases ILP without physically scaling the issue queue and register file and increases MLP without additional software threads that can reduce cache performance. Unfortunately, proposed LT designs are not energy efficient. They require too many additional structures and they defer and re-execute too many instructions to justify their performance gains. In this paper, we address these inefficiencies. We introduce a microarchitecture called BOLT (Better Out-of-Order Latency-Tolerance) that implements LT as an alternative use of SMT (Simultaneous Multi-Threading). We also present a new slice buffer organization and traversal scheme that increases performance and reduces overhead by pruning instances of useless and redundant LT. Collectively, these modifications turn out-of-order LT into a technique that improves performance in an energy-efficient way.
Keywords :
cache storage; memory architecture; parallel architectures; parallel memories; performance evaluation; tolerance analysis; BOLT; ILP; LLC; MLP; better out-of-order latency-tolerance; cache performance; candidate technique; energy-efficient out-of-order latency-tolerant execution; forward slices; future out-of-order cores; instruction level parallelism; last-level cache; memory-level parallelism; microarchitecture; performance gains; redundant LT; register file; slice buffer; slice buffer organization; software threads; Delay; Energy efficiency; Fasteners; Information science; Microarchitecture; Out of order; Performance gain; Software performance; Surface-mount technology; Yarn;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High Performance Computer Architecture (HPCA), 2010 IEEE 16th International Symposium on
Conference_Location :
Bangalore
ISSN :
1530-0897
Print_ISBN :
978-1-4244-5658-1
Type :
conf
DOI :
10.1109/HPCA.2010.5416634
Filename :
5416634