DocumentCode
602624
Title
How to implement effective prediction and forwarding for fusable dynamic multicore architectures
Author
Robatmili, B. ; Dong Li ; Esmaeilzadeh, H. ; Govindan, S. ; Smith, A. ; Putnam, A. ; Burger, Doug ; Keckler, Stephen W.
Author_Institution
Qualcomm Res. Silicon Valley, CA, USA
fYear
2013
fDate
23-27 Feb. 2013
Firstpage
460
Lastpage
471
Abstract
Dynamic multicore architectures, which fuse and split cores at run time, potentially offer a level of performance/energy agility that static multicore designs cannot achieve. Conventional ISAs, however, have scalability limits to fusion. EDGE-based designs offer greater scalability but to date have been performance-limited by significant microarchitectural bottlenecks. This paper addresses these issues and makes three major contributions. First, it proposes Iterative Path Prediction to address low next-block prediction accuracy and low speculation rates. It achieves close to taken/not-taken prediction accuracy for multi-exit instruction blocks while also speculating the predicated execution path within the block. Second, the paper proposes Exposed Operand Broadcasts to address the overhead of operand delivery for high-fanout instructions by exposing a small number of broadcast operands in the ISA. Third, the paper presents a scalable composable architecture called T3 that uses these mechanisms and shows that it can operate across a wide range of the power and performance spectrum, significantly increasing energy efficiency and performance. Compared to previous EDGE designs, T3 improves energy efficiency by about 2x and performance by up to 50%.
Keywords
energy conservation; instruction sets; integrated circuit design; iterative methods; multiprocessing systems; performance evaluation; power aware computing; EDGE-based designs; ISAs; T3 architecture; broadcast operands; core fusion; core splitting; energy efficiency; execution path; exposed operand broadcasts; fusable dynamic multicore architectures; iterative path prediction; low speculation rates; microarchitectural bottlenecks; multiexit instruction blocks; operand delivery; performance spectrum; performance-energy agility; power spectrum; prediction accuracy; scalable composable architecture; static multicore designs; Accuracy; History; Microarchitecture; Multicore processing; Out of order; Registers;
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Computer Architecture (HPCA2013), 2013 IEEE 19th International Symposium on
Conference_Location
Shenzhen
ISSN
1530-0897
Print_ISBN
978-1-4673-5585-8
Type
conf
DOI
10.1109/HPCA.2013.6522341
Filename
6522341