Regional Information Center for Science and Technology - The Ultrascalar processor-an asymptotically scalable superscalar microarchitecture

DocumentCode :

2689147

Title :

The Ultrascalar processor-an asymptotically scalable superscalar microarchitecture

Author :

Henry, Dana S. ; Kuszmaul, Bradley C. ; Viswanath, Vinod

Author_Institution :

Depts. of Comput. Sci. & Electr. Eng., Yale Univ., New Haven, CT, USA

fYear :

1999

fDate :

21-24 Mar 1999

Firstpage :

256

Lastpage :

273

Abstract :

The poor scalability of existing superscalar processors has been of great concern to the computer engineering community. In particular, the critical-path lengths of many components in existing implementations grow as $\Theta(n^2)$, where $n$ is the fetch width, the issue width, or the window size. This paper presents a novel implementation, called the Ultrascalar processor, that dramatically reduces the asymptotic critical-path length of a superscalar processor. The processor is implemented by a large collection of ALUs with controllers (together called execution stations) connected by a network of parallel-prefix tree circuits. A fat-tree network connects an interleaved cache to the execution stations. These networks provide the full functionality of superscalar processors, including renaming, out-of-order execution, and speculative execution. The Ultrascalar's critical-path length due to gate delays is $\tau_{\text{gates}} = \Theta(\log n)$. The wire delays and chip size depend on the provided memory bandwidth and the layout. If the provided memory bandwidth is $M(n)$ memory operations per clock cycle, then, using an H-tree VLSI layout, the critical-path length due to wire delay (speed-of-light delay) is $\tau_{\text{wires}} = \begin{cases} \Theta(n^{1/2}) & \text{if } M(n) \text{ is } O(n^{1/2-\varepsilon}) \text{ for } \varepsilon > 0 \text{ [optimal]}; \\ \Theta(n^{1/2} \log n) & \text{if } M(n) \text{ is } \Theta(n^{1/2}) \text{ [near optimal]}; \\ \Theta(M(n)) & \text{if } M(n) \text{ is } \Omega(n^{1/2+\varepsilon}) \text{ for } \varepsilon > 0 \text{ [optimal]} \end{cases}$ (with $M$ suitably constrained). The area is the square of the wire delay.
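
As a rough illustration of why a parallel-prefix network can give a $\Theta(\log n)$ gate-delay bound, the following minimal sketch propagates "the most recent earlier write to one register" across n execution stations in about log2(n) combining rounds. It is a software analogy only: it uses a Hillis-Steele style scan rather than the paper's tree circuits, and the names `combine` and `latest_writer_scan` and the single-register model are assumptions made for this example, not taken from the paper.

```python
from typing import List, Optional, Tuple


def combine(left: Optional[int], right: Optional[int]) -> Optional[int]:
    """Associative operator: the later station's write wins if it writes at all."""
    return right if right is not None else left


def latest_writer_scan(
    writes: List[Optional[int]], initial: int
) -> Tuple[List[int], int]:
    """Hypothetical model of renaming for ONE register.

    writes[i] is the value station i writes, or None if it does not write.
    Returns (visible, rounds): visible[i] is the value station i would read
    (the latest write by an earlier station, else `initial`), and rounds is
    the number of combining rounds used, which grows as ceil(log2(n)).
    """
    n = len(writes)
    acc = list(writes)  # becomes an inclusive prefix scan under `combine`
    rounds = 0
    step = 1
    while step < n:
        # Every position combines with the partial result `step` places earlier;
        # in hardware all of these combinations would happen in parallel.
        acc = [
            combine(acc[i - step], acc[i]) if i >= step else acc[i]
            for i in range(n)
        ]
        step *= 2
        rounds += 1
    # Station i reads the inclusive result of station i-1 (exclusive prefix).
    visible = [
        initial if i == 0 or acc[i - 1] is None else acc[i - 1]
        for i in range(n)
    ]
    return visible, rounds


if __name__ == "__main__":
    values, depth = latest_writer_scan(
        [5, None, None, 7, None, 9, None, None], initial=0
    )
    print(values, depth)
```

Because `combine` is associative, the same result could be computed by a tree-shaped circuit; the sketch only shows that the number of sequential combining steps grows logarithmically with the number of stations, while each station still sees the value it would see under sequential execution.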

Keywords :

CMOS digital integrated circuits; VLSI; delay estimation; integrated circuit layout; microprocessor chips; parallel architectures; ALUs; H-tree VLSI layout; Ultrascalar processor; asymptotic critical-path length reduction; asymptotically scalable superscalar microarchitecture; controllers; execution stations; fat-tree network; gate delays; interleaved cache; memory bandwidth; out-of-order execution; parallel-prefix tree circuits; renaming; speculative execution; wire delay; Bandwidth; Circuits; Clocks; Computer science; Delay; Delay lines; Design optimization; Microarchitecture; Out of order; Parallel processing; Registers; Scalability; Very large scale integration; Wire;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Advanced Research in VLSI, 1999. Proceedings. 20th Anniversary Conference on

Conference_Location :

Atlanta, GA

ISSN :

1522-869X

Print_ISBN :

0-7695-0056-0

Type :

conf

DOI :

10.1109/ARVLSI.1999.756053

Filename :

756053

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2689147