Regional Information Center for Science and Technology (RICEST) - Energy-Efficient Floating-Point Unit Design

DocumentCode :

1514477

Title :

Energy-Efficient Floating-Point Unit Design

Author :

Galal, Sameh ; Horowitz, Mark

Author_Institution :

Dept. of Electr. Eng., Stanford Univ., Stanford, CA, USA

Volume :

Issue :

fYear :

2011

fDate :

7/1/2011

Firstpage :

913

Lastpage :

922

Abstract :

Energy-efficient computation is critical if we are going to continue to scale performance in power-limited systems. For floating-point applications that have large amounts of data parallelism, one should optimize the throughput/mm² given a power density constraint. We present a method for creating a trade-off curve that can be used to estimate the maximum floating-point performance given a set of area and power constraints. Looking at FP multiply-add units and ignoring register and memory overheads, we find that in a 90 nm CMOS technology at 1 W/mm², one can achieve a performance of 27 GFlops/mm² single precision, and 7.5 GFlops/mm² double precision. Adding register file overheads reduces the throughput by less than 50 percent if the compute intensity is high. Since the energy of the basic gates is no longer scaling rapidly, maintaining constant power density with scaling requires moving the overall FP architecture to a lower energy/performance point. A 1 W/mm² design at 90 nm is a "high-energy" design, so scaling it to a lower energy design in 45 nm still yields a 7× performance gain, while a more balanced 0.1 W/mm² design only speeds up by 3.5× when scaled to 45 nm. Performance scaling below 45 nm rapidly decreases, with a projected improvement of only ~3× for both power densities when scaling to a 22 nm technology.
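
As a rough illustration of what the 90 nm figures imply, dividing the power density by the throughput density gives an energy per floating-point operation. The sketch below is plain arithmetic on the numbers quoted in the abstract, not the paper's trade-off method. It takes both throughput figures as per mm² and, like the abstract, ignores register and memory overheads. The function and constant names are hypothetical.

```python
# Figures quoted in the abstract: 90 nm CMOS at a 1 W/mm^2 power density.
# Assumption: both throughput figures are per mm^2 of FP multiply-add area.
POWER_DENSITY_W_PER_MM2 = 1.0
THROUGHPUT_GFLOPS_PER_MM2 = {"single": 27.0, "double": 7.5}


def energy_per_flop_pj(power_density_w_per_mm2, gflops_per_mm2):
    """Return energy per operation in picojoules.

    (W/mm^2) / (flop/s/mm^2) = J/flop; the area terms cancel.
    """
    joules_per_flop = power_density_w_per_mm2 / (gflops_per_mm2 * 1e9)
    return joules_per_flop * 1e12


for precision, gflops in THROUGHPUT_GFLOPS_PER_MM2.items():
    pj = energy_per_flop_pj(POWER_DENSITY_W_PER_MM2, gflops)
    print(f"{precision} precision: {pj:.1f} pJ/flop")
```

Under these assumptions the implied cost is roughly 37 pJ per single-precision operation and roughly 133 pJ per double-precision operation.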

Keywords :

CMOS logic circuits; adders; floating point arithmetic; logic design; low-power electronics; multiplying circuits; nanoelectronics; power aware computing; CMOS technology; FP architecture; FP multiply-add units; area constraint; data parallelism; energy-efficient computation; energy-efficient floating-point unit design; floating-point application; floating-point performance; high-energy design; logic structure; power density constraint; power-limited system; register file overhead; trade-off curve; Computer architecture; Energy efficiency; Optimization; Pipeline processing; Registers; Threshold voltage; Throughput; Arithmetic and logic structures; floating point; fused multiply-add; high-speed arithmetic; throughput/mm² optimization

fLanguage :

English

Journal_Title :

Computers, IEEE Transactions on

Publisher :

IEEE

ISSN :

0018-9340

Type :

jour

DOI :

10.1109/TC.2010.121

Filename :

5483287

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1514477