DocumentCode
3320736
Title
Design of Low-Cost High-Performance Floating-Point Fused Multiply-Add with Reduced Power
Author
Qi, Zichu ; Guo, Qi ; Zhang, Ge ; Li, Xiangku ; Hu, Weiwu
fYear
2010
fDate
3-7 Jan. 2010
Firstpage
206
Lastpage
211
Abstract
This paper presents a floating-point fused multiply-add (FMA) unit that uses low-cost and low-power techniques. To improve performance, two single-precision operations can be performed concurrently on one double-precision datapath, which is useful in multimedia and even scientific applications. To reduce the additional area cost of supporting two single-precision operations in parallel, several double-precision units, i.e., the multiplier, shifter, and adder, are reused as much as possible. A modified dual-path algorithm is proposed that classifies the exponent difference into three cases and implements them with close and far paths; this can reduce latency and helps lower power consumption, because only one of the two paths is enabled. In addition, for FADD instructions, the multiplier in the first stage is bypassed and kept in stable mode, which can significantly improve FADD performance and lower power consumption. The overall FMA operation has a latency of 4 cycles, while the FADD operation takes 3 cycles. Each cycle has a delay of about 0.66 ns in ST 65 nm CMOS technology. Compared with a conventional double-precision FMA, delay is reduced by about 13% and area is increased by about 22%, which is acceptable since two single-precision results can be generated simultaneously.
Keywords
CMOS logic circuits; adders; floating point arithmetic; multiplying circuits; ST 65 nm CMOS technology; close paths; double-precision datapath; far paths; floating-point addition instructions; floating-point fused multiply-add unit; low power technique; low-cost technique; modified dual-path algorithm; multimedia; multiple double-precision units; scientific applications; single-precision operations; size 65 nm; CMOS technology; Computer architecture; Costs; Delay effects; Energy consumption; Hardware; Laboratories; Roundoff errors; Throughput; Very large scale integration; FMA; dual-path FMA; low-power design
fLanguage
English
Publisher
IEEE
Conference_Titel
VLSI Design, 2010. VLSID '10. 23rd International Conference on
Conference_Location
Bangalore
ISSN
1063-9667
Print_ISBN
978-1-4244-5541-6
Type
conf
DOI
10.1109/VLSI.Design.2010.41
Filename
5401295