Title :
Design of Low-Cost High-Performance Floating-Point Fused Multiply-Add with Reduced Power
Author :
Qi, Zichu ; Guo, Qi ; Zhang, Ge ; Li, Xiangku ; Hu, Weiwu
Abstract :
This paper presents a floating-point fused multiply-add (FMA) unit that combines low-cost and low-power techniques. To improve performance, two single-precision operations can be performed concurrently on one double-precision datapath, which is useful in multimedia and even scientific applications. To limit the additional area cost of supporting two parallel single-precision operations, the double-precision units (the multiplier, shifter and adder) are reused as much as possible. A modified dual-path algorithm is proposed that classifies the exponent difference into three cases and implements them with close and far paths; this reduces latency and helps lower power consumption because only one of the two paths needs to be enabled. In addition, for FADD instructions, the multiplier in the first stage is bypassed and kept stable, which significantly improves FADD performance and lowers power consumption. The overall FMA unit has a latency of 4 cycles, while the FADD operation takes 3 cycles. Each cycle has a delay of about 0.66 ns in ST 65 nm CMOS technology. Compared with a conventional double-precision FMA, delay is reduced by about 13% and area is increased by about 22%, which is acceptable since two single-precision results can be generated simultaneously.
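As background for the abstract, the following is a minimal Python sketch of what distinguishes a fused multiply-add from a separate multiply followed by an add: the fused form rounds once, after the exact a*b + c is formed. It is a software reference model only, not the hardware design described in the paper. The function name `fma_reference` and the operand values are assumptions chosen for illustration.

```python
from fractions import Fraction

def fma_reference(a: float, b: float, c: float) -> float:
    """Hypothetical reference model: compute a*b + c exactly, then round once."""
    # Fraction(float) is exact, so the product and sum carry no rounding error.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    # Converting a Fraction to float rounds to the nearest double (single rounding).
    return float(exact)

# Operands chosen so that the product a*b = 1 - 2**-60 is not representable
# in double precision and rounds to 1.0 when computed on its own.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

unfused = a * b + c             # two roundings: the product rounds to 1.0 first
fused = fma_reference(a, b, c)  # one rounding: the small term survives

print(unfused)  # 0.0
print(fused)    # -8.673617379884035e-19, i.e. -2**-60
```

The unfused result loses the low-order term entirely, while the single-rounding result keeps it, which is the numerical reason an FMA unit is treated as its own operation rather than a multiplier and adder in sequence.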
Keywords :
CMOS logic circuits; adders; floating point arithmetic; multiplying circuits; ST 65 nm CMOS technology; close paths; double-precision datapath; far paths; floating-point addition instructions; floating-point fused multiply-add unit; low power technique; low-cost technique; modified dual-path algorithm; multimedia; multiple double-precision units; scientific applications; single-precision operations; size 65 nm; CMOS technology; Computer architecture; Costs; Delay effects; Energy consumption; Hardware; Laboratories; Roundoff errors; Throughput; Very large scale integration; FMA; dual-path FMA; low-power design;
Conference_Title :
VLSI Design, 2010. VLSID '10. 23rd International Conference on
Conference_Location :
Bangalore
Print_ISBN :
978-1-4244-5541-6
DOI :
10.1109/VLSI.Design.2010.41