• DocumentCode
    3320736
  • Title
    Design of Low-Cost High-Performance Floating-Point Fused Multiply-Add with Reduced Power
  • Author
    Qi, Zichu ; Guo, Qi ; Zhang, Ge ; Li, Xiangku ; Hu, Weiwu
  • fYear
    2010
  • fDate
    3-7 Jan. 2010
  • Firstpage
    206
  • Lastpage
    211
  • Abstract
    This paper presents a floating-point fused multiply-add (FMA) unit that uses low-cost and low-power techniques. To improve performance, two single-precision operations can be performed concurrently on one double-precision datapath, which is useful in multimedia and even scientific applications. To reduce the additional area cost of supporting two single-precision operations in parallel, the double-precision units, i.e., the multiplier, shifter and adder, are reused as much as possible. A modified dual-path algorithm is proposed that classifies the exponent difference into three cases and implements them with close and far paths; this reduces latency and lowers power consumption, because only one of the two paths is enabled at a time. In addition, for FADD instructions, the multiplier in the first stage is bypassed and kept in stable mode, which significantly improves FADD performance and lowers power consumption. The FMA operation has a latency of 4 cycles, while the FADD operation takes 3 cycles. The cycle time is about 0.66 ns in ST 65 nm CMOS technology. Compared with a conventional double-precision FMA, delay is reduced by about 13% and area is increased by about 22%, which is acceptable since two single-precision results can be generated simultaneously.
  • Keywords
    CMOS logic circuits; adders; floating point arithmetic; multiplying circuits; ST 65 nm CMOS technology; close paths; double-precision datapath; far paths; floating-point addition instructions; floating-point fused multiply-add unit; low power technique; low-cost technique; modified dual-path algorithm; multimedia; multiple double-precision units; scientific applications; single-precision operations; size 65 nm; CMOS technology; Computer architecture; Costs; Delay effects; Energy consumption; Hardware; Laboratories; Roundoff errors; Throughput; Very large scale integration; FMA; dual-path FMA; low-power design;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    VLSI Design, 2010. VLSID '10. 23rd International Conference on
  • Conference_Location
    Bangalore
  • ISSN
    1063-9667
  • Print_ISBN
    978-1-4244-5541-6
  • Type
    conf
  • DOI
    10.1109/VLSI.Design.2010.41
  • Filename
    5401295