• DocumentCode
    1043365
  • Title

    Design tradeoff analysis of floating-point adders in FPGAs

  • Author

    Malik, A. ; Dongdong Chen ; Younhee Choi ; Moon Lee ; Seok-Bum Ko

  • Author_Institution
    Univ. of Saskatchewan, Saskatoon, SK
  • Volume
    33
  • Issue
    3/4
  • fYear
    2008
  • Firstpage
    169
  • Lastpage
    175
  • Abstract
    With gate counts of ten million, field-programmable gate arrays (FPGAs) are becoming suitable for floating-point computations. Addition is the most complex operation in a floating-point unit and can cause major delay while requiring a significant area. Over the years, the VLSI community has developed many floating-point adder algorithms aimed primarily at reducing the overall latency. An efficient design of the floating-point adder offers major area and performance improvements for FPGAs. Given recent advances in FPGA architecture and area density, latency has become the main focus in attempts to improve performance. This paper studies the implementation of three floating-point addition algorithms in FPGAs: standard, leading-one predictor (LOP), and far and close datapath (2-path). Each algorithm has complex sub-operations which contribute significantly to the overall latency of the design. Different implementations of each sub-operation are investigated and then synthesized onto a Xilinx Virtex-II Pro FPGA device. The standard and LOP algorithms are also pipelined into five stages and compared with the Xilinx IP. According to the results, the standard algorithm is the best implementation with respect to area, occupying 541 slices, but has a large overall latency of 27.059 ns. The LOP algorithm reduces latency by 6.5% at the cost of a 38% increase in area compared to the standard algorithm. The 2-path implementation shows a 19% reduction in latency at the cost of an 88% increase in area compared to the standard algorithm. The five-stage pipelined standard implementation shows a 6.4% improvement in clock speed compared to the Xilinx IP with a 23% smaller area requirement. The five-stage pipelined LOP implementation shows a 22% improvement in clock speed compared to the Xilinx IP at a cost of 15% more area.
  • Keywords
    VLSI; adders; field programmable gate arrays; floating point arithmetic; FPGA; VLSI community; Xilinx Virtex-II Pro FPGA device; complex operation; complex suboperations; design tradeoff analysis; field-programmable gate arrays; floating-point adders; leading-one predictor; Adders; Algorithm design and analysis; Clocks; Costs; Delay; Field programmable gate arrays; Gold; Moon; Pipelines; Very large scale integration; floating-point adder
  • fLanguage
    English
  • Journal_Title
    Electrical and Computer Engineering, Canadian Journal of
  • Publisher
    IEEE
  • ISSN
    0840-8688
  • Type

    jour

  • DOI
    10.1109/CJECE.2008.4721634
  • Filename
    4721634