• DocumentCode
    3535405
  • Title

    Data parallel FPGA workloads: Software versus hardware

  • Author

    Yiannacouras, Peter ; Steffan, J. Gregory ; Rose, Jonathan

  • Author_Institution
    Edward S. Rogers Sr. Dept. of Electr. & Comput. Eng., Univ. of Toronto, Toronto, ON, Canada
  • fYear
    2009
  • fDate
    Aug. 31 2009-Sept. 2 2009
  • Firstpage
    51
  • Lastpage
    58
  • Abstract
    Commercial soft processors are unable to effectively exploit the data parallelism present in many embedded systems workloads, requiring FPGA designers to exploit it (laboriously) with manual hardware design. Recent research has demonstrated that soft processors augmented with support for vector instructions provide significant improvements in performance and scalability for data parallel workloads. These soft vector processors provide a software environment for quickly encoding data parallel computation, but their competitiveness with manual hardware design in terms of area and performance remains unknown. In this work, using an FPGA platform equipped with DDR memory executing data-parallel EEMBC embedded benchmarks, we measure the area/performance gaps between (i) a scalar soft processor, (ii) our improved soft vector processor, and (iii) custom FPGA hardware. We demonstrate that the 432× wall clock performance gap between scalar-executed C and custom hardware can be reduced significantly to 17× using our improved soft vector processor, while silicon-efficiency is improved by 3× in terms of area-delay product. We modified the architecture to mitigate three key advantages we observed in custom hardware: loop overhead, data delivery, and exact resource usage. Combined, these improvements increase performance by 3× and reduce area by almost half, significantly reducing the need for designers to resort to more challenging custom hardware implementations.
  • Keywords
    C language; embedded systems; field programmable gate arrays; hardware-software codesign; microcomputers; parallel programming; program processors; C language; DDR memory; FPGA platform; data parallel computation; embedded systems workloads; soft vector processors; Area measurement; Concurrent computing; Embedded system; Encoding; Field programmable gate arrays; Hardware; Parallel processing; Scalability; Software performance; Vector processors;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Field Programmable Logic and Applications, 2009. FPL 2009. International Conference on
  • Conference_Location
    Prague
  • ISSN
    1946-1488
  • Print_ISBN
    978-1-4244-3892-1
  • Electronic_ISBN
    1946-1488
  • Type

    conf

  • DOI
    10.1109/FPL.2009.5272551
  • Filename
    5272551