• DocumentCode
    3416284
  • Title
    Design of throughput-optimized arrays from recurrence abstractions
  • Author
    Jacob, Arpith C.; Buhler, Jeremy D.; Chamberlain, Roger D.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Washington Univ. in St. Louis, St. Louis, MO, USA
  • fYear
    2010
  • fDate
    7-9 July 2010
  • Firstpage
    133
  • Lastpage
    140
  • Abstract
    Many compute-bound applications have seen order-of-magnitude speedups using special-purpose accelerators. FPGAs in particular are good at implementing recurrence equations realized as arrays. Existing high-level synthesis approaches for recurrence equations produce an array that is latency-space optimal. We target applications that operate on a large collection of small inputs, e.g. a database of biological sequences, where overall throughput is the most important measure of performance. In this work, we introduce a new design-space exploration procedure within the polyhedral framework to optimize throughput of a systolic array subject to area and bandwidth constraints of an FPGA device. Our approach is to exploit additional parallelism by pipelining multiple inputs on an array and multiple iteration vectors in a processing element. We prove that the throughput of an array is given by the inverse of the maximum number of iteration vectors executed by any processor in the array, which is determined solely by the array's projection vector. We have applied this observation to discover novel arrays for Nussinov RNA folding. Our throughput-optimized array is 2× faster than the standard latency-space optimal array, yet it uses 15% fewer LUT resources. We achieve a further 2× speedup by processor pipelining, with only a 37% increase in resources. Our tool suggests additional arrays that trade area for throughput and are 4–5× faster than the currently used latency-optimized array. These novel arrays are 70–172× faster than a software baseline.
  • Keywords
    Computer applications; Constraint optimization; Databases; Design optimization; Difference equations; Field programmable gate arrays; High level synthesis; Pipeline processing; Systolic arrays; Throughput; Dynamic Programming; FPGA; Recurrences; Systolic Array; Throughput Optimization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Application-specific Systems Architectures and Processors (ASAP), 2010 21st IEEE International Conference on
  • Conference_Location
    Rennes, France
  • ISSN
    2160-0511
  • Print_ISBN
    978-1-4244-6966-6
  • Electronic_ISBN
    2160-0511
  • Type
    conf
  • DOI
    10.1109/ASAP.2010.5540753
  • Filename
    5540753