DocumentCode
2320446
Title
Scalable FPGA-array for high-performance and power-efficient computation based on difference schemes
Author
Sano, Kentaro ; Wang, Luzhou ; Hatsuda, Yoshiaki ; Yamamoto, Satoru
Author_Institution
Grad. Sch. of Inf. Sci., Tohoku Univ., Sendai
fYear
2008
fDate
16 Nov. 2008
Firstpage
1
Lastpage
9
Abstract
For numerical computations that require a relatively high ratio of data accesses to operations, scalable memory bandwidth is key to improving performance. In this paper, we propose a scalable FPGA array for building custom computing machines for high-performance, power-efficient scientific simulations based on difference schemes. On the FPGA array, we construct a systolic computational-memory array (SCMA) by partitioning it homogeneously among multiple tightly coupled FPGAs. A large SCMA implemented on many FPGAs achieves high-performance computation, with memory bandwidth and arithmetic performance that scale with the array size. To demonstrate feasibility and evaluate the design quantitatively, we design and implement an SCMA of 192 processing elements across two Altera Stratix II FPGAs. The implemented SCMA, running at 106 MHz, achieves sustained performance of 32.8 to 36.5 GFlops in single precision for three benchmark computations, against a peak performance of 40.7 GFlops. Compared with a 3.4 GHz Pentium 4 processor, the SCMAs consume 70% to 87% of the power and require only 3% to 7% of the energy for the same computations. Based on a requirement model for inter-FPGA bandwidth, we show that SCMAs are completely scalable across currently available FPGAs, from high-end to low-end, and the SCMA implemented on two FPGAs delivers twice the performance of the single-FPGA SCMA.
Keywords
arithmetic; field programmable gate arrays; finite difference methods; mathematics computing; systolic arrays; difference schemes; power-efficient computation; power-efficient scientific simulations; scalable FPGA-array; scalable arithmetic-performance; scalable memory-bandwidth; systolic computational-memory array; Application software; Bandwidth; Computational fluid dynamics; Computational modeling; Energy consumption; Field programmable gate arrays; High performance computing; Microprocessors; Partial differential equations; Scalability;
fLanguage
English
Publisher
IEEE
Conference_Titel
High-Performance Reconfigurable Computing Technology and Applications, 2008. HPRCTA 2008. Second International Workshop on
Conference_Location
Austin, TX
Print_ISBN
978-1-4244-2826-7
Type
conf
DOI
10.1109/HPRCTA.2008.4745679
Filename
4745679