Title :
A universal FPGA-based floating-point matrix processor for mobile systems
Author :
Wenqiang Wang ; Kaiyuan Guo ; Mengyuan Gu ; Yuchun Ma ; Yu Wang
Author_Institution :
Electron. Eng. Dept., Tsinghua Univ., Beijing, China
Abstract :
FPGA-based acceleration of matrix operations is a promising solution for mobile systems. However, most related work focuses on a single operation rather than a complete system. In this paper, we explore the possibility of integrating multiple matrix accelerators with a master processor and propose a universal floating-point matrix processor. The processor supports multiple matrix-matrix operations (Level 3 BLAS) and places no limit on the matrix size. The key component of the processor is a shared matrix cache, which enables on-chip communication between different accelerators. This structure reduces the external memory bandwidth requirement and improves overall performance. To improve the performance of the whole system, an asynchronous instruction execution mechanism is further proposed in the hardware-software interface to reduce the workload of the master processor. We demonstrate the system on a DE3 development board and achieve a computing performance of about 19 GFLOPS. Experiments show that the proposed processor achieves higher performance and energy efficiency than some state-of-the-art embedded processors, including the ARM Cortex-A9 and the NIOS II/f soft-core processor. Its performance is even comparable to that of some desktop processors.
Keywords :
field programmable gate arrays; floating point arithmetic; matrix algebra; ARM cortex A9; DE3 develop board; GFLOPS; NIOS II/f soft-core processor; desktop processor; energy efficiency; field programmable gate array; hardware-software interface; master processor; matrix accelerator; matrix cache; mobile system; multiple matrix-matrix operation; on-chip communication; universal FPGA-based floating-point matrix processor; Arrays; Hardware; Mobile communication; Ports (Computers); Random access memory; Sparse matrices; Vector processors;
Conference_Title :
Field-Programmable Technology (FPT), 2014 International Conference on
Print_ISBN :
978-1-4799-6244-0
DOI :
10.1109/FPT.2014.7082766