Title :
Design of a massively parallel computing architecture for dense matrix multiplication
Author :
Jose, Wilson ; Silva, A.R. ; Vestias, Mario ; Neto, Horacio
Author_Institution :
INESC-ID, Lisbon, Portugal
Date :
Feb. 27 2013-March 1 2013
Abstract :
Increasing transistor density has made it possible to design massively parallel architectures with hundreds of cores on a single chip. Designing architectures with such a high number of cores and efficient performance/area or power ratios is very challenging. In this paper we take a different approach to the design of many-core architectures. We start with a formal analysis of the algorithms that considers architectural aspects, and then decide the structure of the architecture. To exemplify the approach, we carried out a theoretical analysis of a dense matrix multiplication algorithm, implemented the architecture based on the theoretical model, and simulated the system in SystemC. Results indicate that the proposed architecture is nearly two orders of magnitude more performance/area efficient than a cutting-edge general-purpose processor, achieving nearly 1 TFLOP in a 100 mm² chip with 65 nm technology.
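The performance/area claim in the abstract can be made concrete with a little arithmetic. The sketch below is only an illustration built from the abstract's approximate figures. It assumes that "1 TFLOP" means 10^12 floating-point operations per second and that "two orders of magnitude" is exactly a factor of 100. The names `perf_per_area`, `PEAK_FLOPS`, `CHIP_AREA_MM2` and `EFFICIENCY_RATIO` are hypothetical and do not come from the paper.

```python
# Figures taken from the abstract; both are stated there as approximate.
PEAK_FLOPS = 1e12         # ~1 TFLOP, read here as FLOP per second (assumption)
CHIP_AREA_MM2 = 100.0     # chip area in mm^2, 65 nm technology
EFFICIENCY_RATIO = 100.0  # "two orders of magnitude", taken as exactly 100x (assumption)


def perf_per_area(flops: float, area_mm2: float) -> float:
    """Return performance density in GFLOP/s per mm^2."""
    return flops / 1e9 / area_mm2


# Density of the proposed architecture according to the abstract's numbers.
proposed_density = perf_per_area(PEAK_FLOPS, CHIP_AREA_MM2)

# Density this would imply for the general-purpose processor baseline.
implied_baseline_density = proposed_density / EFFICIENCY_RATIO

print(f"proposed architecture: {proposed_density:.1f} GFLOP/s per mm^2")
print(f"implied baseline:      {implied_baseline_density:.2f} GFLOP/s per mm^2")
```

Under these assumptions the proposed chip works out to roughly 10 GFLOP/s per mm², which would put the general-purpose baseline at roughly 0.1 GFLOP/s per mm². The abstract does not state the baseline figure, so the second number is only an inference from the stated ratio.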
Keywords :
application specific integrated circuits; mathematics computing; matrix multiplication; multiprocessing systems; parallel architectures; performance evaluation; transistors; SystemC; TFLOP; architectural aspects; chip performance; cutting-edge general-purpose processor; dense matrix multiplication; dense matrix multiplication algorithm; formal analysis; many-core architecture design; massively parallel computing architecture design; power ratios; size 65 nm; transistor density; Algorithm design and analysis; Bandwidth; Computational modeling; Field programmable gate arrays; Microprocessors; Parallel architectures; ASIC; High-performance; Massively Parallel; Matrix Multiplication;
Conference_Title :
Circuits and Systems (LASCAS), 2013 IEEE Fourth Latin American Symposium on
Conference_Location :
Cusco
Print_ISBN :
978-1-4673-4897-3
DOI :
10.1109/LASCAS.2013.6519064