Regional Information Center for Science and Technology - Implementation and Analysis of Block Dense Matrix Decomposition on Network-on-Chips

DocumentCode :

1827483

Title :

Implementation and Analysis of Block Dense Matrix Decomposition on Network-on-Chips

Author :

Xu, Thomas Canhao ; Pahikkala, Tapio ; Airola, Antti ; Liljeberg, Pasi ; Plosila, Juha ; Salakoski, Tapio ; Tenhunen, Hannu

Author_Institution :

Dept. of Inf. Technol., Univ. of Turku, Turku, Finland

fYear :

2012

fDate :

25-27 June 2012

Firstpage :

516

Lastpage :

523

Abstract :

The decomposition of a dense matrix into lower and upper triangular matrices is an important linear algebra kernel used in scientific and engineering applications. To decompose large matrices efficiently, the matrix is divided into sub-matrices, or blocks. Block matrix decomposition was introduced for parallel hardware platforms such as supercomputers, multicore processors and GPUs. Recently, the Network-on-Chip (NoC) paradigm has been proposed as a promising multicore architecture for future Chip Multiprocessors (CMPs) with hundreds or even thousands of cores. The NoC architecture alleviates the communication bottleneck of traditional bus-based or crossbar-based on-chip interconnects. However, the implementation and analysis of parallel block matrix decomposition on an NoC platform has not been well addressed. We design an NoC platform based on state-of-the-art systems and implement a block matrix decomposition algorithm on it. Evaluation results are presented using a cycle-accurate full-system simulator. We achieve a parallel efficiency of 74.8% with a 64-node NoC, which outperforms three other multiprocessor systems (30.5%, 67% and 50%, respectively). We also analyze the impact of block size, cache behavior and network pressure on the platform.
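
The blocking idea in the abstract can be illustrated with a minimal sequential sketch. It is not the authors' parallel NoC implementation: it assumes a right-looking block LU factorization without pivoting, and the function name `block_lu` and the parameter `bs` (block size) are hypothetical names chosen for this illustration.

```python
def block_lu(a, bs):
    """Right-looking block LU without pivoting (illustrative sketch only).

    a  : square matrix as a list of lists of floats (left unmodified)
    bs : block size, a hypothetical tuning parameter
    Returns (L, U) with L unit lower triangular and U upper triangular.
    Assumes no zero pivots occur, e.g. a diagonally dominant matrix.
    """
    n = len(a)
    a = [row[:] for row in a]  # work on a copy
    for k in range(0, n, bs):
        kend = min(k + bs, n)
        # 1. Factor the diagonal block with unblocked LU.
        for j in range(k, kend):
            for i in range(j + 1, kend):
                a[i][j] /= a[j][j]
                for c in range(j + 1, kend):
                    a[i][c] -= a[i][j] * a[j][c]
        # 2. Row panel: solve L_kk * U_kj = A_kj by forward substitution.
        for c in range(kend, n):
            for i in range(k, kend):
                for t in range(k, i):
                    a[i][c] -= a[i][t] * a[t][c]
        # 3. Column panel: solve L_ik * U_kk = A_ik.
        for r in range(kend, n):
            for j in range(k, kend):
                for t in range(k, j):
                    a[r][j] -= a[r][t] * a[t][j]
                a[r][j] /= a[j][j]
        # 4. Trailing update: A_ij -= L_ik * U_kj. The (i, j) blocks are
        #    independent, which is where a parallel version finds its work.
        for r in range(kend, n):
            for c in range(kend, n):
                s = 0.0
                for t in range(k, kend):
                    s += a[r][t] * a[t][c]
                a[r][c] -= s
    # Split the packed result into explicit L and U factors.
    L = [[a[i][j] if j < i else (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    U = [[a[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return L, U
```

Calling `block_lu(A, 2)` on a small diagonally dominant matrix returns factors whose product reproduces `A` up to rounding. As a side note, under the usual definition of parallel efficiency as speedup divided by node count (an assumption, since the abstract does not define the term), 74.8% on 64 nodes would correspond to a speedup of roughly 47.9.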

Keywords :

matrix algebra; matrix decomposition; microprocessor chips; multiprocessing systems; multiprocessor interconnection networks; network-on-chip; parallel architectures; CMP; NoC architecture; block dense matrix decomposition; chip multiprocessor; crossbar based on-chip interconnection; linear algebra kernel; multicore architecture; parallel efficiency; parallel hardware platform; triangular matrix; Conferences; High performance computing

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS)

Conference_Location :

Liverpool

Print_ISBN :

978-1-4673-2164-8

Type :

conf

DOI :

10.1109/HPCC.2012.76

Filename :

6332215

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1827483