DocumentCode
2135778
Title
Heterogeneous Multi-core Parallel SGEMM Performance Testing and Analysis on Cell/B.E Processor
Author
Li, Yan ; Zhang, Yunquan ; Wang, Ke ; Guan, Wenhua
Author_Institution
Lab. of Parallel Software & Comput. Sci., ISCAS, Beijing, China
fYear
2010
fDate
15-17 July 2010
Firstpage
202
Lastpage
207
Abstract
Matrix multiplication, the kernel routine of Level 3 BLAS, is one of the most common numerical operations in scientific computing. The STI CELL processor is a heterogeneous multiprocessor with a unique design to achieve high peak floating-point performance. Because matrix multiplication is essential to a wide range of numerical algorithms, performance improvements to the GEMM routine immediately benefit the entire algorithm. In this paper, we provide a new way to utilize the hardware features of Cell to achieve better performance on Single Precision General Matrix Multiplication (SGEMM). Through heterogeneous parallelization across both the PPEs and the SPEs, our method gains a speedup over the Cell SDK (2.5%). An additional speedup of about 30% is achieved via interleaved memory allocation, which improves memory access.
Keywords
matrix multiplication; multiprocessing systems; parallel processing; performance evaluation; STI CELL processor; cell/B.E processor; heterogeneous multicore parallel SGEMM performance testing; heterogeneous multiprocessor; interleaved memory allocation; level 3 BLAS; single precision general matrix multiplication; Blades; Broadband communication; Computer architecture; Instruction sets; Kernel; Microprocessors; Registers; Cell; Heterogeneous; Matrix multiplication; Optimization; Parallel; multi-core;
fLanguage
English
Publisher
IEEE
Conference_Titel
Networking, Architecture and Storage (NAS), 2010 IEEE Fifth International Conference on
Conference_Location
Macau
Print_ISBN
978-1-4244-8133-0
Type
conf
DOI
10.1109/NAS.2010.48
Filename
5575652