Title :
Heterogeneous Multi-core Parallel SGEMM Performance Testing and Analysis on Cell/B.E Processor
Author :
Li, Yan ; Zhang, Yunquan ; Wang, Ke ; Guan, Wenhua
Author_Institution :
Lab. of Parallel Software & Comput. Sci., ISCAS, Beijing, China
Abstract :
Matrix multiplication is one of the most common numerical operations in scientific computing and is the kernel routine of Level 3 BLAS. The STI Cell processor is a heterogeneous multiprocessor with a unique design intended to achieve high peak floating-point performance. Because matrix multiplication is essential to a wide range of numerical algorithms, performance improvements to the GEMM routine immediately benefit the entire algorithm. In this paper, we present a new way to exploit the hardware features of Cell to achieve better performance on Single Precision General Matrix Multiplication (SGEMM). Through heterogeneous parallelization across both the PPEs and the SPEs, our method gains a speedup of 2.5% over the Cell SDK. An additional speedup of about 30% is achieved via interleaved memory allocation, which improves memory access.
Keywords :
matrix multiplication; multiprocessing systems; parallel processing; performance evaluation; STI CELL processor; cell/B.E processor; heterogeneous multicore parallel SGEMM performance testing; heterogeneous multiprocessor; interleaved memory allocation; level 3 BLAS; single precision general matrix multiplication; Blades; Broadband communication; Computer architecture; Instruction sets; Kernel; Microprocessors; Registers; Cell; Heterogeneous; Matrix multiplication; Optimization; Parallel; multi-core;
Conference_Title :
Networking, Architecture and Storage (NAS), 2010 IEEE Fifth International Conference on
Conference_Location :
Macau
Print_ISBN :
978-1-4244-8133-0
DOI :
10.1109/NAS.2010.48