Title :
Parallel Sparse Matrix Multiplication for Preconditioning and SSTA on a Many-Core Architecture
Author :
Zhang, Keliang ; Wu, Baifeng
Author_Institution :
Sch. of Comput. Sci., Fudan Univ., Shanghai, China
Abstract :
Operations related to sparse matrix multiplication are frequently used in scientific computing, and they often become a performance bottleneck because of their high computational complexity. For example, multiplying a sparse matrix by a diagonal matrix (CS) is a key sub-procedure in preconditioning, and multiplying a sparse matrix by a one-dimensional block diagonal matrix (BCS) is a key sub-procedure in statistical static timing analysis (SSTA) without slope propagation based on a sparse matrix framework. Although the ELLH format, along with its variant, is well suited to sparse matrix-vector multiplication (SpMV) on a many-core architecture, it has drawbacks for the other two operations. For the CS operation, it leads to a large number of memory accesses because the column index matrix must be read. For the BCS operation, it leads to an even larger number of memory accesses and also makes parallel programming considerably more complex because of the intricate data dependencies among matrix elements. This paper presents a new sparse format, named ELLV. For the CS operation, the number of memory accesses can be halved because the column index matrix no longer needs to be read. Experimental results show that the ELLV format improves the performance of the CS operation by 15% compared with the ELLH format. For the BCS operation, because the column indices of the logical matrices and the physical matrices are consistent, the number of memory accesses is reduced even more markedly, and parallel programming on a many-core architecture becomes efficient and straightforward.
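To illustrate why the column index matrix matters for the CS operation, the following is a minimal sketch of multiplying a sparse matrix by a diagonal matrix in a generic ELLPACK-style padded layout. It is not the authors' code: the abstract does not describe the ELLH or ELLV layouts, so the layout here is only assumed to resemble ELLH, and the names `dense_to_ell`, `ell_times_diagonal`, and `PAD` are hypothetical. The sketch is sequential and only counts index reads; it does not model a many-core implementation.

```python
import numpy as np

PAD = -1  # hypothetical marker for padded (unused) slots in a row


def dense_to_ell(a):
    """Pack a dense matrix into padded (values, col_idx) arrays."""
    n_rows = a.shape[0]
    width = max(int(np.count_nonzero(row)) for row in a)
    values = np.zeros((n_rows, width), dtype=float)
    col_idx = np.full((n_rows, width), PAD, dtype=np.int64)
    for i, row in enumerate(a):
        cols = np.flatnonzero(row)
        values[i, :cols.size] = row[cols]
        col_idx[i, :cols.size] = cols
    return values, col_idx


def ell_times_diagonal(values, col_idx, d):
    """Compute A @ diag(d) in place of the stored values.

    Each stored entry a_ij is scaled by d[j], so every slot needs one
    read of the column index matrix in addition to the value read.
    """
    out = np.zeros(values.shape, dtype=float)
    index_reads = 0
    for i in range(values.shape[0]):
        for k in range(values.shape[1]):
            j = col_idx[i, k]  # the extra memory access per slot
            index_reads += 1
            if j != PAD:
                out[i, k] = values[i, k] * d[j]
    return out, index_reads


a = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0],
              [4.0, 0.0, 5.0]])
d = np.array([10.0, 20.0, 30.0])
values, col_idx = dense_to_ell(a)
scaled, index_reads = ell_times_diagonal(values, col_idx, d)
```

In this layout each slot costs one value read and one index read, which is consistent with the abstract's claim that removing the index reads roughly halves the memory accesses for the CS operation.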
Keywords :
computer architecture; matrix multiplication; multiprocessing systems; sparse matrices; statistical analysis; BCS; ELLH format; SSTA; SpMV operation; column index matrix; logical matrices; many-core architecture; memory access; one-dimension block diagonal matrix; parallel programming; parallel sparse matrix multiplication; physical matrices; preconditioning method; sparse matrix multiplying vector; sparse matrix multiplying diagonal matrix; statistical static timing analysis; Delay; Indexes; Linear systems; Memory management; Parallel programming; Sparse matrices; Vectors; Preconditioning; sparse matrix multiplying diagonal matrix; sparse matrix multiplying one-dimension block diagonal matrix; statistical static timing analysis;
Conference_Title :
Networking, Architecture and Storage (NAS), 2012 IEEE 7th International Conference on
Conference_Location :
Xiamen, Fujian
Print_ISBN :
978-1-4673-1889-1
DOI :
10.1109/NAS.2012.11