DocumentCode :
1886209
Title :
Performance of panel and block approaches to sparse Cholesky factorization on the iPSC/860 and Paragon multicomputers
Author :
Rothberg, Edward
Author_Institution :
Intel Sci. Comput., Beaverton, OR, USA
fYear :
1994
fDate :
23-25 May 1994
Firstpage :
324
Lastpage :
333
Abstract :
Sparse Cholesky factorization has historically achieved extremely low performance on distributed memory multiprocessors. Three issues must be addressed to improve this situation: (1) parallel factorization methods must be based on more efficient sequential methods; (2) parallel machines must provide higher interprocessor communication bandwidth; and (3) the sparse matrices used to evaluate parallel sparse factorization performance should be more representative of the sizes of matrices people would factor on large parallel machines. All of these issues have in fact already been addressed. Specifically: (1) single-node performance can be improved by moving from a column-oriented approach, where the computational kernel is Level 1 BLAS, to either a panel- or block-oriented approach, where the kernel is Level 3 BLAS; (2) communication hardware has improved dramatically, with new parallel computers providing higher communication bandwidth than previous parallel computers; and (3) several larger benchmark matrices are now available, and newer parallel machines offer sufficient memory per node to factor these larger matrices. The result of addressing these three issues is extremely high performance on moderately parallel machines. This paper demonstrates performance levels of 650 double-precision MFLOPS on 32 processors of the Intel Paragon system, 1 GFLOPS on 64 processors, and 1.7 GFLOPS on 128 processors. The paper also presents a direct performance comparison between the iPSC/860 and Paragon systems, as well as a comparison between panel- and block-oriented approaches to parallel factorization.
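The central single-node improvement the abstract describes is replacing column-by-column elimination (Level 1 BLAS vector updates) with block elimination, whose trailing-matrix update is a matrix-matrix product (Level 3 BLAS). A minimal dense sketch of that blocked structure, using NumPy in place of explicit BLAS calls (this is an illustration of the kernel shape, not the paper's sparse implementation):

```python
import numpy as np

def blocked_cholesky(A, b=2):
    """Lower-triangular Cholesky A = L @ L.T, computed block column by
    block column. The trailing update A22 -= L21 @ L21.T is a
    matrix-matrix product (Level 3 BLAS shape), in contrast to the
    vector (Level 1 BLAS) updates of a column-oriented factorization."""
    A = A.copy()
    n = A.shape[0]
    for k in range(0, n, b):
        e = min(k + b, n)
        # Factor the small diagonal block.
        A[k:e, k:e] = np.linalg.cholesky(A[k:e, k:e])
        if e < n:
            # Triangular solve for the panel below the diagonal block:
            # L21 satisfies L21 @ L11.T = A21, i.e. L11 @ L21.T = A21.T.
            A[e:, k:e] = np.linalg.solve(A[k:e, k:e], A[e:, k:e].T).T
            # Level 3 BLAS-style trailing-matrix update.
            A[e:, e:] -= A[e:, k:e] @ A[e:, k:e].T
    return np.tril(A)
```

The matrix-matrix update reuses each loaded block many times, which is why block (and panel) kernels run much closer to peak than column kernels on cache-based nodes like those of the iPSC/860 and Paragon.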
Keywords :
mathematics computing; matrix algebra; parallel algorithms; parallel machines; performance evaluation; 1 GFLOPS; 1.7 GFLOPS; 650 MFLOPS; Intel Paragon multicomputer; Intel iPSC/860 multicomputer; Level 1 BLAS; Level 3 BLAS; benchmark matrices; block-oriented approach; column-oriented approach; communication bandwidth; communication hardware; computational kernel; distributed memory multiprocessors; efficient sequential methods; interprocessor communication bandwidth; matrix size; memory per node; panel-oriented approach; parallel factorization; parallel factorization methods; parallel machines; single-node performance; sparse Cholesky factorization; sparse matrices; Bandwidth; Concurrent computing; Finite element methods; Hardware; High performance computing; Kernel; Linear programming; Parallel machines; Sparse matrices; Supercomputers;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Proceedings of the Scalable High-Performance Computing Conference, 1994
Conference_Location :
Knoxville, TN
Print_ISBN :
0-8186-5680-8
Type :
conf
DOI :
10.1109/SHPCC.1994.296661
Filename :
296661