Title :
Achieving Scalable Parallelization for the Hessenberg Factorization
Author :
Castaldo, Anthony M. ; Whaley, R. Clint
Author_Institution :
Dept. of Comput. Sci., Univ. of Texas at San Antonio, San Antonio, TX, USA
Abstract :
Much of dense linear algebra has been successfully blocked to concentrate the majority of its time in the Level 3 BLAS, which are not only efficient for serial computation but also scale well for parallelism. However, for the Hessenberg factorization, a critical step in computing eigenvalues and eigenvectors, the performance of the best known algorithm is still strongly limited by memory speed, which does not tend to scale well at all. In this paper we present an adaptation of our Parallel Cache Assignment (PCA) technique to the Hessenberg factorization, and show that it achieves superlinear speedup over the corresponding serial algorithm and a more than four-fold speedup over the best known algorithm for small and medium-sized problems.
Keywords :
cache storage; eigenvalues and eigenfunctions; linear algebra; parallel processing; Hessenberg factorization; PCA; achieving scalable parallelization; eigenvalues; linear algebra; memory speed; parallel cache assignment; Aggregates; Eigenvalues and eigenfunctions; Instruction sets; Linear algebra; Linux; Principal component analysis; ATLAS; Hessenberg; LAPACK; factorization; multi-core; multicore; parallel
Conference_Title :
Cluster Computing (CLUSTER), 2011 IEEE International Conference on
Conference_Location :
Austin, TX
Print_ISBN :
978-1-4577-1355-2
Electronic_ISBN :
978-0-7695-4516-5
DOI :
10.1109/CLUSTER.2011.16