DocumentCode :
3229122
Title :
Performance modeling and optimal block size selection for a BLAS-3 based tridiagonalization algorithm
Author :
Yamamoto, Yusaku
Author_Institution :
Dept. of Computational Sci. & Eng., Nagoya Univ.
fYear :
2005
fDate :
1-1 July 2005
Lastpage :
256
Abstract :
We construct a performance model for Bischof & Wu's tridiagonalization algorithm, which is based entirely on the level-3 BLAS. The model is hierarchical, mirroring the structure of the algorithm itself: given the matrix size, the two block sizes, and performance data for the underlying BLAS routines, it predicts the execution time of the algorithm. Experiments on the Opteron and Alpha 21264A processors show that the model predicts the performance of the algorithm for matrix sizes from 1920 to 7680 and for various block sizes with relative errors below 10%. The model will serve as a key component of an automatically tuned library that selects the optimal block sizes itself. It can also be used in a grid environment to help users decide which of the available machines will solve their problem in the shortest time.
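The idea of choosing block sizes from a predicted execution time can be illustrated with a small sketch. This is not the paper's model: the flop counts, the two-level loop structure, the second-stage cost term, and the names `predict_time`, `best_block_sizes`, and `sample_rate` are all assumptions made for illustration. The sketch only shows the general pattern of summing per-call costs (flops divided by a measured BLAS-3 rate) and searching over candidate block sizes.

```python
from itertools import product


def sample_rate(m, w):
    """Hypothetical BLAS-3 rate in flop/s; a real model would use measured data."""
    return 4e9 * w / (w + 8.0)


def predict_time(n, b1, b2, rate, stage2_rate):
    """Toy hierarchical estimate: sum the predicted time of each BLAS-3 call."""
    total = 0.0
    # Outer level (assumed): the matrix is processed in panels of width b1.
    for k in range(0, n, b1):
        m = n - k                 # size of the remaining submatrix
        width = min(b1, m)        # last panel may be narrower
        # Inner level (assumed): each panel update is split into chunks of width b2.
        for j in range(0, width, b2):
            w = min(b2, width - j)
            flops = 2.0 * m * m * w          # assumed cost of one rank-w update
            total += flops / rate(m, w)
    # Assumed second-stage cost that grows with the first block size.
    total += 6.0 * n * n * b1 / stage2_rate
    return total


def best_block_sizes(n, cand1, cand2, rate, stage2_rate):
    """Return the (b1, b2) pair with the smallest predicted time, with b2 <= b1."""
    pairs = [(b1, b2) for b1, b2 in product(cand1, cand2) if b2 <= b1]
    return min(pairs, key=lambda p: predict_time(n, p[0], p[1], rate, stage2_rate))
```

Calling `best_block_sizes(1920, [16, 32, 64], [4, 8, 16], sample_rate, 1e8)` returns the candidate pair with the lowest estimate under these assumed costs; swapping `sample_rate` for measured timings is what would make such a search machine-specific.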
Keywords :
grid computing; matrix decomposition; parallel machines; performance evaluation; software libraries; Alpha 21264A processors; BLAS-3 based tridiagonalization algorithm; Bischof & Wu tridiagonalization algorithm; Opteron; automatic tuned library; block size selection; grid environment; performance modeling; Bandwidth; Eigenvalues and eigenfunctions; Libraries; Linear algebra; Microprocessors; Predictive models;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High-Performance Computing in Asia-Pacific Region, 2005. Proceedings. Eighth International Conference on
Conference_Location :
Beijing
Print_ISBN :
0-7695-2486-9
Type :
conf
DOI :
10.1109/HPCASIA.2005.76
Filename :
1592276