DocumentCode
2483133
Title
Minimizing startup costs for performance-critical threading
Author
Castaldo, Anthony M. ; Whaley, R. Clint
Author_Institution
Dept. of Comput. Sci., Univ. of Texas at San Antonio, San Antonio, TX, USA
fYear
2009
fDate
23-29 May 2009
Firstpage
1
Lastpage
8
Abstract
Using the well-known ATLAS and LAPACK dense linear algebra libraries, we demonstrate that the parallel management overhead (PMO) can grow with problem size on even statically scheduled parallel programs with minimal task interaction. Therefore, the widely held view that these thread management issues can be ignored in such computationally intensive libraries is wrong, and leads to substantial slowdown on today's machines. We survey several methods for reducing this overhead, the best of which we have not seen in the literature. Finally, we demonstrate that by applying these techniques at the kernel level, performance in applications such as LU and QR factorizations can be improved by almost 40% for small problems, and as much as 15% for large O(N^3) computations. These techniques are completely general, and should yield significant speedup in almost any performance-critical operation. We then show that the lion's share of the remaining parallel inefficiency comes from bus contention, and, in the future work section, outline some promising avenues for further improvement.
Keywords
parallel processing; performance evaluation; ATLAS dense linear algebra libraries; LAPACK dense linear algebra libraries; bus contention; minimal task interaction; parallel management overhead; performance-critical operation; performance-critical threading; scheduled parallel programs; startup cost minimization; Computer science; Costs; Kernel; Libraries; Linear algebra; Packaging; Parallel processing; Processor scheduling; Timing; Yarn;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel & Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on
Conference_Location
Rome
ISSN
1530-2075
Print_ISBN
978-1-4244-3751-1
Electronic_ISBN
1530-2075
Type
conf
DOI
10.1109/IPDPS.2009.5161010
Filename
5161010