DocumentCode
598602
Title
Communication-Avoiding Parallel Strassen: Implementation and performance
Author
Lipshitz, Benjamin ; Ballard, Grey ; Demmel, J. ; Schwartz, Ofer
Author_Institution
Math. Dept., UC Berkeley, Berkeley, CA, USA
fYear
2012
fDate
10-16 Nov. 2012
Firstpage
1
Lastpage
11
Abstract
Matrix multiplication is a fundamental kernel of many high performance and scientific computing applications. Most parallel implementations use classical O(n3) matrix multiplication, even though there exist algorithms with lower arithmetic complexity. We recently presented a new Communication-Avoiding Parallel Strassen algorithm (CAPS), based on Strassen´s fast matrix multiplication, that minimizes communication (SPAA´12). It communicates asymptotically less than all classical and all previous Strassen-based algorithms, and it attains theoretical lower bounds. In this paper we show that CAPS is also faster in practice. We benchmark and compare its performance to previous algorithms on Hopper (Cray XE6), Intrepid (IBM BG/P), and Franklin (Cray XT4). We demonstrate significant speedups over previous algorithms both for large matrices and for small matrices on large numbers of processors. We model and analyze the performance of CAPS and predict its performance on future exascale platforms.
Keywords
computational complexity; digital arithmetic; matrix multiplication; CAPS; Franklin; Hopper; Intrepid; arithmetic complexity; communication-avoiding parallel strassen; exascale platforms; fast matrix multiplication; high performance computing applications; scientific computing applications; Bandwidth; Benchmark testing; Instruction sets; Parallel processing; Prediction algorithms; Random access memory;
fLanguage
English
Publisher
ieee
Conference_Titel
High Performance Computing, Networking, Storage and Analysis (SC), 2012 International Conference for
Conference_Location
Salt Lake City, UT
ISSN
2167-4329
Print_ISBN
978-1-4673-0805-2
Type
conf
DOI
10.1109/SC.2012.33
Filename
6468502
Link To Document