DocumentCode :
2050254
Title :
The general matrix multiply-add operation on 2D torus
Author :
Zekri, Ahmed S. ; Sedukhin, Stanislav G.
Author_Institution :
Graduate Sch. of Comput. Sci. & Eng., Aizu Univ., Aizu-Wakamatsu
fYear :
2006
fDate :
25-29 April 2006
Abstract :
In this paper, the index space of the (n × n)-matrix multiply-add problem C = C + A·B is represented as a 3D n × n × n torus. All possible time-scheduling functions to activate the computation and data rolling inside the 3D torus index space are determined. To maximize efficiency when solving a single problem, we mapped the computations into the 2D n × n toroidal array processor. All optimal 2D data allocations that solve the problem in n multiply-add-roll steps are obtained. The well-known Cannon's algorithm is one of the resulting allocations. We used the optimal data allocations to describe all variants of the GEMM operation on the 2D toroidal array processor. By controlling the data movement, the transposition operation is avoided in 75% of the GEMM variants. However, only one matrix transpose is needed for the remaining 25%. Ultimately, we described four versions of the GEMM operation covering the possible layouts of the initially loaded data into the array processor.
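The multiply-add-roll scheme mentioned in the abstract can be illustrated with Cannon's algorithm, which the abstract names as one of the optimal allocations. The following is a minimal sequential sketch, assuming one matrix element per processing element and the classic Cannon skewing as the initial data allocation. The function name `cannon_multiply_add` and the list-of-lists layout are illustrative assumptions, not taken from the paper, and the other allocations the paper derives are not shown.

```python
def cannon_multiply_add(A, B, C):
    """Return C + A*B by simulating an n x n torus, one element per cell.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    n = len(A)
    # Initial data allocation (Cannon's skew): cell (i, j) starts with
    # A[i][(i + j) % n] and B[(i + j) % n][j].
    a = [[A[i][(i + j) % n] for j in range(n)] for i in range(n)]
    b = [[B[(i + j) % n][j] for j in range(n)] for i in range(n)]
    c = [row[:] for row in C]  # C stays in place; copy to avoid mutating input

    for _ in range(n):  # n multiply-add-roll steps
        # Multiply-add: every cell updates its local C element.
        for i in range(n):
            for j in range(n):
                c[i][j] += a[i][j] * b[i][j]
        # Roll: A moves one cell left along each row, B one cell up along
        # each column, with wraparound on the torus.
        a = [[a[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        b = [[b[(i + 1) % n][j] for j in range(n)] for i in range(n)]
    return c


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    C = [[1, 0], [0, 1]]
    print(cannon_multiply_add(A, B, C))
```

In this sketch, at step t cell (i, j) holds the pair A[i][k], B[k][j] with k = (i + j + t) mod n, so after n steps every value of k has been visited exactly once and each cell has accumulated its full inner product.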
Keywords :
matrix multiplication; parallel processing; 2D data allocations; 2D toroidal array processor; 2D torus index space; 3D torus index space; Cannon algorithm; GEMM variants; data rolling; general matrix multiply-add operation; matrix multiply-add problem; matrix transpose; multiply-add-roll steps; optimal data allocations; time-scheduling functions; transposition operation; Broadcasting; Cities and towns; Computer architecture; Computer science; Concurrent computing; High performance computing; Image processing; Kernel; Linear algebra; Process control;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International
Conference_Location :
Rhodes Island
Print_ISBN :
1-4244-0054-6
Type :
conf
DOI :
10.1109/IPDPS.2006.1639613
Filename :
1639613