DocumentCode
632853
Title
Performance drawbacks for matrix multiplication using set associative cache in GPU devices
Author
Djinevski, Leonid ; Arsenovski, Sime ; Ristov, Sasko ; Gusev, Marjan
Author_Institution
FON Univ., Skopje, Macedonia
fYear
2013
fDate
20-24 May 2013
Firstpage
193
Lastpage
198
Abstract
Performance of shared memory processors shows negative performance impulses (drawbacks) in certain regions for execution of the basic matrix multiplication algorithm. In this paper we continue with the analysis of the GPU memory hierarchy and the corresponding cache memory organization. We give a theoretical analysis of why a negative performance impulse appears for specific problem sizes. The main reason is the cache storage organization, i.e. the negative performance peak is caused by the mapping of matrix elements onto one cache set instead of using the whole cache. The obtained experimental results prove our theoretical analysis. We also propose a method to avoid situations where performance drawbacks appear.
Keywords
cache storage; content-addressable storage; graphics processing units; mathematics computing; matrix multiplication; memory architecture; performance evaluation; shared memory systems; GPU devices; GPU memory hierarchy; cache memory organization; cache storage organization; matrix element mapping; matrix multiplication algorithm; negative performance impulses; performance drawbacks; set associative cache; shared memory processor performance; Algorithm design and analysis; Cache memory; Computer architecture; Graphics processing units; Instruction sets; Organizations; Performance evaluation; Cache Memory; GPGPU; SIMD;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information & Communication Technology Electronics & Microelectronics (MIPRO), 2013 36th International Convention on
Conference_Location
Opatija
Print_ISBN
978-953-233-076-2
Type
conf
Filename
6596250