• DocumentCode
    632853
  • Title

    Performance drawbacks for matrix multiplication using set associative cache in GPU devices

  • Author

    Djinevski, Leonid ; Arsenovski, Sime ; Ristov, Sasko ; Gusev, Marjan

  • Author_Institution
    FON Univ., Skopje, Macedonia
  • fYear
    2013
  • fDate
    20-24 May 2013
  • Firstpage
    193
  • Lastpage
    198
  • Abstract
    The performance of shared memory processors shows negative performance impulses (drawbacks) in certain regions when executing the basic matrix multiplication algorithm. In this paper we continue the analysis of the GPU memory hierarchy and the corresponding cache memory organization. We give a theoretical analysis of why a negative performance impulse appears for specific problem sizes. The main reason is the cache storage organization: the negative performance peak is caused by matrix elements being mapped onto a single cache set instead of being spread across the whole cache. The experimental results confirm our theoretical analysis. We also propose a method to avoid situations where performance drawbacks appear.
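    The set-mapping effect described in the abstract can be illustrated with a minimal sketch. It assumes a row-major n x n matrix of 8-byte elements and a hypothetical cache with 128-byte lines and 64 sets. These parameters and the function names are illustrative assumptions, not values taken from the paper. The sketch counts how many distinct cache sets are touched when walking down one matrix column, as the basic algorithm does for the second operand.

```python
def set_index(address, line_size, num_sets):
    """Cache set that a byte address maps to in a set associative cache."""
    return (address // line_size) % num_sets


def sets_touched_by_column(n, elem_size=8, line_size=128, num_sets=64, base=0):
    """Distinct cache sets hit when reading one column of a row-major n x n matrix.

    All default parameters are hypothetical and chosen only for illustration.
    """
    stride = n * elem_size  # byte distance between consecutive elements of a column
    return {set_index(base + row * stride, line_size, num_sets) for row in range(n)}


if __name__ == "__main__":
    # The row stride of 1024 * 8 bytes equals line_size * num_sets,
    # so every column element lands in the same set.
    print(len(sets_touched_by_column(1024)))
    # One extra element per row shifts successive rows across the sets.
    print(len(sets_touched_by_column(1025)))
```

    Under these assumed parameters, n = 1024 sends the whole column to a single set, while n = 1025 spreads it over all 64 sets. Padding the row length in this way is a commonly used workaround; the abstract does not state whether it is the method the authors propose.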
  • Keywords
    cache storage; content-addressable storage; graphics processing units; mathematics computing; matrix multiplication; memory architecture; performance evaluation; shared memory systems; GPU devices; GPU memory hierarchy; cache memory organization; cache storage organization; matrix element mapping; matrix multiplication algorithm; negative performance impulses; performance drawbacks; set associative cache; shared memory processor performance; Algorithm design and analysis; Cache memory; Computer architecture; Graphics processing units; Instruction sets; Organizations; Performance evaluation; Cache Memory; GPGPU; SIMD;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information & Communication Technology Electronics & Microelectronics (MIPRO), 2013 36th International Convention on
  • Conference_Location
    Opatija
  • Print_ISBN
    978-953-233-076-2
  • Type

    conf

  • Filename
    6596250