• DocumentCode
    3543362
  • Title

    Locality-Aware Dynamic Mapping for Multithreaded Applications

  • Author

    Demiroz, Betul ; Topcuoglu, Haluk Rahmi ; Kandemir, Mahmut ; Tosun, Oguz

  • Author_Institution
    Comput. Eng. Dept., Marmara Univ., Istanbul, Turkey
  • fYear
    2012
  • fDate
    15-17 Feb. 2012
  • Firstpage
    185
  • Lastpage
    189
  • Abstract
    Locality analysis of an application helps us extract data access patterns and predict runtime cache behavior. In this paper, we propose a locality-aware dynamic mapping algorithm for multithreaded applications, which assigns computations with similar data access patterns to the same cores. We collect the amounts of shared and distinct data used by all computations, called chunks, and calculate the sharing among those chunks. Then, chunks with similar data access patterns are grouped into bins, which are subsequently assigned to threads to improve cache reuse and program performance. Our algorithm is illustrated with sparse matrix-vector multiply (SpMV), one of the most widely used kernels in engineering and scientific computing, which suffers from irregular and indirect memory access patterns. Five inputs with different shapes and characteristics are considered for testing the performance of our algorithm. Based on the results of our experimental study, our algorithm outperforms the Linux scheduler, with an average performance improvement of 12.5% across the scenarios considered.
  • Keywords
    Linux; cache storage; data analysis; multi-threading; processor scheduling; software performance evaluation; sparse matrices; storage management; vectors; Linux scheduler; SpMV; cache reuse; chunks; data access patterns; distinct data; indirect memory access patterns; irregular memory access patterns; locality analysis; locality-aware dynamic mapping algorithm; multithreaded applications; performance improvement; performance testing; program performance; runtime cache behavior; scientific computing; shared data; sparse matrix-vector multiply; Heuristic algorithms; Instruction sets; Linux; Runtime; Shape; Sparse matrices; Vectors; Chip Multiprocessors; cache behavior; dynamic mapping; multithreading;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel, Distributed and Network-Based Processing (PDP), 2012 20th Euromicro International Conference on
  • Conference_Location
    Garching
  • ISSN
    1066-6192
  • Print_ISBN
    978-1-4673-0226-5
  • Type

    conf

  • DOI
    10.1109/PDP.2012.84
  • Filename
    6169548