• DocumentCode
    263648
  • Title

    Global Load Instruction Aggregation Based on Array Dimensions

  • Author

    Sumikawa, Yasunobu ; Takimoto, Munehiro

  • Author_Institution
    Dept. of Inf. Sci., Tokyo Univ. of Sci., Tokyo, Japan
  • fYear
    2014
  • fDate
    13-15 July 2014
  • Firstpage
    123
  • Lastpage
    129
  • Abstract
    Most modern processors have cache memories that are much faster than main memory, and it is important to utilize them effectively for efficient program execution. Cache memories function well if temporal or spatial locality in the program is enhanced. Therefore, cache efficiency can be improved by making accesses to the same array consecutive. In addition, a multidimensional array can be regarded as an array of lower-dimensional arrays, which means that it is more effective to aggregate array references that share more indexes in the highest dimensions, even if the references are not completely identical. We propose a new cache optimization technique for improving cache efficiency based on global code motion. Our technique moves a load instruction to immediately after the preceding load instructions accessing the same array with the most similar indexes, and then delays it as late as possible without changing the access order. These two-step code motions contribute not only to improving cache efficiency across the entire program but also to suppressing register pressure. We have implemented our technique in a real compiler and have evaluated it on SPEC benchmarks. The experimental results show that our technique can decrease cache misses by about 99.9% in the best case.
  • Keywords
    cache storage; program compilers; SPEC benchmarks; array dimensions; array reference aggregation; cache efficiency; cache memories function; cache optimization technique; compiler; global code motion; global load instruction aggregation; lower dimensional arrays; multidimensional array; program execution; register pressure suppression; spatial localities; temporal localities; two-step code motions; Aggregates; Arrays; Cache memory; Equations; Indexes; Program processors; Registers; cache memory; code motion; data-flow analysis; partial redundancy elimination; register spill;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Architectures, Algorithms and Programming (PAAP), 2014 Sixth International Symposium on
  • Conference_Location
    Beijing
  • ISSN
    2168-3034
  • Print_ISBN
    978-1-4799-3844-5
  • Type

    conf

  • DOI
    10.1109/PAAP.2014.43
  • Filename
    6916449
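
The abstract's claim that grouping references which share the highest-dimension index improves cache behavior can be illustrated with a toy simulation. The sketch below is not the authors' algorithm (which is a compiler code-motion pass); it only compares two load orders over a row-major 2D array using a hypothetical, tiny fully associative LRU cache. The function `count_misses`, the helper `address`, and all sizes (`COLS`, `line_size`, `num_lines`) are assumptions chosen for illustration.

```python
from collections import OrderedDict

def count_misses(addresses, line_size=8, num_lines=2):
    """Count misses in a tiny fully associative LRU cache (illustrative model)."""
    cache = OrderedDict()  # keys are cache-line numbers, ordered by recency
    misses = 0
    for addr in addresses:
        line = addr // line_size
        if line in cache:
            cache.move_to_end(line)        # hit: mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > num_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses

COLS = 64  # assumed row length of a row-major array a[i][j]

def address(i, j):
    """Element offset of a[i][j] in row-major layout."""
    return i * COLS + j

rows, cols = range(4), range(8)

# Loads that alternate among rows: consecutive loads differ in the highest dimension.
interleaved = [address(i, j) for j in cols for i in rows]

# Loads aggregated by the highest-dimension index: each row is finished before the next.
aggregated = [address(i, j) for i in rows for j in cols]

print(count_misses(interleaved))  # 32: every load misses
print(count_misses(aggregated))   # 4: one miss per row
```

Both orders touch the same 32 elements; only the grouping differs. In this model the interleaved order cycles through four cache lines with room for two, so every load misses, while the aggregated order misses once per row.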