  • DocumentCode
    2423479
  • Title
    Low Overhead Instruction-Cache Modeling Using Instruction Reuse Profiles
  • Author
    Khan, Muneeb; Sembrant, Andreas; Hagersten, Erik
  • Author_Institution
    Dept. of Inf. Technol., Uppsala Univ., Uppsala, Sweden
  • fYear
    2012
  • fDate
    24-26 Oct. 2012
  • Firstpage
    260
  • Lastpage
    269
  • Abstract
    Performance loss caused by L1 instruction cache misses varies between different architectures and cache sizes. For processors employing power-efficient in-order execution with small caches, performance can be significantly affected by instruction cache misses. The growing use of low-power multi-threaded CPUs (with shared L1 caches) in general-purpose computing platforms requires new efficient techniques for analyzing application instruction cache usage. Such insight can be achieved using traditional simulation technologies modeling several cache sizes, but the overhead of simulators may be prohibitive for practical optimization usage. In this paper, we present a statistical method to quickly model application instruction cache performance. Most importantly, we propose a very low-overhead sampling mechanism to collect runtime data from the application's instruction stream. This data is fed to the statistical model, which accurately estimates the instruction cache miss ratio for the sampled execution. Our sampling method is about 10x faster than previously suggested sampling approaches, with an average runtime overhead as low as 25% over native execution. The architecturally independent data collected is used to accurately model the miss ratio for several cache sizes simultaneously, with an average absolute error of 0.2%. Finally, we show how our tool can be used to identify program phases with a large instruction cache footprint. Such phases can then be targeted to optimize for reduced code footprint.
  • Keywords
    cache storage; computer architecture; microcomputers; multi-threading; optimisation; statistical analysis; general purpose computing platforms; instruction cache footprint; instruction reuse profiles; low overhead instruction-cache modeling; low-power multi-threaded CPU; optimization; performance loss; power-efficient in-order execution; statistical method; Adaptation models; Computational modeling; Data models; Hardware; Program processors; Radiation detectors; Runtime
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Architecture and High Performance Computing (SBAC-PAD), 2012 IEEE 24th International Symposium on
  • Conference_Location
    New York, NY
  • ISSN
    1550-6533
  • Print_ISBN
    978-1-4673-4790-7
  • Type
    conf
  • DOI
    10.1109/SBAC-PAD.2012.25
  • Filename
    6374797