• DocumentCode
    2579009
  • Title
    Temporal instruction fetch streaming
  • Author
    Ferdman, Michael ; Wenisch, Thomas F. ; Ailamaki, Anastasia ; Falsafi, Babak ; Moshovos, Andreas
  • Author_Institution
    Comput. Archit. Lab., Carnegie Mellon Univ., Pittsburgh, PA
  • fYear
    2008
  • fDate
    8-12 Nov. 2008
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    L1 instruction-cache misses pose a critical performance bottleneck in commercial server workloads. Cache access latency constraints preclude L1 instruction caches large enough to capture the application, library, and OS instruction working sets of these workloads. To cope with capacity constraints, researchers have proposed instruction prefetchers that use branch predictors to explore future control flow. However, such prefetchers suffer from several fundamental flaws: their lookahead is limited by branch prediction bandwidth, their accuracy suffers from geometrically-compounding branch misprediction probability, and they are ignorant of the cache contents, frequently predicting blocks already present in L1. Hence, L1 instruction misses remain a bottleneck. We propose temporal instruction fetch streaming (TIFS), a mechanism for prefetching temporally-correlated instruction streams from lower-level caches. Rather than explore a program's control flow graph, TIFS predicts future instruction-cache misses directly, through recording and replaying recurring L1 instruction miss sequences. In this paper, we first present an information-theoretic offline trace analysis of instruction-miss repetition to show that 94% of L1 instruction misses occur in long, recurring sequences. Then, we describe a practical mechanism to record these recurring sequences in the L2 cache and leverage them for instruction-cache prefetching. Our TIFS design requires less than 5% storage overhead over the baseline L2 cache and improves performance by 11% on average and 24% at best in a suite of commercial server workloads.
  • Keywords
    cache storage; program diagnostics; branch misprediction probability; branch prediction bandwidth; cache access latency constraint; capacity constraint; information-theoretic offline trace analysis; temporal instruction fetch streaming; Accuracy; Bandwidth; Cache storage; Delay; Flow graphs; Information analysis; Libraries; Prefetching; caching; fetch-directed; instruction streaming; prefetching
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Microarchitecture, 2008. MICRO-41. 2008 41st IEEE/ACM International Symposium on
  • Conference_Location
    Lake Como
  • ISSN
    1072-4451
  • Print_ISBN
    978-1-4244-2836-6
  • Electronic_ISBN
    1072-4451
  • Type
    conf
  • DOI
    10.1109/MICRO.2008.4771774
  • Filename
    4771774