• DocumentCode
    2280249
  • Title
    Sunder: a programmable hardware prefetch architecture for numerical loops
  • Author
    Chiueh, Tzi-cker
  • Author_Institution
    Dept. of Comput. Sci., State Univ. of New York, Stony Brook, NY, USA
  • fYear
    1994
  • fDate
    14-18 Nov 1994
  • Firstpage
    488
  • Lastpage
    497
  • Abstract
    Beyond data caching, data prefetching is by far the most effective way to address the memory access bottleneck associated with high-performance processors. This is particularly true for scientific programs whose working sets cannot easily fit into the on-chip data cache. This paper proposes a new data prefetching architecture called Sunder, which combines the flexibility and accuracy of software prefetching with the transparency and low overhead of hardware prefetching. The heart of the design is a dedicated prefetch engine that is programmable at run time by the software. An important design decision is to keep the prefetch engine completely isolated from the normal instruction execution pipeline, except for a loop counter that keeps the two synchronized at the boundaries of loop iterations. A detailed simulation study shows that Sunder achieves an average relative performance advantage over cache-only architectures ranging from 28% to 46%, with smaller cache block sizes leading to greater performance improvement.
  • Keywords
    cache storage; data handling; memory architecture; performance evaluation; Sunder; cache block sizes; cache-only architecture; cache-only architectures; data caching; data prefetching; dedicated prefetch engine; hardware prefetching; high-performance processors; instruction execution pipeline; loop counter; loop iterations; low-overhead; memory access bottleneck; numerical loops; on-chip data cache; performance advantage; programmable hardware prefetch architecture; scientific programs; simulation study; software prefetching; Clocks; Computer architecture; Computer science; Counting circuits; Hardware; Heart; Microprocessors; Parallel processing; Prefetching; Teleprinting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Supercomputing '94., Proceedings
  • Conference_Location
    Washington, DC
  • Print_ISBN
    0-8186-6605-6
  • Type
    conf
  • DOI
    10.1109/SUPERC.1994.344312
  • Filename
    344312