• DocumentCode
    2872296
  • Title

    Speculative data-driven multithreading

  • Author

    Roth, Amir ; Sohi, Gurindar S.

  • Author_Institution
    Dept. of Comput. Sci., Wisconsin Univ., Madison, WI, USA
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    37
  • Lastpage
    48
  • Abstract
    Mispredicted branches and loads that miss in the cache cause the majority of retirement stalls experienced by sequential processors; we call these critical instructions. Despite their importance, a sequential processor has difficulty prioritizing critical computations (computations of critical instructions), because it must fetch all computations sequentially, regardless of their contribution to performance. Speculative data-driven multithreading (DDMT) is a general-purpose mechanism for overcoming this limitation. In DDMT, critical computations are annotated so that they can execute standalone. When the processor predicts an upcoming instance of a critical instruction, it microarchitecturally forks a copy of its computation as a new kind of speculative thread: a data-driven thread (DDT). The DDT executes in parallel with the main program thread, but typically generates the critical result much faster since it fetches and executes only the critical computation and not the whole program. A DDT “pre-executes” a critical computation and effectively “consumes” its latency on behalf of the main thread. A DDMT component called integration incorporates results completed in DDTs directly into the main thread, sparing it from having to repeat the work. We simulate an implementation of DDMT on top of a simultaneous multithreading (SMT) processor and use program profiles to create DDTs and annotate them into the executable. Our experiments show that DDMT pre-execution of critical loads and branches can improve performance significantly.
  • Keywords
    multi-threading; parallel architectures; data-driven multithreading; multithreading; sequential processor; simultaneous multithreading; Computational modeling; Computer aided instruction; Concurrent computing; Delay; Microarchitecture; Multithreading; Retirement; Surface-mount technology; Throughput; Yarn;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High-Performance Computer Architecture, 2001. HPCA. The Seventh International Symposium on
  • Conference_Location
    Monterrey
  • ISSN
    1530-0897
  • Print_ISBN
    0-7695-1019-1
  • Type

    conf

  • DOI
    10.1109/HPCA.2001.903250
  • Filename
    903250