• DocumentCode
    2041427
  • Title

    Improving cache locality for thread-level speculation

  • Author

    Fung, Stanley L C ; Steffan, J. Gregory

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Toronto Univ., Ont.
  • fYear
    2006
  • fDate
    25-29 April 2006
  • Abstract
    With the advent of chip-multiprocessors (CMPs), thread-level speculation (TLS) remains a promising technique for exploiting this highly multithreaded hardware to improve the performance of an individual program. However, with such speculatively parallel execution the cache locality once enjoyed by the original uniprocessor execution is significantly disrupted: for TLS execution on a four-processor CMP, we find that the data-cache miss rates are nearly four times those of the uniprocessor case, even though TLS execution utilizes four private data caches (i.e., four-fold greater cache capacity). We break down the TLS cache locality problem into instruction and data cache, execution stages, and parallel access patterns, and propose methods to improve cache locality in each of these areas. We find that for parallel regions across 13 SPECint applications our simple and low-cost techniques reduce data-cache misses by 38%, improve performance by 12.8%, and significantly improve scalability, further enhancing the feasibility of TLS as a way to capitalize on future CMPs.
  • Keywords
    cache storage; microprocessor chips; multi-threading; multiprocessing systems; SPECint; cache locality; chip multiprocessors; parallel access patterns; thread-level speculation; Hardware; Program processors; Sun; Throughput; Yarn
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International
  • Conference_Location
    Rhodes Island
  • Print_ISBN
    1-4244-0054-6
  • Type

    conf

  • DOI
    10.1109/IPDPS.2006.1639271
  • Filename
    1639271