DocumentCode
2572708
Title
Transparent threads: resource sharing in SMT processors for high single-thread performance
Author
Dorai, Gautham K. ; Yeung, Donald
Author_Institution
Electr. & Comput. Eng. Dept., Univ. of Maryland, College Park, MD, USA
fYear
2002
fDate
2002
Firstpage
30
Lastpage
41
Abstract
To realize transparent threads, we propose three mechanisms for maintaining the transparency of background threads: slot prioritization, background thread instruction-window partitioning, and background thread flushing. In addition, we propose three mechanisms to boost background thread performance without sacrificing transparency: aggressive fetch partitioning, foreground thread instruction-window partitioning, and foreground thread flushing. We implement our mechanisms on a detailed simulator of an SMT processor and evaluate them using 8 benchmarks, including 7 from the SPEC CPU2000 suite. Our results show that, when cache and branch predictor interference are factored out, background threads introduce less than 1% performance degradation on the foreground thread. Furthermore, maintaining the transparency of background threads reduces their throughput by only 23% relative to an equal-priority scheme. To demonstrate the usefulness of transparent threads, we study transparent software prefetching (TSP), an implementation of software data prefetching using transparent threads. Due to its near-zero overhead, TSP enables prefetch instrumentation for all loads in a program, eliminating the need for profiling. Without any profile information, TSP achieves a 9.52% gain across 6 SPEC benchmarks, whereas conventional software prefetching guided by cache-miss profiles increases performance by only 2.47%.
Keywords
multi-threading; parallel architectures; software performance evaluation; storage management; SMT processors; SPEC CPU2000 suite; aggressive fetch partitioning; background thread flushing; background thread instruction-window partitioning; foreground thread flushing; foreground thread instruction-window partitioning; resource sharing; single-thread performance; slot prioritization; software data prefetching; transparent software prefetching; transparent threads; Degradation; Instruments; Interference; Performance gain; Prefetching; Resource management; Software performance; Surface-mount technology; Throughput; Yarn;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Architectures and Compilation Techniques, 2002. Proceedings. 2002 International Conference on
ISSN
1089-795X
Print_ISBN
0-7695-1620-3
Type
conf
DOI
10.1109/PACT.2002.1105971
Filename
1105971