Title :
Inexpensive throughput enhancement in small-scale embedded microprocessors with block multithreading: extensions, characterization, and tradeoffs
Author :
Haskins, John W., Jr. ; Hirst, Kevin R. ; Skadron, Kevin
Author_Institution :
Dept. of Comput. Sci., Virginia Univ., Charlottesville, VA, USA
Abstract :
This paper examines differential multithreading (DMT) as an attractive organization for coping with pipeline stalls in small-scale processors like those used in embedded environments. The paper proposes extensions to block multithreading to cope with data- and instruction-cache misses, and then explores some of the design tradeoffs that these extensions enable. Results show that DMT boosts throughput substantially and can in fact replace dynamic branch prediction or data forwarding, or can be used to reduce the sizes of the instruction and data caches. Block multithreading, described by Farrens and Pleszkun (1991), is a technique to achieve high throughput from a single-issue microarchitecture by switching among multiple instruction streams in response to pipeline stalls. Although single-issue organizations are no longer used in high-performance processors, they remain common even in newly designed processors for small-scale, embedded devices. Like the original description of block multithreading, DMT uses auxiliary pipeline registers to save the state of in-flight instructions. By coping with data- and instruction-cache misses, however, our implementation can attack all the major sources of pipeline stalls. Overall, we find that DMT can substantially lower the cost and complexity of microprocessors for embedded environments, especially environments for which throughput rather than speed is the primary concern. In addition, DMT is an attractive prospect for use in chip-multiprocessing environments.
Keywords :
cache storage; embedded systems; fault tolerant computing; microprocessor chips; multi-threading; pipeline processing; auxiliary pipeline registers; block multithreading; chip-multiprocessing environments; data-cache misses; design tradeoffs; differential multithreading; in-flight instructions; instruction-cache misses; multiple instruction stream switching; pipeline stall; single-issue microarchitecture; small-scale embedded microprocessors; throughput enhancement; Computer science; Games; Hardware; Microprocessors; Multithreading; OFDM modulation; Pipelines; Switches; Throughput; Yarn;
Conference_Title :
Performance, Computing, and Communications, 2001. IEEE International Conference on.
Conference_Location :
Phoenix, AZ
Print_ISBN :
0-7803-7001-5
DOI :
10.1109/IPCCC.2001.918669