DocumentCode :
2518110
Title :
Simple vector microprocessors for multimedia applications
Author :
Lee, Corinna G. ; Stoodley, Mark G.
Author_Institution :
Dept. of Electr. & Comput. Eng., Toronto Univ., Ont., Canada
fYear :
1998
fDate :
30 Nov-2 Dec 1998
Firstpage :
25
Lastpage :
36
Abstract :
In anticipation of the emergence of multimedia applications as an important workload, microprocessor companies have augmented their instruction-set architectures with short vector extensions, thus adding basic vector hardware to state-of-the-art superscalar processors. Although a vector architecture may be a good match for multimedia applications, there is growing evidence that the control logic for increasingly complex superscalar processors is difficult to implement. Rather than combining a complex superscalar core with short, wide vector hardware, we propose using a much simpler processor design that is similar to traditional vector computers, with long vectors and simple control logic for instruction issue. Such a design would use the bulk of its transistors and die area for datapath and registers, and thus lessen the time required to design, implement, and verify control. In this paper we present data that quantifies this trading of control transistors for datapath and register transistors. We demonstrate that a 2-way, in-order vector processor with a vector length of 64 and a vector width of 8 requires no more die area, and possibly significantly less area, than a 4-way, out-of-order superscalar processor with short vector extensions. Furthermore, we show that the simple long vector processor is, on average, 2.7 times faster executing multimedia applications than the superscalar processor, and 1.6 times faster than one with short vector extensions. To explain the reasons for the higher performance, we analyze execution time in terms of dynamic operation count and cycles per operation (CPO). A vector processor executes fewer operations by using vector instructions to stripmine a loop. Moreover, a long vector processor achieves a lower CPO by effectively using parallelism at both the operation and the instruction levels. Thus, by reducing both terms of the CPO equation, the simple long vector processor achieves greater performance.
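The abstract's decomposition of execution time into dynamic operation count and CPO, and its remark that stripmining lowers the operation count, can be illustrated with a minimal sketch. This is not the authors' code or simulator: the names `stripmined_add`, `execution_cycles`, and `MAX_VL` are hypothetical, each strip is assumed to count as one vector operation, and the Python list arithmetic only stands in for what vector hardware would do.

```python
# Illustrative sketch only; names and the one-op-per-strip cost model are assumptions.
MAX_VL = 64  # maximum vector length, matching the 64-element vectors in the abstract


def stripmined_add(a, b, max_vl=MAX_VL):
    """Add two equal-length lists strip by strip.

    Returns (result, vector_ops), where vector_ops counts one
    hypothetical vector add per strip of at most max_vl elements.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    result = []
    vector_ops = 0
    for start in range(0, len(a), max_vl):
        vl = min(max_vl, len(a) - start)  # the final strip may be shorter
        strip_a = a[start:start + vl]
        strip_b = b[start:start + vl]
        # Stand-in for a single vector add covering vl elements.
        result.extend(x + y for x, y in zip(strip_a, strip_b))
        vector_ops += 1
    return result, vector_ops


def execution_cycles(op_count, cpo):
    """Cycles = dynamic operation count x cycles per operation (CPO)."""
    return op_count * cpo
```

Under these assumptions, a 200-element loop needs 200 scalar adds but only 4 strips (64 + 64 + 64 + 8 elements), so the operation-count term shrinks even before any change in CPO is considered.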
Keywords :
microprocessor chips; multimedia systems; vector processor systems; instruction-set architectures; multimedia applications; superscalar processor; superscalar processors; vector microprocessors; Application software; Computer aided instruction; Hardware; Logic design; Microprocessors; Out of order; Performance analysis; Process design; Registers; Vector processors
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Microarchitecture, 1998. MICRO-31. Proceedings. 31st Annual ACM/IEEE International Symposium on
Conference_Location :
Dallas, TX
ISSN :
1072-4451
Print_ISBN :
0-8186-8609-X
Type :
conf
DOI :
10.1109/MICRO.1998.742766
Filename :
742766