• DocumentCode
    3291912
  • Title
    Initial results on the performance and cost of vector microprocessors
  • Author
    Lee, Corinna G.; deVries, D.J.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Toronto Univ., Ont., Canada
  • fYear
    1997
  • fDate
    1-3 Dec 1997
  • Firstpage
    171
  • Lastpage
    182
  • Abstract
    Increasingly wider superscalar processors are experiencing diminishing performance returns while requiring larger portions of die area dedicated to control rather than datapath. As an alternative to using these processors to exploit parallelism effectively, we are investigating the viability of using single-chip vector microprocessors. This paper presents some initial results of our investigation where we compare the performance and cost of vector microprocessors to that of aggressive, out-of-order superscalar microprocessors. On the performance side, we show that vector processors are able to execute a highly parallel, integer-based application 1.5-7.3 times faster than superscalar processors can by exploiting parallelism more effectively. This ability stems from the use of vector instructions. Vector instructions exploit parallelism across loop iterations by implicitly re-scheduling operations and temporally localizing the parallelism. Vector instructions also reduce instruction bandwidth by more than an order of magnitude because they express an abundance of parallelism in a compact encoding. On the cost side we show that, to achieve these performance gains, highly parallel, integer-based vector microprocessors are no more costly to implement than existing in-order and out-of-order superscalar microprocessors. One reason for this is that the organization of a vector register file provides tremendous bandwidth without incurring a large area penalty. A second reason is that the control logic for issuing vector instructions is relatively simple. Both the performance gains and cost savings are possible because vector processors rely on a vectorizing compiler, rather than hardware, to detect parallelism and to express it in a compact form to the hardware. These initial results suggest that transferring this functionality to the compiler offers a tremendous performance/cost benefit.
  • Keywords
    instruction sets; microprocessor chips; parallel architectures; parallelising compilers; performance evaluation; vector processor systems; instruction level parallelism; integer-based application; loop iterations; microprocessor instruction sets; out-of-order superscalar microprocessors; parallelism; performance; single-chip vector microprocessors; superscalar processors; vector microprocessors; Bandwidth; Costs; Encoding; Hardware; Logic; Microprocessors; Out of order; Performance gain; Registers; Vector processors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Microarchitecture, 1997. Proceedings., Thirtieth Annual IEEE/ACM International Symposium on
  • Conference_Location
    Research Triangle Park, NC
  • ISSN
    1072-4451
  • Print_ISBN
    0-8186-7977-8
  • Type
    conf
  • DOI
    10.1109/MICRO.1997.645808
  • Filename
    645808