DocumentCode :
3525415
Title :
On the efficiency of reductions in μ-SIMD media extensions
Author :
Corbal, Jesus ; Espasa, Roger ; Valero, Mateo
Author_Institution :
Dept. d'Arquitectura de Computadors, Univ. Politecnica de Catalunya, Barcelona, Spain
fYear :
2001
fDate :
2001
Firstpage :
83
Lastpage :
94
Abstract :
Many important multimedia applications contain a significant fraction of reduction operations. Although multimedia applications are generally characterized by high amounts of Data Level Parallelism, reductions and accumulations are difficult to parallelize and show poor tolerance to increases in instruction latency. This is especially significant for μ-SIMD extensions such as MMX or AltiVec. To overcome the problem of reductions in μ-SIMD ISAs, designers tend to include more and more complex instructions able to deal with the most common forms of reductions in multimedia. As the number of processor pipeline stages grows, the number of cycles needed to execute these multimedia instructions increases with every processor generation, severely compromising performance. The paper presents an in-depth discussion of how reductions/accumulations are performed in current μ-SIMD architectures and evaluates the performance trade-offs for near-future highly aggressive superscalar processors with three different styles of μ-SIMD extensions. We compare an MMX-like alternative to an MDMX-like extension that has packed accumulators to attack the reduction problem, and we also compare it to MOM, a matrix register ISA. We show that while packed accumulators present several advantages, they introduce artificial recurrences that severely degrade performance for processors with a high number of registers and long-latency operations. On the other hand, the paper demonstrates that longer SIMD media extensions such as MOM can take great advantage of accumulators by exploiting the associative parallelism implicit in reductions.
Keywords :
instruction sets; multimedia systems; parallel architectures; pipeline processing; μ-SIMD media extension reductions; AltiVec; Data Level Parallelism; MDMX-like extension; MMX; MMX-like alternative; SIMD media extensions; associative parallelism; complex instructions; highly aggressive superscalar processors; matrix register ISA; multimedia applications; multimedia instructions; packed accumulators; performance trade-offs; processor generation; processor pipeline stages; reduction operations; Delay; Electric breakdown; Engines; Frequency; Instruction sets; Logic; Message-oriented middleware; Pipelines; Process design; Registers;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel Architectures and Compilation Techniques, 2001. Proceedings. 2001 International Conference on
Conference_Location :
Barcelona
ISSN :
1089-796X
Print_ISBN :
0-7695-1363-8
Type :
conf
DOI :
10.1109/PACT.2001.953290
Filename :
953290