DocumentCode :
2704872
Title :
Efficient SIMD code generation for runtime alignment and length conversion
Author :
Wu, Peng ; Eichenberger, Alexandre E. ; Wang, Amy
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
fYear :
2005
fDate :
20-23 March 2005
Firstpage :
153
Lastpage :
164
Abstract :
When generating codes for today's multimedia extensions, one of the major challenges is to deal with memory alignment issues. While hand programming still yields best performing SIMD codes, it is both time consuming and error prone. Compiler technology has greatly improved, including techniques that simdize loops with misaligned accesses by automatically rearranging misaligned memory streams in registers. Current techniques are applicable to runtime alignments, but they aggressively reduce the alignment overhead only when all alignments are known at compile time. This paper presents two major enhancements to the state of the art, improving both performance and coverage. First, we propose a novel technique to simdize loops with runtime alignment nearly as efficiently as those with compile-time misalignment. Runtime alignment is pervasive in real applications because it is either part of the algorithms, or it is an artifact of the compiler's inability to extract accurate alignment information from complex applications. Second, we incorporate length conversion operations, e.g., conversions between data of different sizes, into the alignment handling framework. Length conversions are pervasive in multimedia applications where mixed integer types are often used. Supporting length conversion can greatly improve the coverage of simdizable loops. Experimental results indicate that our runtime alignment technique achieves a 19% to 32% speedup increase over prior art for a benchmark stressing the impact of misaligned data. We also demonstrate speedup factors of up to 8.11 for real benchmarks over sequential execution.
Keywords :
instruction sets; parallel processing; program compilers; program control structures; SIMD code generation; length conversion operation; program compiler; runtime alignment; simdizable loop; Art; Costs; Data mining; Graphics; Hardware; Laboratories; Memory management; Registers; Runtime; Streaming media;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Code Generation and Optimization, 2005. CGO 2005. International Symposium on
Print_ISBN :
0-7695-2298-X
Type :
conf
DOI :
10.1109/CGO.2005.18
Filename :
1402085