DocumentCode :
3244851
Title :
Reducing 3D wavelet transform execution time through the Streaming SIMD Extensions
Author :
Bernabé, Gregorio ; García, José M. ; González, José
Author_Institution :
Dpto. Ingenieria y Tecnologia de Computadores, Murcia Univ., Spain
fYear :
2003
fDate :
5-7 Feb. 2003
Firstpage :
49
Lastpage :
56
Abstract :
This paper focuses on reducing the execution time of video compression algorithms based on the 3D wavelet transform. We present several optimizations that could not be applied by the compiler due to the characteristics of the algorithm. First, we use the Streaming SIMD Extensions (SSE) for some of the dimensions of the sequence (y and time) in order to reduce the number of floating point instructions, exploiting data level parallelism. Then, we apply loop unrolling and data prefetching to critical parts of the code, and finally the algorithm is vectorized by columns, allowing the use of SIMD instructions for the y dimension. Results show improvements of up to 1.54 over a version compiled with the maximum optimizations of the Intel C/C++ compiler. Our experiments also show that allowing the compiler to perform some of these optimizations (i.e. automatic code vectorization) causes a performance slowdown, which demonstrates the effectiveness of our optimizations.
Keywords :
data compression; optimising compilers; parallel processing; parallelising compilers; performance evaluation; program control structures; storage management; transform coding; video coding; wavelet transforms; 3D wavelet transform; SSE; Streaming SIMD Extensions; automatic code vectorization; compiler; data level parallelism; data prefetching; execution time reduction; floating point instructions; loop unrolling; optimizations; performance slowdown; streaming SIMD extensions; video compression algorithms; Biomedical imaging; Constraint optimization; Discrete wavelet transforms; Image coding; Optimizing compilers; Parallel processing; Prefetching; Streaming media; Video compression; Wavelet transforms;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel, Distributed and Network-Based Processing, 2003. Proceedings. Eleventh Euromicro Conference on
Conference_Location :
Genova, Italy
ISSN :
1066-6192
Print_ISBN :
0-7695-1875-3
Type :
conf
DOI :
10.1109/EMPDP.2003.1183565
Filename :
1183565