DocumentCode :
2996208
Title :
Compiling C/C++ SIMD Extensions for Function and Loop Vectorizaion on Multicore-SIMD Processors
Author :
Tian, Xinmin ; Saito, Hideki ; Girkar, Milind ; Preis, Serguei V. ; Kozhukhov, Sergey S. ; Cherkasov, Aleksei G. ; Nelson, Clark ; Panchenko, Nikolay ; Geva, Robert
Author_Institution :
Mobile Comput. & Compilers Software & Service Group, Intel Corp., Santa Clara, CA, USA
fYear :
2012
fDate :
21-25 May 2012
Firstpage :
2349
Lastpage :
2358
Abstract :
SIMD vectorization has received significant attention in the past decade as an important method to accelerate scientific, media, and embedded applications on SIMD architectures such as Intel® SSE, AVX, and IBM* AltiVec. However, most of the focus has been directed at loops, effectively executing their iterations on multiple SIMD lanes concurrently, relying upon program hints and compiler analysis. This paper presents a set of new C/C++ high-level vector extensions for SIMD programming, and the Intel® C++ product compiler that is extended to translate these vector extensions and produce optimized SIMD instruction sequences of vectorized functions and loops. For a function, our main idea is to vectorize the entire function for callers instead of just vectorizing loops (if any) inside the function. This poses the challenge of dealing with complicated control flow in the function body, and of matching caller and callee for SIMD vector calls while vectorizing caller functions (or loops) and callee functions. Our compilation methods for automatically compiling vector extensions are described. We present performance results of several non-trivial visual computing, computational, and simulation workloads, utilizing SIMD units through the vector extensions on Intel® Multicore 128-bit SIMD processors, and we show that significant SIMD speedups (3.07x to 4.69x) are achieved over the serial execution.
Keywords :
embedded systems; iterative methods; microprocessor chips; multiprocessing systems; parallel architectures; program compilers; AVX; C-C++ SIMD extension compilation; C-C++ high-level vector extensions; IBM AltiVec; Intel C++ product compiler; Intel SSE; Intel multicore 128-bit SIMD processors; SIMD architectures; SIMD vector calls; compiler analysis; embedded applications; function vectorizaion; iterations; loop vectorizaion; media applications; optimized SIMD instruction sequences; scientific applications; visual computing; Cloning; Graphics processing unit; Hardware; Parallel processing; Programming; Vectors; Compiler; GPU; Multicore; SIMD; Vectorization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
Conference_Location :
Shanghai
Print_ISBN :
978-1-4673-0974-5
Type :
conf
DOI :
10.1109/IPDPSW.2012.292
Filename :
6270606