Regional Information Center for Science and Technology - Compiling C/C++ SIMD Extensions for Function and Loop Vectorizaion on Multicore-SIMD Processors

DocumentCode :

2996208

Title :

Compiling C/C++ SIMD Extensions for Function and Loop Vectorizaion on Multicore-SIMD Processors

Author :

Tian, Xinmin ; Saito, Hideki ; Girkar, Milind ; Preis, Serguei V. ; Kozhukhov, Sergey S. ; Cherkasov, Aleksei G. ; Nelson, Clark ; Panchenko, Nikolay ; Geva, Robert

Author_Institution :

Mobile Comput. & Compilers Software & Service Group, Intel Corp., Santa Clara, CA, USA

Year :

2012

Date :

21-25 May 2012

Firstpage :

2349

Lastpage :

2358

Abstract :

SIMD vectorization has received significant attention in the past decade as an important method to accelerate scientific applications, media and embedded applications on SIMD architectures such as Intel® SSE, AVX, and IBM AltiVec. However, most of the focus has been directed at loops, effectively executing their iterations on multiple SIMD lanes concurrently relying upon program hints and compiler analysis. This paper presents a set of new C/C++ high-level vector extensions for SIMD programming, and the Intel® C++ product compiler that is extended to translate these vector extensions and produce optimized SIMD instruction sequences of vectorized functions and loops. For a function, our main idea is to vectorize the entire function for callers instead of just vectorizing loops (if any) inside the function. It poses the challenge of dealing with complicated control-flow in the function body, and matching caller and callee for SIMD vector calls while vectorizing caller functions (or loops) and callee functions. Our compilation methods for automatically compiling vector extensions are described. We present performance results of several non-trivial visual computing, computational, and simulation workloads, utilizing SIMD units through the vector extensions on Intel® Multicore 128-bit SIMD processors, and we show that significant SIMD speedups (3.07x to 4.69x) are achieved over the serial execution.
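
The abstract does not show the syntax of the extensions it describes, so the following is only a minimal sketch of the general idea (vectorizing a whole function for its callers, plus the loop that calls it), using the standard OpenMP `declare simd` and `simd` pragmas as a stand-in. The function names `scale_and_clamp` and `apply_kernel` are hypothetical, and the branch in the callee is there only to illustrate control flow inside a vectorized function body. Compilers ignore the pragmas unless OpenMP SIMD support is enabled, in which case the code still runs serially with the same results.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar kernel. The pragma asks the compiler to also
 * generate a SIMD variant that processes several x values per call. */
#pragma omp declare simd
float scale_and_clamp(float x, float limit)
{
    float y = 2.0f * x;
    if (y > limit)      /* control flow the vector variant must handle */
        y = limit;
    return y;
}

/* Hypothetical caller. The loop is marked for vectorization, so the
 * compiler may replace the scalar call with a call to the SIMD variant. */
void apply_kernel(const float *in, float *out, size_t n, float limit)
{
    assert(in != NULL && out != NULL);

    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        out[i] = scale_and_clamp(in[i], limit);
}
```

Whether a vector variant is actually generated and called depends on the compiler and its flags; the sketch only shows how caller and callee are annotated so that they can be matched.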

Keywords :

embedded systems; iterative methods; microprocessor chips; multiprocessing systems; parallel architectures; program compilers; AVX; C-C++ SIMD extension compilation; C-C++ high-level vector extensions; IBM AltiVec; Intel C++ product compiler; Intel SSE; Intel multicore 128-bit SIMD processors; SIMD architectures; SIMD vector calls; compiler analysis; embedded applications; function vectorization; iterations; loop vectorization; media applications; optimized SIMD instruction sequences; scientific applications; visual computing; Cloning; Graphics processing unit; Hardware; Parallel processing; Programming; Vectors; Compiler; GPU; Multicore; SIMD; Vectorization;

Language :

English

Publisher :

IEEE

Conference_Title :

Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International

Conference_Location :

Shanghai

Print_ISBN :

978-1-4673-0974-5

Type :

conf

DOI :

10.1109/IPDPSW.2012.292

Filename :

6270606

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2996208