DocumentCode :
2575936
Title :
Optimization of Automatic Conversion of Serial C to Parallel OpenMP
Author :
Dheeraj, D. ; Nitish, B. ; Ramesh, Shruti
Author_Institution :
Dept. of ISE, PES Inst. of Technol., Bangalore, India
fYear :
2012
fDate :
10-12 Oct. 2012
Firstpage :
309
Lastpage :
314
Abstract :
This paper implements a technique that improves the parallel execution of auto-generated OpenMP programs by taking the architecture of the on-chip cache memory into account. It avoids false sharing in 'for' loops by generating OpenMP code that dynamically schedules chunks so that the data assigned to each core lies one data cache line size apart. The open-source parallelization tool Par4All is analyzed and enhanced with the aim of maximum hardware utilization. Several computationally intensive programs from PolyBench were tested on different architectures with different data sets, and the results show that the OpenMP code generated by the enhanced technique achieves considerable speedup.
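The false-sharing idea in the abstract can be illustrated with a minimal sketch. This is not the code Par4All or the authors generate; it only shows one way a dynamic schedule with a cache-line-sized chunk might look. The 64-byte line size, the macro names CACHE_LINE_BYTES and CHUNK_ELEMS, and the function vector_add are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed data cache line size in bytes; 64 is common but not universal. */
#define CACHE_LINE_BYTES 64

/* Number of double elements that fill one cache line under that assumption. */
#define CHUNK_ELEMS ((int)(CACHE_LINE_BYTES / sizeof(double)))

/* Hypothetical kernel: out[i] = a[i] + b[i] for 0 <= i < n. */
void vector_add(const double *a, const double *b, double *out, int n)
{
    int i;

    assert(a != NULL && b != NULL && out != NULL);

    /* Each dynamically scheduled chunk covers CHUNK_ELEMS consecutive
       elements. If out starts on a cache line boundary, every chunk maps
       to its own cache line, so two threads never write the same line.
       Without OpenMP support the pragma is ignored and the loop runs
       serially with the same result. */
    #pragma omp parallel for schedule(dynamic, CHUNK_ELEMS)
    for (i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

The guarantee holds only when the output array is aligned to the cache line size, for example by allocating it with C11 aligned_alloc; with an unaligned array, neighbouring chunks can still share a line at their boundaries. The actual line size should be queried for the target machine rather than hard-coded.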
Keywords :
C language; cache storage; message passing; parallel processing; public domain software; scheduling; OpenMP code; Par4All tool; Poly Bench; automatic program conversion; chunk scheduling; data cache line size; hardware utilization; on chip cache memory architecture; open-source parallelization tool; parallel OpenMP program; parallel execution; serial C program; Algorithm design and analysis; Cache memory; Computers; Dynamic scheduling; Memory management; Optimization; PIPS; Par4All; PoCC; PolyBench; cache line size; false sharing; on-chip cache
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2012 International Conference on
Conference_Location :
Sanya
Print_ISBN :
978-1-4673-2624-7
Type :
conf
DOI :
10.1109/CyberC.2012.59
Filename :
6384986