DocumentCode
2575936
Title
Optimization of Automatic Conversion of Serial C to Parallel OpenMP
Author
Dheeraj, D. ; Nitish, B. ; Ramesh, Shruti
Author_Institution
Dept. of ISE, PES Inst. of Technol., Bangalore, India
fYear
2012
fDate
10-12 Oct. 2012
Firstpage
309
Lastpage
314
Abstract
This paper implements a technique that improves the parallel execution of auto-generated OpenMP programs by taking the architecture of on-chip cache memory into account. It avoids false sharing in for-loops by generating OpenMP code that dynamically schedules chunks so that each core's data lies one data cache line size apart. The open-source parallelization tool Par4All was analyzed and harnessed to maximize hardware utilization. Several computationally intensive programs from PolyBench were tested on different architectures with different data sets, and the results show that the OpenMP code generated by the enhanced technique achieves considerable speedup.
Keywords
C language; cache storage; message passing; parallel processing; public domain software; scheduling; OpenMP code; Par4All tool; Poly Bench; automatic program conversion; chunk scheduling; data cache line size; hardware utilization; on chip cache memory architecture; open-source parallelization tool; parallel OpenMP program; parallel execution; serial C program; Algorithm design and analysis; Cache memory; Computers; Dynamic scheduling; Memory management; Optimization; PIPS; Par4All; PoCC; PolyBench; cache line size; false sharing; on-chip cache
fLanguage
English
Publisher
IEEE
Conference_Titel
Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2012 International Conference on
Conference_Location
Sanya
Print_ISBN
978-1-4673-2624-7
Type
conf
DOI
10.1109/CyberC.2012.59
Filename
6384986
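Illustration
The abstract describes avoiding false sharing in for-loops by dynamically scheduling chunks that sit one data cache line apart. The following is a minimal sketch of that idea, not the code that Par4All or the authors' enhanced tool actually emits. The 64-byte line size, the helper name `chunk_for_line`, and the example loop `scale` are all assumptions made for illustration. The sketch also assumes the array starts on a cache line boundary; otherwise neighbouring chunks can still share a line.

```c
#include <stddef.h>

/* Assumed data cache line size in bytes. 64 is common on x86-64, but it is
   an assumption here, not a value taken from the paper. */
#define CACHE_LINE_BYTES 64

/* Hypothetical helper: number of elements that fill one cache line.
   Returns at least 1 so the result is always a valid OpenMP chunk size. */
static size_t chunk_for_line(size_t line_bytes, size_t elem_bytes)
{
    if (line_bytes == 0 || elem_bytes == 0)
        return 1;
    size_t n = line_bytes / elem_bytes;
    return n > 0 ? n : 1;
}

/* Example loop (hypothetical): scale a vector in place.
   With schedule(dynamic, chunk), each thread is handed `chunk` consecutive
   iterations at a time. If `a` is cache-line aligned, every chunk covers
   exactly one line, so two threads never write to the same line. */
static void scale(double *a, size_t n, double k)
{
    int chunk = (int)chunk_for_line(CACHE_LINE_BYTES, sizeof(double));
    long i;

    #pragma omp parallel for schedule(dynamic, chunk)
    for (i = 0; i < (long)n; i++)
        a[i] *= k;
}
```

Built with an OpenMP-enabled compiler (for example `-fopenmp` with GCC), the loop runs in parallel; without OpenMP the pragma is ignored and the loop runs serially with the same result. For `double` elements and a 64-byte line, the chunk size works out to 8 iterations.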