Title :
A Compile-Time Cost Model for Automatic OpenMP Decoupled Software Pipelining Parallelization
Author :
Xiaoxian Liu ; Rongcai Zhao ; Lin Han
Author_Institution :
State Key Lab. of Math. Eng. & Adv. Comput., Zhengzhou, China
Abstract :
The prevalence of control flow, recursive data structures, and general pointer accesses in ordinary programs renders traditional automatic parallelization techniques unsuitable. OpenMP Decoupled Software Pipelining (DSWP) has been proposed to exploit the pipeline parallelism latent in such programs, which traditional techniques cannot handle. Although a cost model is important for evaluating compiler transformations, guiding the compiler's optimization process, and achieving load balancing, existing cost models are too simple to estimate the profit of OpenMP parallelization, especially for DSWPed loops. In this paper, we propose a compile-time cost model for estimating the profit of automatic parallelization by extending the existing cost model in the loop nest optimizer (LNO) phase of Open64. Moreover, we improve the OpenMP DSWP algorithm based on our cost model, which increases the execution efficiency of the automatically parallelized code. We evaluate our cost model on loops containing complex memory access patterns and control flow structures, which traditional techniques cannot handle, and on the NAS Parallel Benchmarks (NPB) 3.3.1. With our model, the generated DSWPed loops and programs show evident performance improvement.
Keywords :
application program interfaces; optimising compilers; pipeline processing; program control structures; public domain software; software cost estimation; DSWPed loops; LNO phase; NAS parallel benchmark 3.3.1; NPB 3.3.1; Open64 loop nest optimizer phase; OpenMP DSWP algorithm; OpenMP profit evaluation; automatic OpenMP decoupled software pipelining parallelization; automatic parallelization execution efficiency improvement; automatic parallelization profit estimation; compile-time cost model; compiler optimization process; compiler transformation evaluation; complex memory access patterns; control flow structure; general pointer access; load balancing; performance improvement; recursive data structures; Algorithm design and analysis; Computational modeling; Load modeling; Mathematical model; Parallel processing; Partitioning algorithms; Synchronization; Automatic Parallelization; OpenMP; Decoupled Software Pipelining; Cost Model;
Conference_Title :
Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), 2013 14th ACIS International Conference on
Conference_Location :
Honolulu, HI
DOI :
10.1109/SNPD.2013.8