Title :
XFOR: Filling the Gap between Automatic Loop Optimization and Peak Performance
Author :
Fassi, Imen ; Clauss, Philippe
Author_Institution :
Fac. des Sci. de Tunis, Univ. El Manar, Tunis, Tunisia
Date :
June 29 2015-July 2 2015
Abstract :
We propose a new loop structure named "xfor" that gives programmers explicit control over the interactions between statements inside a loop nest. An xfor simultaneously represents several for-loops and several statements, and maps their respective iteration domains onto each other according to two parameters, called "grain" and "offset". Grains and offsets "stretch" and "shift" iteration domains relative to an implicit, global referential domain. We show that this programming structure fills important optimization gaps left by automatic loop optimizers. We highlight five such gaps: insufficient data locality optimization, excessive conditional branches in the generated code, overly verbose code with too many machine instructions, data locality optimization that results in processor stalls, and missed factorization opportunities. We describe programming strategies in which xfor-loops help produce efficient code, and present a set of benchmark programs rewritten with xfor that show significant, and sometimes dramatic, execution time speed-ups.
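The "stretch" and "shift" roles of grain and offset can be illustrated with a small sketch. It is not the authors' compiler or syntax. It is a Python emulation under one assumed reading of the abstract: iteration i of a loop with lower bound lb, grain g and offset o is placed at position o + g * (i - lb) of the referential domain, and statements that land on the same position run in loop order. The function name xfor_schedule and the tuple layout are hypothetical.

```python
def xfor_schedule(loops):
    """Return the execution order of a one-dimensional xfor-like structure.

    loops: list of (lower, upper, grain, offset) tuples, one per for-loop.
           Each loop runs its index from lower to upper - 1 with step 1.
    Result: list of (loop_label, index) pairs in execution order.

    Assumption (not taken from the paper): iteration i of a loop is mapped
    to referential position offset + grain * (i - lower).
    """
    events = []
    for label, (lower, upper, grain, offset) in enumerate(loops):
        for i in range(lower, upper):
            position = offset + grain * (i - lower)  # stretch, then shift
            events.append((position, label, i))
    # Order by referential position; ties are broken by loop label.
    events.sort()
    return [(label, i) for _, label, i in events]


# Loop 0: four iterations, grain 1, offset 0 -> positions 0, 1, 2, 3.
# Loop 1: two iterations, grain 2, offset 1  -> positions 1, 3.
order = xfor_schedule([(0, 4, 1, 0), (0, 2, 2, 1)])
print(order)
```

With these values, the second loop's two iterations are interleaved after the second and fourth iterations of the first loop, which is the kind of statement interaction the abstract says the programmer controls explicitly.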
Keywords :
parallel processing; program compilers; XFOR; automatic loop optimization; automatic loop optimizer; benchmark program; data locality optimization; machine instruction; programming strategy; shift iteration domain; stretch iteration domain; verbose code; Benchmark testing; Indexes; Optimization; Pipelines; Pluto; Radiation detectors; Syntactics; loop optimization; optimizing compilers; polyhedral model; programming
Conference_Title :
Parallel and Distributed Computing (ISPDC), 2015 14th International Symposium on
Conference_Location :
Limassol
Print_ISBN :
978-1-4673-7147-6
DOI :
10.1109/ISPDC.2015.19