DocumentCode
3082521
Title
Estimating the WCET of GPU-Accelerated Applications Using Hybrid Analysis
Author
Betts, Alexander ; Donaldson, Alastair
Author_Institution
Dept. of Comput., Imperial Coll. London, London, UK
fYear
2013
fDate
9-12 July 2013
Firstpage
193
Lastpage
202
Abstract
The massive parallelism offered by Graphics Processing Units (GPUs) is now routinely exploited to accelerate computationally intensive tasks in a wide variety of application domains. Efficient GPU programming in languages such as CUDA and OpenCL requires careful application of hand optimisations to exploit parallelism and locality while minimising synchronisation. The effectiveness of such optimisations can be highly dependent on workload and the structure of input data, making it difficult to assess performance in general by testing alone. To address this, we study the problem of estimating the Worst-Case Execution Time (WCET) of GPU-accelerated applications. We propose the use of hybrid WCET analysis whereby execution times of small program segments are deduced from traces of execution and a calculation backend derived from the Control Flow Graph (CFG) produces a WCET estimate. Standard techniques which construct a CFG from a binary cannot be applied directly to GPU code because they miss implicit execution paths that arise due to the way branches are implemented in hardware; we present a solution using standard compiler analysis. We further describe how to extend the basic hybrid WCET analysis of sequential code so that concurrent timing effects in the GPU execution model are incorporated. We have implemented our analysis as a tool built on top of the GPGPU-sim open source simulator. We evaluate our tool using a set of benchmarks drawn from the CUDA SDK: results show that effective modelling of concurrency is key to reducing pessimism in the WCET calculation.
Keywords
concurrency control; flow graphs; graphics processing units; minimisation; parallel architectures; program compilers; CFG; CUDA SDK; GPGPU-sim open source simulator; GPU programming; GPU-accelerated application; OpenCL; benchmark; compiler analysis; concurrency modelling; control flow graph; graphics processing units; hand optimisation; hybrid WCET analysis; hybrid analysis; synchronisation minimisation; worst-case execution time estimation; Analytical models; Graphics processing units; Hardware; Instruction sets; Instruments; Programming; Standards;
fLanguage
English
Publisher
IEEE
Conference_Titel
Real-Time Systems (ECRTS), 2013 25th Euromicro Conference on
Conference_Location
Paris
Type
conf
DOI
10.1109/ECRTS.2013.29
Filename
6602100