Title :
Understanding the Performance Benefit of Asynchronous Data Transfers in OpenCL Programs Executing on Media Processors
Author :
Nagendra Gulur;Suriya Narayanan L.
Author_Institution :
Texas Instruments, Bangalore, India
Abstract :
In this work, we study the performance benefits of using asynchronous data transfers in OpenCL programs executing on media processors. Asynchronous data transfers are typically implemented by use of Direct Memory Access (DMA) engines that can be programmed to transfer data from one memory location to another. Asynchronous transfers can free up processing cores from managing data transfers and having to wait for transfer completion. In a typical programming model using asynchronous transfers, the kernel uses a double-buffering scheme wherein data is moved to/from one buffer ("scratch-pad") while the core operates on the other buffer. Intuitively, this model allows the cost of data transfers to be hidden or overlapped with computation. This is in contrast with accessing data "through the cache". Here, the core executes loads and stores to access the required data. Due to the inherent spatial and temporal locality of accesses, the cache hierarchy plays a significant role in mitigating the cost/delay associated with frequent off-chip accesses. In this work, we seek to understand the performance gains expected with use of asynchronous transfers in a typical media processor. To do so, we first develop a simple yet insightful model of performance that helps quantify the benefits of asynchronous transfers over cache-based accesses in such processors. Next, we experimentally evaluate these programming styles on a variety of kernels executing on the Texas Instruments Keystone-II multi-core DSP platform. We observe that asynchronous data transfers can improve performance of image-processing OpenCL programs by as much as 5×, with an average improvement of 40%.
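The double-buffering scheme described in the abstract can be illustrated with a minimal C sketch. It is not the authors' code or their performance model: `dma_start` and `dma_wait` are hypothetical stand-ins (a synchronous `memcpy` and a no-op) for a real DMA engine, `TILE` is an arbitrary buffer size, and only the input side is double-buffered. The two timing functions encode a generic pipelining assumption, namely that with full overlap each steady-state tile costs the larger of its transfer time and its compute time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TILE 4  /* elements per scratch-pad buffer (hypothetical size) */

/* Hypothetical stand-in for starting a DMA transfer. A real engine would
   return immediately and finish in the background; memcpy is synchronous. */
static void dma_start(int *dst, const int *src, size_t count) {
    memcpy(dst, src, count * sizeof *dst);
}

/* Hypothetical stand-in for waiting on transfer completion (no-op here). */
static void dma_wait(void) {}

/* Writes 2 * in[i] to out[i], fetching the input one tile at a time into
   two alternating scratch-pad buffers. n must be a multiple of TILE. */
void scale_double_buffered(const int *in, int *out, size_t n) {
    int scratch[2][TILE];
    assert(n % TILE == 0);
    size_t tiles = n / TILE;
    if (tiles == 0) return;

    dma_start(scratch[0], in, TILE);            /* prefetch the first tile */
    for (size_t t = 0; t < tiles; ++t) {
        const int *cur = scratch[t % 2];
        dma_wait();                             /* tile t is now in cur */
        if (t + 1 < tiles)                      /* fetch tile t+1 into the other buffer */
            dma_start(scratch[(t + 1) % 2], in + (t + 1) * TILE, TILE);
        for (size_t i = 0; i < TILE; ++i)       /* compute on the current tile */
            out[t * TILE + i] = 2 * cur[i];
    }
}

/* Assumed model: every tile pays transfer time x, then compute time c. */
double time_serialized(size_t tiles, double x, double c) {
    return (double)tiles * (x + c);
}

/* Assumed model: the first transfer and the last compute are exposed;
   in between, each step costs max(x, c) because the two overlap. */
double time_overlapped(size_t tiles, double x, double c) {
    if (tiles == 0) return 0.0;
    double step = x > c ? x : c;
    return x + (double)(tiles - 1) * step + c;
}
```

Under these assumptions, four tiles with a transfer time of 3 and a compute time of 5 (arbitrary units) take 32 when serialized and 23 when overlapped. The gain shrinks as one of the two costs dominates, which is the kind of trade-off a performance model has to capture when comparing against cache-based access.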
Keywords :
"Kernel","Data transfer","Program processors","Media","Computational modeling","Image processing","Engines"
Conference_Title :
High Performance Computing (HiPC), 2015 IEEE 22nd International Conference on
DOI :
10.1109/HiPC.2015.14