DocumentCode :
676340
Title :
Efficient execution of augmented reality applications on mobile programmable accelerators
Author :
Park, Jason Jong Kyu ; Park, Yongjun ; Mahlke, Scott
Author_Institution :
Adv. Comput. Archit. Lab., Univ. of Michigan, Ann Arbor, MI, USA
fYear :
2013
fDate :
9-11 Dec. 2013
Firstpage :
176
Lastpage :
183
Abstract :
Mobile devices are ubiquitous in daily life. From smartphones to tablets, customers constantly demand richer user experiences through more visual and interactive interfaces with prolonged battery life. To meet these demands, accelerators are commonly adopted in systems-on-chip (SoCs) for various applications. The coarse-grained reconfigurable architecture (CGRA) is a promising solution that accelerates hot loops with software pipelining. Although CGRAs have been shown to support multimedia applications efficiently, more interactive applications such as augmented reality put much more pressure on performance and energy requirements. In this paper, we extend a heterogeneous CGRA to provide SIMD capabilities, which significantly improves performance and energy efficiency for augmented reality applications. We show that when data-level parallelism (DLP) can be exploited, it is more beneficial to run it natively on SIMD than to transform it into instruction-level parallelism (ILP) and run it on the CGRA. To utilize this property, multiple processing elements in the CGRA are grouped to form homogeneous SIMD cores. To reduce the hardware overhead of fetching and replicating the configuration in SIMD mode, we propose a ring network and a recycle buffer that pass the configuration around and temporarily store it, with minimal impact on throughput. We also modify the memory access units and memory banks to support split memory transactions with forwarding for handling SIMD data accesses. To adapt to the proposed extension, we introduce a compilation technique for SIMD-mode code generation that maximizes the resource utilization of each SIMD core. Experimental results show that it is possible to achieve an average performance improvement of 17.6% while saving 16.9% energy over the heterogeneous CGRA.
Keywords :
augmented reality; mobile computing; parallel processing; performance evaluation; pipeline processing; power aware computing; program compilers; program control structures; reconfigurable architectures; smart phones; system-on-chip; CGRA; DLP; ILP; SIMD capabilities; SIMD core resource utilization; SIMD cores; SIMD data access handling; SIMD mode code generation; SoC; augmented reality applications; battery life; coarse-grained reconfigurable architecture; data level parallelism; energy efficiency; energy requirements; fetching configuration; heterogeneous CGRA; hot loops; instruction level parallelism; interactive interface; memory access units; memory banks; mobile devices; mobile programmable accelerators; multimedia applications; performance improvement; recycle buffer; replicating configuration; resource utilization; ring network; smartphones; software pipelining; system-on-chip; tablets; user experiences; visual interface; Acceleration; Augmented reality; Buffer storage; Computer architecture; Recycling; Schedules; Software;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Field-Programmable Technology (FPT), 2013 International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4799-2199-7
Type :
conf
DOI :
10.1109/FPT.2013.6718350
Filename :
6718350