Title :
A compilation technique and performance profits for VLIW with heterogeneous vectors
Author :
Diken, Erkan ; Jozwiak, Lech
Author_Institution :
Eindhoven Univ. of Technol., Eindhoven, Netherlands
Abstract :
In numerous mobile applications involving complex video, image, signal, communication or security processing, massive parallelism is mainly available in the form of data-level parallelism (DLP). However, the kinds and amount of DLP vary between applications because of their different computational characteristics. In contrast, most processors today include single-width SIMD (vector) hardware to exploit DLP. Single-width SIMD architectures may not be optimal for applications with varying DLP, and they may cause performance and energy inefficiency. We propose using VLIW processors with multiple native vector widths to better serve applications with changing DLP. This paper focuses on short SIMD code generation. More specifically, we target generating 32-bit SIMD code for the native 32-bit wide vector units of our example processor. In this way, we improve the performance of compiler-generated SIMD code by reducing the number of overhead operations. Experimental results demonstrate that our methodology, implemented in the compiler, reduces the number of operations in synthetic benchmarks by up to 40%.
Keywords :
parallel processing; program compilers; software architecture; DLP; SIMD code generation; VLIW processor; compilation technique; data-level parallelism; heterogeneous vector; performance profit; single-width SIMD architecture; very long instruction word architecture; Benchmark testing; Computer architecture; Hardware; Mobile communication; Parallel processing; Program processors; VLIW
Conference_Title :
Embedded Computing (MECO), 2015 4th Mediterranean Conference on
Conference_Location :
Budva
Print_ISBN :
978-1-4799-8999-7
DOI :
10.1109/MECO.2015.7181860