DocumentCode :
3245568
Title :
Exploiting Parallelism through High Level Optimization on a Heterogeneous Multicore SoC
Author :
Yan, Ming ; Zhao, Peng ; Yang, Ziyu ; Li, Sikun
Author_Institution :
Sch. of Comput., Nat. Univ. of Defense Technol., Changsha, China
fYear :
2009
fDate :
8-11 Dec. 2009
Firstpage :
527
Lastpage :
534
Abstract :
This paper describes a heterogeneous multicore SoC named EVMP-SoC, which is composed of a RISC host processor and two slightly different SIMD synergistic processors that are specially optimized for embedded visual media applications. Using on-chip memory and a multi-channel memory access unit, the chip achieves several levels of parallelism: single-instruction-stream multiple-data-stream (SIMD) data-level parallelism (DLP), multicore thread-level parallelism (TLP), and memory tile pipeline parallelism. We used an affine transformation framework called PLuTo to optimize code for EVMP-SoC and explored multiple levels of parallelism on this chip. We found that, lacking a processor performance model, the general polyhedral affine transformation framework could not generate efficient parallel code for heterogeneous architectures. Tile scheduling and pipelining techniques are adopted to make full use of the processor cores and memory bandwidth. The experimental results show that tile scheduling and pipelining are effective, and the chip achieved a very good speedup after all the parallel optimizations. Finally, a case study (a typical application of three-dimensional reconstruction from multiple images) demonstrated the chip's high efficiency and availability.
Keywords :
affine transforms; multi-threading; pipeline processing; reduced instruction set computing; scheduling; system-on-chip; EVMP-SoC; PLuTo framework; RISC host processor; SIMD synergistic processor; chip memory; code optimization; data-level parallelism; heterogeneous architecture; heterogeneous multicore SoC; high level optimization; memory bandwidth; memory tile pipeline parallelism; multichannel memory access unit; multicore thread-level parallelism; multiple level parallelism; polyhedral affine transformation; single-instructionstream-multiple-datastream; tile pipelining; tile scheduling; visual media application; Acceleration; Availability; Bandwidth; Multicore processing; Nonhomogeneous media; Pipeline processing; Pluto; Processor scheduling; Reduced instruction set computing; Tiles;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2009 15th International Conference on
Conference_Location :
Shenzhen
ISSN :
1521-9097
Print_ISBN :
978-1-4244-5788-5
Type :
conf
DOI :
10.1109/ICPADS.2009.7
Filename :
5395336