Title :
Optimizing a 3D-FWT Code in a Heterogeneous Cluster of Multicore CPUs and Manycore GPUs
Author :
Bernabe, Gregorio ; Cuenca, Jerome ; Gimenez, D.
Author_Institution :
Comput. Eng. Dept., Univ. of Murcia, Murcia, Spain
Abstract :
Clusters of nodes composed of manycore GPUs and multicore CPUs are used to solve scientific problems with high computational requirements. Developing and optimizing parallel-heterogeneous codes for these systems is a complex task that requires deep knowledge of the different components of the hybrid, heterogeneous and hierarchical computational system, of the scientific problem to be solved, and of the programming paradigms needed to solve it efficiently. Techniques for the efficient development and optimization of scientific codes for these systems are therefore needed. This paper presents an analysis of the development and optimization of the 3D Fast Wavelet Transform (3D-FWT) for a heterogeneous cluster of multicore CPUs and GPUs. Different parallel programming paradigms (message passing, shared memory and SIMD GPU) are combined to fully exploit the computing capacity of the different computational elements of the cluster. The result is an efficient combination of basic codes previously developed for individual components (individual nodes, multicore or GPU) and a significant reduction in the compression time of long video sequences.
Keywords :
graphics processing units; image sequences; message passing; parallel programming; shared memory systems; video signal processing; wavelet transforms; 3D-FWT code; 3D-fast wavelet transform; SIMD GPU; compression time reduction; computational elements; computing capacity; heterogeneous cluster; heterogeneous computational system; hierarchical computational system; hybrid computational system; long video sequences; manycore GPU; message passing; multicore CPU; parallel programming paradigms; parallel-heterogeneous codes; scientific codes; shared memory; Graphics processing units; Image resolution; Kernel; Multicore processing; Optimization; 3D-FWT; autotuning engine; cluster; manycore GPUs; multicore CPUs;
Conference_Title :
Computer Architecture and High Performance Computing (SBAC-PAD), 2013 25th International Symposium on
Conference_Location :
Porto de Galinhas
Print_ISBN :
978-1-4799-2927-6
DOI :
10.1109/SBAC-PAD.2013.26