Title :
Real-time face detection in Full HD images exploiting both embedded CPU and GPU
Author :
Chanyoung Oh ; Saehanseul Yi ; Youngmin Yi
Author_Institution :
Sch. of Electr. & Comput. Eng., Univ. of Seoul, Seoul, South Korea
fDate :
June 29 2015-July 3 2015
Abstract :
CPU-GPU heterogeneous systems have become a mainstream platform in both the server and embedded domains, with an ever-increasing demand for powerful accelerators. In this paper, we present parallelization techniques that exploit both the data and task parallelism of an LBP-based face detection algorithm on an embedded heterogeneous platform. By running tasks in a pipelined fashion on the multicore CPU and offloading a data-parallel task to the GPU, we achieve 29 fps for Full HD inputs on the Tegra K1 platform, which integrates a quad-core Cortex-A15 CPU and a CUDA-capable 192-core GPU. This corresponds to a 5.54x speedup over a sequential version and a 1.69x speedup over the GPU-only implementation.
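The pipelined task parallelism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three stage functions (`preprocess`, `detect`, `postprocess`), the queue depth, and the helper names are hypothetical stand-ins, and `detect` only marks where a data-parallel GPU kernel would be launched. Python threads share the interpreter lock, so the sketch shows the structure of the pipeline rather than its speedup.

```python
import queue
import threading

SENTINEL = object()  # unique marker for the end of the frame stream


def stage(work, inbox, outbox):
    """Apply `work` to each item from `inbox` and pass the result to `outbox`."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            return
        outbox.put(work(item))


# Hypothetical stage bodies; the real stages are not specified in the abstract.
def preprocess(frame):
    return ("pre", frame)      # e.g. CPU-side image preparation


def detect(item):
    return ("det", item[1])    # where the data-parallel GPU task would run


def postprocess(item):
    return ("post", item[1])   # e.g. CPU-side merging of detections


def feed(frames, outbox):
    """Source stage: push every frame, then the end-of-stream marker."""
    for frame in frames:
        outbox.put(frame)
    outbox.put(SENTINEL)


def run_pipeline(frames, depth=2):
    works = [preprocess, detect, postprocess]
    # One bounded queue between each pair of neighbouring stages.
    queues = [queue.Queue(maxsize=depth) for _ in range(len(works) + 1)]
    threads = [threading.Thread(target=feed, args=(frames, queues[0]))]
    for i, work in enumerate(works):
        threads.append(
            threading.Thread(target=stage, args=(work, queues[i], queues[i + 1]))
        )
    for t in threads:
        t.start()

    # The main thread drains the last queue, so no stage blocks forever.
    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(range(3)))
```

Because each stage is a single thread reading from a FIFO queue, frames leave the pipeline in the order they entered, while different frames occupy different stages at the same time.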
Keywords :
embedded systems; face recognition; graphics processing units; parallel architectures; CUDA; GPU-only implementations; LBP based algorithm; Tegra K1 platform; data parallelism; embedded domains; embedded heterogeneous platform; full HD images; mainstream platform; multicore CPU; quad-core Cortex-A15; real-time face detection; server domains; task parallelism; Acceleration; Detectors; Face; Face detection; Graphics processing units; Kernel; Real-time systems; CPU-GPU heterogeneous platform; Tegra K1; data-parallel; task-parallel
Conference_Title :
Multimedia and Expo (ICME), 2015 IEEE International Conference on
Conference_Location :
Turin
DOI :
10.1109/ICME.2015.7177522