DocumentCode
166698
Title
High-performance X-ray tomography reconstruction algorithm based on heterogeneous accelerated computing systems
Author
Serrano, Estefania ; Bermejo, Guzman ; Blas, Javier Garcia ; Carretero, Jesus
Author_Institution
Univ. Carlos III of Madrid, Leganes, Spain
fYear
2014
fDate
22-26 Sept. 2014
Firstpage
331
Lastpage
338
Abstract
Many medical image processing applications need high processing speed to achieve near real-time image reconstruction. As a result, massively parallel architectures based on accelerators, especially GPGPUs, have become very popular in the area. In this paper we present Mangoose++, an application that performs X-ray Computed Tomography (CT) reconstruction from medical images, based on a new implementation of the FDK algorithm. Mangoose++ has been designed and implemented to exploit the parallelism available on several hardware accelerator platforms, such as GPGPUs and Intel Xeon Phi accelerators. We describe the design and implementation of the application on three types of platforms (multi-core CPU, GPGPU, and Intel Xeon Phi) and the evaluation carried out to test the performance, resource utilization, and scalability of each platform. Moreover, to avoid hardware dependencies, we have also implemented the application using the OpenACC runtime to check portability and the overhead incurred when using runtimes. The evaluation results show that our solution is faster than recent related works and that, in terms of computation, the Intel Xeon Phi and CUDA-based GPU versions obtain similar results as the problem size increases. The evaluation also shows that OpenACC enhances programmability, because there is a single version of the source code. However, using OpenACC heavily affects the performance of Mangoose++, which is reduced by 50% when compared with the many-core versions, although the reduction is less drastic when compared with the CPU version.
Keywords
computerised tomography; graphics processing units; image reconstruction; medical image processing; multiprocessing systems; CT; CUDA-based GPU versions; FDK algorithm; GPGPU; Intel Xeon Phi; Mangoose++; OpenACC runtime; X-ray computed tomography; heterogeneous accelerated computing systems; high-performance X-ray tomography reconstruction algorithm; multicore CPU; portability checking; programmability; resource utilization; source code; Computed tomography; Graphics processing units; Image reconstruction; Instruction sets; Optimization; Parallel processing; Programming
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster Computing (CLUSTER), 2014 IEEE International Conference on
Conference_Location
Madrid
Type
conf
DOI
10.1109/CLUSTER.2014.6968781
Filename
6968781