Title :
Hybrid MPI-OpenMP Programming for Parallel OSEM PET Reconstruction
Author :
Jones, M.D. ; Yao, R. ; Bhole, C.P.
Author_Institution :
Center for Computational Res., State Univ. of New York, Buffalo, NY
Abstract :
To improve the parallel efficiency (PE) of the ordered-subsets expectation-maximization (OSEM) algorithm for three-dimensional (3-D) positron emission tomography (PET) image reconstruction, we focused on reducing the computational imbalance among parallel processes and the interprocess data exchange time, which were the dominant factors limiting PE when a large number of networked compute nodes was used. As clusters with multiple processors on each compute node have become increasingly common, we aimed to take advantage of the load-balancing mechanism and the inherently lower latency of shared-memory threads across processors within a single node. We therefore implemented the OSEM algorithm with a hybrid message passing interface (MPI) and OpenMP approach, built on a standard MPI implementation. The components contributing to the total reconstruction time were quantified for the hybrid technique and compared with those of the MPI-only implementation. The hybrid MPI-OpenMP technique achieved a consistent PE improvement of approximately 7% to 17% over the pure MPI approach on the same number of compute nodes. As clusters of larger shared-memory multiprocessor (SMP) machines continue to become more cost effective, we expect this hybrid MPI-OpenMP approach to be increasingly valuable.
Keywords :
application program interfaces; expectation-maximisation algorithm; image reconstruction; medical image processing; message passing; parallel processing; positron emission tomography; PET reconstruction; data exchange time; hybrid MPI-OpenMP programming; load-balancing mechanism; message passing interface; multiple processors; ordered-subsets expectation-maximization algorithm; parallel efficiency; parallel process; shared-memory multiprocessor machines; three-dimensional positron emission tomography; Clustering algorithms; Computer networks; Concurrent computing; Costs; Delay; Parallel programming; Yarn
Journal_Title :
Nuclear Science, IEEE Transactions on
DOI :
10.1109/TNS.2006.882295