DocumentCode :
2446762
Title :
Designing scalable many-core parallel algorithms for min graphs using CUDA
Author :
Tran, Quoc-Nam
Author_Institution :
Lamar Univ., Beaumont, TX, USA
fYear :
2010
fDate :
19-23 April 2010
Firstpage :
1
Lastpage :
8
Abstract :
Removing redundant edges on a large graph is a fundamental problem in many practical applications such as verification of real-time systems and network routing. In this paper, we present the designs of scalable and efficient parallel algorithms for multiple many-core GPU devices using CUDA. Our algorithms expose substantial fine-grained parallelism while maintaining minimal global communication. By using the global scope of the GPU´s global memory, coalescing the global memory reads and writes, and avoiding on-chip shared memory bank conflicts, we are able to achieve a large performance benefit with a speed-up of 2,500x on a desktop computer in comparison with a single core CPU program. We report our experiments on large graphs with up to 29 K vertices using multiple GPU devices.
Keywords :
computer graphic equipment; coprocessors; graph theory; multiprocessing systems; parallel algorithms; parallel architectures; CUDA; GPU global memory; compute unified device architecture; fine-grained parallelism; min graphs; minimal global communication; multiple many-core GPU devices; redundant edges; scalable many-core parallel algorithm; Algorithm design and analysis; Concurrent computing; Global communication; Parallel algorithms; Parallel processing; Performance gain; Read-write memory; Real time systems; Routing; Yarn; Multi-core Computing; Programming Techniques;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4244-6533-0
Type :
conf
DOI :
10.1109/IPDPSW.2010.5470712
Filename :
5470712
Link To Document :
بازگشت