DocumentCode
1999252
Title
High Performance GPU Accelerated Local Optimization in TSP
Author
Rocki, Kamil ; Suda, Ryutaro
Author_Institution
Dept. of Comput. Sci., Univ. of Tokyo, Tokyo, Japan
fYear
2013
fDate
20-24 May 2013
Firstpage
1788
Lastpage
1796
Abstract
This paper presents a high-performance GPU-accelerated implementation of the 2-opt local search algorithm for the Traveling Salesman Problem (TSP). Using a GPU significantly decreases the execution time needed for tour optimization; however, it also requires a complicated and well-tuned implementation. As the problem size grows, the time spent comparing graph edges during local optimization grows significantly. According to our results on instances from the TSPLIB library, the time needed to perform a simple local search operation can be decreased approximately 5 to 45 times compared to a corresponding parallel CPU implementation using 6 cores. The code has been implemented in both OpenCL and CUDA and tested on AMD and NVIDIA devices. The experimental studies show that the optimization algorithm using the GPU local search converges up to 300 times faster on average than the sequential CPU version, depending on the problem size. The main contributions of this paper are a problem division scheme that exploits data locality, allowing arbitrarily large problem instances to be solved on a GPU, and the parallel implementation of the algorithm itself.
Keywords
graphics processing units; optimisation; parallel processing; search problems; travelling salesman problems; 2-opt local search algorithm; AMD; CUDA; GPU local search; NVIDIA devices; OpenCL; TSP; TSPLIB library; data locality; high performance GPU accelerated implementation; high performance GPU accelerated local optimization; local optimization; parallel CPU code implementation; problem division scheme; simple local search operation; tour optimization; traveling salesman problem; Approximation algorithms; Arrays; Cities and towns; Graphics processing units; Heuristic algorithms; Instruction sets; Optimization; GPU Computing; Optimal Scheduling; Optimization; Parallel Architectures;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
Conference_Location
Cambridge, MA
Print_ISBN
978-0-7695-4979-8
Type
conf
DOI
10.1109/IPDPSW.2013.227
Filename
6651079