High Performance GPU Accelerated Local Optimization in TSP

Author

Rocki, Kamil; Suda, Ryutaro

Author_Institution

Dept. of Comput. Sci., Univ. of Tokyo, Tokyo, Japan

fYear

2013

fDate

20-24 May 2013

Firstpage

1788

Lastpage

1796

Abstract

This paper presents a high-performance GPU-accelerated implementation of the 2-opt local search algorithm for the Traveling Salesman Problem (TSP). Using a GPU significantly decreases the execution time needed for tour optimization, but it also requires a complicated and well-tuned implementation. As the problem size grows, the time spent on local optimization, which consists of comparing graph edges, grows significantly. According to our results, based on instances from the TSPLIB library, the time needed to perform a simple local search operation can be decreased by a factor of approximately 5 to 45 compared to a corresponding parallel CPU implementation using 6 cores. The code has been implemented in both OpenCL and CUDA and tested on AMD and NVIDIA devices. The experimental studies show that the optimization algorithm using the GPU local search converges up to 300 times faster on average than the sequential CPU version, depending on the problem size. The main contributions of this paper are a problem division scheme that exploits data locality and allows arbitrarily large problem instances to be solved on a GPU, and the parallel implementation of the algorithm itself.
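For readers unfamiliar with 2-opt, the following is a minimal sequential sketch in Python, not the authors' GPU code. It assumes Euclidean city coordinates and a best-improvement strategy; the names `tour_length`, `two_opt_pass`, and `two_opt` are hypothetical. Each (i, j) pair in the nested loop is evaluated independently of the others, which is presumably the part a GPU implementation would evaluate in parallel.

```python
import math


def tour_length(coords, tour):
    """Total length of the closed tour over Euclidean coordinates."""
    n = len(tour)
    return sum(math.dist(coords[tour[k]], coords[tour[(k + 1) % n]])
               for k in range(n))


def two_opt_pass(coords, tour):
    """Apply the single best improving 2-opt move in place.

    Returns True if a move was applied, False if the tour is 2-opt optimal.
    """
    n = len(tour)
    best_delta = -1e-12  # small tolerance so float noise is not an "improvement"
    best_move = None
    for i in range(n - 1):
        a, b = coords[tour[i]], coords[tour[i + 1]]
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # these two edges share a city; no valid exchange
            c, d = coords[tour[j]], coords[tour[(j + 1) % n]]
            # Change in length if edges (a,b) and (c,d) become (a,c) and (b,d).
            delta = (math.dist(a, c) + math.dist(b, d)
                     - math.dist(a, b) - math.dist(c, d))
            if delta < best_delta:
                best_delta, best_move = delta, (i, j)
    if best_move is None:
        return False
    i, j = best_move
    # Reversing the segment between the two edges performs the exchange.
    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
    return True


def two_opt(coords, tour):
    """Repeat best-improvement passes until no move shortens the tour."""
    tour = list(tour)
    while two_opt_pass(coords, tour):
        pass
    return tour
```

A short usage example: for the four corners of a unit square visited in a self-crossing order, `two_opt([(0, 0), (1, 0), (1, 1), (0, 1)], [0, 2, 1, 3])` removes the crossing and yields a tour along the square's perimeter.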

Keywords

graphics processing units; optimisation; parallel processing; search problems; travelling salesman problems; 2-opt local search algorithm; AMD; CUDA; GPU local search; NVIDIA devices; OpenCL; TSP; TSPLIB library; data locality; high performance GPU accelerated implementation; high performance GPU accelerated local optimization; local optimization; parallel CPU code implementation; problem division scheme; simple local search operation; tour optimization; traveling salesman problem; Approximation algorithms; Arrays; Cities and towns; Graphics processing units; Heuristic algorithms; Instruction sets; Optimization; GPU Computing; Optimal Scheduling; Optimization; Parallel Architectures

fLanguage

English

Publisher

IEEE

Conference_Titel

Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International

Conference_Location

Cambridge, MA

Print_ISBN

978-0-7695-4979-8

Type

conf

DOI

10.1109/IPDPSW.2013.227

Filename

6651079