Title :
NQueens on CUDA: Optimization Issues
Author :
Frank Feinbube;Bernhard Rabe;Martin von Löwis;Andreas Polze
Author_Institution :
Hasso Plattner Inst., Univ. of Potsdam, Potsdam, Germany
Abstract :
Today's commercial off-the-shelf computer systems are multicore computing systems that combine CPUs, graphics processors (GPUs), and custom devices. Compared with CPU cores, graphics cards can operate hundreds to thousands of compute units in parallel. To benefit from these GPU computing resources, applications have to be parallelized and adapted to the target architecture. In this paper we report our experience implementing a solution to the NQueens puzzle on GPUs using Nvidia's CUDA (Compute Unified Device Architecture) technology. Using memory usage and memory access as examples, we demonstrate that optimizations of CUDA programs may have contrary effects on different CUDA architectures. Our evaluation results show that using new programming languages or compilers is not sufficient to achieve the best results with emerging graphics card computing.
Keywords :
"Concurrent computing","Computer architecture","Distributed computing","Multicore processing","Computer graphics","Computer languages","Field programmable gate arrays","Computer applications","Application software","Program processors"
Conference_Title :
Parallel and Distributed Computing (ISPDC), 2010 Ninth International Symposium on
Print_ISBN :
978-1-4244-7602-2
DOI :
10.1109/ISPDC.2010.22