Author_Institution :
Dept. of Inf. Eng., Hiroshima Univ., Higashi-Hiroshima, Japan
Abstract :
The Conway´s Game of Life is the most well-known cellular automaton. The universe of the Game of Life is a 2-dimensional array of cells, each of which takes two possible states, alive or dead. The state of every cell is repeatedly updated according to those of eight neighbors. A cell will be alive if exactly three neighbors are alive, or if it is alive and two or three neighbors are alive. The main contribution of this paper is to develop several acceleration techniques for simulating the Game of Life. The key techniques for the simulation is to store a block of cells in registers of 32 threads in a warp of a CUDA block and to perform multiple-step simulation. We use a warp shuffle instruction, which allows us to exchange data stored in registers of threads in a warp, to transfer the current states stored in registers of other threads necessary to compute the next states. Further, since multiple-step simulation is performed, the number of CUDA kernel calls can be decreased. The experimental results show that, the best configuration of our GPU implementation can perform 1024-step simulation of 16384 × 16384 cells in 0.163 seconds on GeForce GTX TITAN X GPU. The best sequential algorithm using a single core of Intel Xeon X7460 CPU runs 58.3 seconds. Hence, our best GPU implementation has achieved a speed-up factor of 357 over the CPU implementation.