مرکز منطقه ای اطلاع رساني علوم و فناوري - A Time Optimal Parallel Algorithm for the Dynamic Programming on the Hierarchical Memory Machine

DocumentCode :

3571223

Title :

A Time Optimal Parallel Algorithm for the Dynamic Programming on the Hierarchical Memory Machine

Author :

Nakano, Koji

Author_Institution :

Dept. of Inf. Eng., Hiroshima Univ., Higashi-Hiroshima, Japan

fYear :

2014

Firstpage :

Lastpage :

Abstract :

The Hierarchical Memory Machine (HMM) is a theoretical parallel computing model that captures the essence of architecture of CUDA-enabled GPUs. The main contribution of this paper is to present an efficient implementation of the O (n3) time dynamic programming algorithm for solving the optimal triangulation problem for a convex n-gon in the HMM. Although the HMM can run a lot of threads in parallel, it is very hard to accelerate computation involving complicated memory access such as the dynamic programming for the optimal triangulation problem. It is often the case that the acceleration rate is limited to the bandwidth w of the global memory for problems involving complicated stride memory access. Quite surprisingly, our implementation of the dynamic programming algorithm for solving the optimal triangulation problem runs O (n³/w²) time units using max (wL, w²l) threads on the HMM with bandwidth w, global memory latency L and shared memory latency l. Hence, this parallel algorithm achieves the acceleration rate of more than w although the dynamic programming algorithm involves complicated stride memory access. Also, we prove that this parallel algorithm is time optimal when L = O (wl).

Keywords :

dynamic programming; graphics processing units; parallel algorithms; parallel architectures; shared memory systems; CUDA enabled GPU; HMM; global memory latency; hierarchical memory machine; optimal triangulation problem; parallel computing model; shared memory latency; time dynamic programming algorithm; time optimal parallel algorithm; Bandwidth; Dynamic programming; Graphics processing units; Heuristic algorithms; Hidden Markov models; Instruction sets; Optimized production technology; CUDA; Dynamic programming; GPU; memory machine models; parallel algorithms;

fLanguage :

English

Publisher :

ieee

Conference_Titel :

Computing and Networking (CANDAR), 2014 Second International Symposium on

Type :

conf

DOI :

10.1109/CANDAR.2014.14

Filename :

7052167

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3571223