DocumentCode :
1995632
Title :
The Hierarchical Memory Machine Model for GPUs
Author :
Nakano, Koji
Author_Institution :
Dept. of Inf. Eng., Hiroshima Univ., Higashi-Hiroshima, Japan
fYear :
2013
fDate :
20-24 May 2013
Firstpage :
591
Lastpage :
600
Abstract :
The Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM) are theoretical parallel computing models that capture the essence of the shared memory access and the global memory access of GPUs. The main contribution of this paper is to introduce the Hierarchical Memory Machine (HMM), which consists of multiple DMMs and a single UMM. The HMM is a more practical parallel computing model that reflects the architecture of current GPUs. We present several fundamental algorithms on the HMM. First, we show that the sum of n numbers can be computed in O(n/w + nl/p + l + log n) time units using p threads on the HMM with width w and latency l, and prove that this computing time is optimal. We also show that the direct convolution of m and m + n - 1 numbers can be done in O(n/w + mn/(dw) + nl/p + l + log m) time units using p threads on the HMM with d DMMs, width w, and latency l. Finally, we prove that our implementation of the direct convolution is time-optimal.
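For readers unfamiliar with the problem size quoted above, the following is a minimal sequential sketch of a direct convolution of m and m + n - 1 numbers that yields n outputs. It assumes the usual definition c[i] = a[0]*b[i] + a[1]*b[i+1] + ... + a[m-1]*b[i+m-1], which the abstract does not spell out; the function name and argument names are hypothetical. It is not the paper's HMM algorithm and does not model threads, width w, or latency l.

```python
def direct_convolution(a, b):
    """Sequential reference, assuming c[i] = sum over j of a[j] * b[i + j].

    a holds m numbers and b holds m + n - 1 numbers; the result holds n numbers.
    """
    m = len(a)
    n = len(b) - m + 1
    if m < 1 or n < 1:
        raise ValueError("need len(a) >= 1 and len(b) >= len(a)")
    # Each output is an inner product of a with a length-m window of b,
    # so the total work is m * n multiply-adds.
    return [sum(a[j] * b[i + j] for j in range(m)) for i in range(n)]


if __name__ == "__main__":
    # m = 2, n = 3, so b has m + n - 1 = 4 entries.
    print(direct_convolution([1, 2], [1, 1, 1, 1]))  # prints [3, 3, 3]
```

The m * n multiply-adds in this sketch are the work that the mn/(dw) term of the stated bound accounts for when it is spread over d DMMs of width w.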
Keywords :
computational complexity; graphics processing units; shared memory systems; DMM; GPU; HMM; UMM; computing time; direct convolution; discrete memory machine; global memory access; hierarchical memory machine model; parallel computing models; shared memory access; unified memory machine; Computational modeling; Computer architecture; Convolution; Hidden Markov models; Instruction sets; CUDA; memory machine models
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
Conference_Location :
Cambridge, MA
Print_ISBN :
978-0-7695-4979-8
Type :
conf
DOI :
10.1109/IPDPSW.2013.17
Filename :
6650935