Regional Information Center for Science and Technology - Design and Implementation for GPU-based seamless rate adaptive decoder

DocumentCode :

3574071

Title :

Design and Implementation for GPU-based seamless rate adaptive decoder

Author :

Lu Qiu ; Min Wang ; Jun Wu ; Zhifeng Zhang ; Xinlin Huang

Author_Institution :

Coll. of Electron. & Inf. Eng., Tongji Univ., Shanghai, China

Year :

2014

Firstpage :

236

Lastpage :

240

Abstract :

Recently, research on rate adaptation at the receiver has attracted widespread attention. Seamless rate adaptation (SRA) is one of the promising rate adaptation schemes for wireless communication systems. However, the high complexity of decoding hinders its application. The graphics processing unit (GPU) provides a low-cost and flexible software-based multi-core architecture for high-performance computing. This paper proposes a GPU design and implementation for the SRA decoder. First, we discuss the parallelism of the SRA decoding algorithm. To improve the throughput of the GPU-based SRA decoder, a massively parallel architecture consisting of N × L parallel threads is used. Taking full account of the GPU hardware architecture, we partition the computation into blocks and select an appropriate number of threads per block to further improve throughput. In addition, we propose an efficient memory-usage mechanism that takes full advantage of the shared memory within a block. Finally, we implement the SRA decoder on the Compute Unified Device Architecture (CUDA) platform. The GPU-based SRA decoder is flexible across different measurement matrices and achieves a 60x speedup compared with its single-threaded counterpart running on a central processing unit (CPU).

Keywords :

adaptive decoding; graphics processing units; parallel architectures; shared memory systems; CPU; CUDA; GPU design; GPU-based seamless rate adaptive decoder; SRA decoder; SRA decoding algorithm; central processing unit; compute unified device architecture; graphics processor unit; hardware architecture; massive parallel architecture; memory-usage mechanism; shared memory; Decoding; Field programmable gate arrays; Graphics processing units; Iterative decoding; Message systems; Partitioning algorithms; Throughput; CUDA; GPU; massive parallel computing; seamless rate adaptation;

Language :

English

Publisher :

IEEE

Conference_Title :

Communications and Networking in China (CHINACOM), 2014 9th International Conference on

Type :

conf

DOI :

10.1109/CHINACOM.2014.7054292

Filename :

7054292

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3574071