Title :
GPU Implementation of Orthogonal Matching Pursuit for Compressive Sensing
Author :
Fang, Yong ; Chen, Liang ; Wu, Jiaji ; Huang, Bormin
Author_Institution :
Coll. of Inf. Eng., Northwest A&F Univ., Yangling, China
Abstract :
Recovery algorithms play a key role in compressive sampling (CS). A popular recovery algorithm for CS is orthogonal matching pursuit (OMP), which combines low complexity with good recovery quality. Because OMP is dominated by large matrix/vector operations, it is well suited to parallel implementation on a graphics processing unit (GPU). In this paper, we first analyze the complexity of each module in OMP and show that its bottlenecks lie in the projection module and the least-squares module. To speed up the projection module, Fujimoto's matrix-vector multiplication algorithm is adopted. To speed up the least-squares module, the matrix-inverse-update algorithm is adopted. Experimental results show that our implementation of OMP on a GTX480 GPU achieves a speedup of more than 40x over an implementation on an Intel(R) Core(TM) i7 CPU. Since the projection module accounts for more than 2/3 of the total run time, we are looking for a faster matrix-vector multiplication algorithm.
Keywords :
compressed sensing; graphics processing units; iterative methods; least mean squares methods; matrix multiplication; vectors; GPU implementation; OMP; compressive sensing; graphics processing unit; least-squares module; matrix-inverse-update algorithm; matrix-vector multiplication algorithm; orthogonal matching pursuit; projection module; Complexity theory; Graphics processing unit; Instruction sets; Kernel; Matching pursuit algorithms; Registers; Vectors; compressive sampling; recovery algorithm
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2011 IEEE 17th International Conference on
Conference_Location :
Tainan
Print_ISBN :
978-1-4577-1875-5
DOI :
10.1109/ICPADS.2011.158
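
Illustrative sketch :
The two modules named in the abstract can be illustrated with a small CPU-only reference loop. This is only a sketch under stated assumptions, not the authors' GPU code: the function names `dot` and `omp` are hypothetical, the dictionary is passed as a list of columns, the columns are assumed to be linearly independent, and the least-squares step uses the standard block (Schur complement) update of the inverse Gram matrix, which may differ in detail from the matrix-inverse-update algorithm used in the paper. The projection step is the matrix-vector product that the abstract identifies as the dominant cost.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def omp(cols, y, k):
    """Hypothetical reference OMP.

    cols: list of n dictionary columns, each a list of m floats
    y:    measurement vector of length m
    k:    number of columns (atoms) to select
    Returns (support, coef): selected column indices and their coefficients.
    Assumes the selected columns are linearly independent (s != 0 below).
    """
    support = []         # indices of the selected columns
    g_inv = []           # inverse of the Gram matrix of the selected columns
    coef = []            # least-squares coefficients on the support
    residual = list(y)
    for _ in range(k):
        # Projection module: correlate every column with the residual
        # (a matrix-vector product A^T r) and pick the best unused column.
        corr = [abs(dot(c, residual)) for c in cols]
        j = max((i for i in range(len(cols)) if i not in support),
                key=lambda i: corr[i])
        a = cols[j]

        # Least-squares module, step 1: grow the inverse Gram matrix by one
        # row and column instead of re-inverting it from scratch.
        b = [dot(cols[i], a) for i in support]   # cross terms with new column
        u = [dot(row, b) for row in g_inv]       # u = G^{-1} b
        s = dot(a, a) - dot(b, u)                # Schur complement (scalar)
        size = len(u)
        g_inv = [[g_inv[p][q] + u[p] * u[q] / s for q in range(size)]
                 + [-u[p] / s] for p in range(size)]
        g_inv.append([-v / s for v in u] + [1.0 / s])
        support.append(j)

        # Least-squares module, step 2: coefficients and new residual.
        aty = [dot(cols[i], y) for i in support]
        coef = [dot(row, aty) for row in g_inv]
        residual = [y[t] - sum(coef[p] * cols[support[p]][t]
                               for p in range(len(support)))
                    for t in range(len(y))]
    return support, coef
```

A call such as `omp(cols, y, 2)` returns the two selected column indices together with their coefficients. On a GPU, the correlation loop and the rank-one update of `g_inv` are the natural candidates for parallel kernels, since both are dense matrix/vector operations.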