DocumentCode :
967244
Title :
Parallel and Pipelined Architectures for Cyclic Convolution by Block Circulant Formulation Using Low-Complexity Short-Length Algorithms
Author :
Meher, Pramod Kumar
Author_Institution :
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore
Volume :
18
Issue :
10
fYear :
2008
Firstpage :
1422
Lastpage :
1431
Abstract :
Fully pipelined parallel architectures are derived for high-throughput and reduced-hardware realization of prime-factor cyclic convolution using hardware-efficient modules for short-length rectangular transform (RT). Moreover, a new approach is proposed for the computation of block pseudocyclic convolution using a block cyclic convolution of equal length along with some correction terms, so that the block pseudocyclic representation of cyclic convolution of non-prime-factor length (N = rP, where r and P are not mutually prime) can be computed efficiently using the algorithms and architectures of short-length cyclic convolutions. Low-complexity algorithms are derived for efficient computation of these correction terms, and the overall complexities of the proposed technique are estimated for r = 2, 3, 4, 6, 8, and 9. The proposed algorithms are further used to design high-throughput and reduced-hardware structures for cyclic convolution where the cofactors are not relatively prime. The proposed structures for high-throughput implementation are found to offer a reduction of nearly 50%-75% in area-delay product over the existing structures for several convolution lengths. Low-complexity structures for the input/output addition units of short-length convolutions are derived and used along with high-throughput modules for hardware-efficient realization of multifactor convolution, which offers a nearly 25%-75% reduction in area-delay complexity over the existing structures for various non-prime-factor-length convolutions.
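The idea of computing a block pseudocyclic convolution as a block cyclic convolution plus correction terms can be illustrated numerically. The sketch below is only one plausible reading of the abstract, not the paper's algorithm or architecture: it assumes a stride-r polyphase split of a length N = rP sequence into r blocks of length P, and the names cyclic_conv and cyclic_conv_via_blocks are hypothetical. Block products whose phase indices wrap past r must be cyclically shifted by one position; writing that shift as "identity plus (shift minus identity)" gives a plain block cyclic sum and a correction term.

```python
import numpy as np

def cyclic_conv(a, b):
    """Direct length-n cyclic convolution, used here as the short-length module."""
    n = len(a)
    return np.array([sum(int(a[k]) * int(b[(m - k) % n]) for k in range(n))
                     for m in range(n)], dtype=np.int64)

def cyclic_conv_via_blocks(h, x, r):
    """Length N = r*P cyclic convolution from r*r length-P cyclic convolutions.

    Assumed decomposition (not taken from the paper): H[i][m] = h[r*m + i].
    """
    N = len(h)
    P = N // r
    H = [h[i::r] for i in range(r)]   # r polyphase blocks of length P
    X = [x[j::r] for j in range(r)]
    y = np.zeros(N, dtype=np.int64)
    for k in range(r):
        block_cyclic = np.zeros(P, dtype=np.int64)  # sum over i + j = k (mod r)
        wrapped = np.zeros(P, dtype=np.int64)       # sum over i + j = k + r only
        for i in range(r):
            j = (k - i) % r
            prod = cyclic_conv(H[i], X[j])
            block_cyclic += prod
            if i + j >= r:
                wrapped += prod
        # Correction term: wrapped products need a one-step cyclic shift,
        # so add (shifted - unshifted) on top of the block cyclic result.
        correction = np.roll(wrapped, 1) - wrapped
        y[k::r] = block_cyclic + correction
    return y
```

The identity holds whether or not r and P are mutually prime; the case the abstract targets is the one where they are not, for example N = 12 with r = 2 and P = 6.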
Keywords :
FIR filters; VLSI; computational complexity; discrete Fourier transforms; discrete cosine transforms; video signal processing; FIR filtering; VLSI; block circulant formulation; block pseudo-cyclic convolution; block pseudocyclic representation; discrete Fourier transform; discrete cosine transform; discrete sine transform; hardware-efficient modules; high-throughput implementation; low-complexity short-length algorithms; non-prime-factor-length; pipelined parallel architectures; prime-factor cyclic convolution; reduced-hardware realization; short-length rectangular transform; very large-scale integration; Computer architecture; Concurrent computing; Convolution; Discrete Fourier transforms; Discrete cosine transforms; Fast Fourier transforms; Finite impulse response filter; Hardware; Iterative algorithms; Very large scale integration; Block-cyclic convolution; cyclic convolution; pseudocyclic convolution; systolic array; very large-scale integration (VLSI);
fLanguage :
English
Journal_Title :
Circuits and Systems for Video Technology, IEEE Transactions on
Publisher :
IEEE
ISSN :
1051-8215
Type :
jour
DOI :
10.1109/TCSVT.2008.2004918
Filename :
4660373