DocumentCode :
3380781
Title :
A novel counter-based low complexity inner-product architecture for high speed inputs
Author :
Meher, Manas Ranjan ; Jong, Ching-Chuen ; Chang, Chip-Hong ; Low, Jeremy Yung Shern
Author_Institution :
Integrated Syst. Res. Lab., Nanyang Technol. Univ., Singapore, Singapore
fYear :
2010
fDate :
May 30 2010-June 2 2010
Firstpage :
705
Lastpage :
708
Abstract :
This paper presents a new methodology for multiplierless implementation of inner-product computation. The inner-product computation is decomposed to form an architecture that facilitates efficient serial accumulation of the 1's in the partial product matrix of each multiplication of a pair of elements from the input vectors. The 1's that appear at each partial product position are accumulated by a serial D flip-flop (DFF) based 1's counter. This accumulation stage reduces the column height of the partial product matrix by transforming L vertical bits into ⌊log₂ L⌋ + 1 horizontal bits at each coordinate of the partial product matrix. The counter outputs are then summed using carry-save addition based on Dadda's reduction algorithm, followed by a final carry-propagate addition to obtain the inner-product result. Because the counter can operate at a frequency of 2 GHz when implemented in a TSMC 0.18 μm CMOS process, the accumulation time is reduced significantly. With a simpler partial product reduction tree, the hardware complexity is substantially reduced while the throughput remains comparable to or better than that of many existing parallel inner-product computation architectures. Synthesis results of the proposed architecture for various inner-product lengths show that it has a lower area-delay product than many existing architectures.
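The counting scheme described in the abstract can be illustrated with a small behavioral model. The sketch below is not the authors' circuit: it assumes unsigned operands of a fixed bit width, and the function name `counter_based_inner_product` and its parameters are hypothetical. It keeps one 1's counter per partial-product coordinate (j, k), checks that each count fits in ⌊log₂ L⌋ + 1 bits, and then uses an ordinary weighted sum in place of the Dadda carry-save reduction and final carry-propagate adder.

```python
def counter_based_inner_product(a, b, width):
    """Behavioral model of an unsigned inner product built from 1's counters.

    Assumptions (not taken from the paper): operands are unsigned integers
    that fit in `width` bits, and the final weighted sum stands in for the
    carry-save reduction tree and carry-propagate adder.
    """
    if len(a) != len(b):
        raise ValueError("input vectors must have the same length")
    length = len(a)
    limit = 1 << width

    # counts[j][k] models the 1's counter at partial-product coordinate (j, k).
    counts = [[0] * width for _ in range(width)]

    # Serial accumulation: one pair of input elements is consumed per step.
    for x, y in zip(a, b):
        if not (0 <= x < limit and 0 <= y < limit):
            raise ValueError("operands must be unsigned and fit in `width` bits")
        for j in range(width):
            for k in range(width):
                # Partial-product bit: bit j of x AND bit k of y.
                counts[j][k] += (x >> j) & (y >> k) & 1

    # L vertical bits collapse into floor(log2 L) + 1 counter bits.
    counter_bits = length.bit_length()  # equals floor(log2 L) + 1 for L >= 1
    assert all(c < (1 << counter_bits) for row in counts for c in row)

    # Each count carries weight 2^(j + k); summing them gives the result.
    return sum(counts[j][k] << (j + k)
               for j in range(width) for k in range(width))


if __name__ == "__main__":
    a = [3, 5, 7]
    b = [2, 4, 6]
    print(counter_based_inner_product(a, b, width=4))
    print(sum(x * y for x, y in zip(a, b)))
```

Both printed values should agree, since counting the 1's at each coordinate and weighting the count by 2^(j + k) is just a reordering of the sum of all partial-product bits.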
Keywords :
CMOS digital integrated circuits; UHF integrated circuits; circuit complexity; computer architecture; counting circuits; flip-flops; 1's counter; Dadda reduction algorithm; TSMC CMOS process; counter-based low complexity inner-product architecture; frequency 2 GHz; hardware complexity; low area-delay-product; parallel inner-product computation architectures; partial product matrix; partial product reduction tree; serial D flip flop; serial accumulation; size 0.18 μm; Computer architecture; Concurrent computing; Counting circuits; Delay; Digital signal processing; Frequency; Hardware; Signal processing algorithms; Telecommunication computing; Throughput;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Circuits and Systems (ISCAS), Proceedings of 2010 IEEE International Symposium on
Conference_Location :
Paris
Print_ISBN :
978-1-4244-5308-5
Electronic_ISBN :
978-1-4244-5309-2
Type :
conf
DOI :
10.1109/ISCAS.2010.5537482
Filename :
5537482