Regional Information Center for Science and Technology - Algorithm and architecture for a high density, low power scalar product macrocell

DocumentCode :

927300

Title :

Algorithm and architecture for a high density, low power scalar product macrocell

Author :

Gu, J. ; Chang, C.-H. ; Yeo, K.-S.

Author_Institution :

Sch. of Electr. & Electron. Eng., Nanyang Technol. Univ., Singapore, Singapore

Volume :

151

Issue :

fYear :

2004

fDate :

3/19/2004

Firstpage :

161

Lastpage :

172

Abstract :

The authors present a design approach for an arithmetic macrocell that computes the scalar product of two vectors, an operation ubiquitous in communications and digital signal processing. The core of the proposed architecture is a fully combinational design containing a partial product generator, a partial product accumulator and a vector accumulator. The design addresses the competing optimisation goals of VLSI area, power dissipation and latency in the deep submicron regime. Compared with conventional merged arithmetic architectures, the proposed macrocell design represents a substantial improvement in the VLSI layout, with little area wastage, a high degree of regularity and good scalability for different vector lengths and operand widths. A theoretical analysis shows that a 16-bit scalar product multiplier for input vectors with 16 elements, compared with a traditionally designed architecture, achieves a saving of 38.6% in silicon area, an increase of up to 73% in area usage efficiency and a saving of 29.4% in interconnect delay. Post-layout simulations of the proposed circuit, based on a 0.18 μm CMOS process, show an average power dissipation of 64.96 mW and a latency of 6.92 ns at a standard supply voltage of 1.8 V, a superior performance for a single-cycle instruction in a high-speed, low-voltage 16-bit digital signal processor operating at 144 MHz. Shorter interconnects and more equalised interconnect delays substantially reduce the power dissipation and delay incurred by the interconnects. Post-layout simulation of the proposed circuit at supply voltages ranging from 0.7 to 3.3 V shows a significant power reduction of 6 to 13% over the pre-layout simulation results of the conventional design.
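
The three stages named in the abstract can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the authors' circuit: it assumes unsigned operands and a simple AND-array partial product scheme, and the function names (partial_products, scalar_product, fits_single_cycle) are hypothetical. The record does not say how the macrocell handles signed operands or which compressor structure it uses.

```python
from typing import List, Sequence


def partial_products(a: int, b: int, width: int = 16) -> List[int]:
    """Partial product generation for one unsigned a*b (assumed AND-array scheme).

    Row i is `a` shifted left by i when bit i of `b` is set, otherwise 0.
    """
    mask = (1 << width) - 1
    if not (0 <= a <= mask and 0 <= b <= mask):
        raise ValueError("operands must be unsigned values that fit in `width` bits")
    return [(a << i) if (b >> i) & 1 else 0 for i in range(width)]


def scalar_product(xs: Sequence[int], ys: Sequence[int], width: int = 16) -> int:
    """Merged-arithmetic style scalar product of two equal-length vectors.

    All partial product rows of all element pairs are collected first and
    summed once, instead of finishing each multiplication separately.
    In hardware the single sum would be a compressor tree plus a final adder.
    """
    if len(xs) != len(ys):
        raise ValueError("vectors must have the same length")
    rows: List[int] = []
    for x, y in zip(xs, ys):
        rows.extend(partial_products(x, y, width))
    return sum(rows)


def fits_single_cycle(latency_ns: float, clock_mhz: float) -> bool:
    """Check whether a combinational latency fits within one clock period."""
    period_ns = 1000.0 / clock_mhz  # 1 / MHz expressed in nanoseconds
    return latency_ns <= period_ns


if __name__ == "__main__":
    print(scalar_product([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6
    print(fits_single_cycle(6.92, 144.0))
```

With the figures quoted in the abstract, the clock period at 144 MHz is about 6.94 ns, so the reported 6.92 ns latency fits within one cycle.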

Keywords :

VLSI; circuit optimisation; circuit simulation; delay estimation; digital arithmetic; digital signal processing chips; integrated circuit design; integrated logic circuits; logic simulation; multiplying circuits; 16-bit scalar product multiplier; CMOS process; VLSI area; arithmetic macrocell design; combinational design; deep submicron regime; digital signal processing problems; interconnect delays; merged arithmetic architectures; optimisation goals; partial product accumulator; partial product generator; postlayout simulations; power dissipation; prelayout simulation; single cycle instruction; supply voltages; vector accumulator;

fLanguage :

English

Journal_Title :

Computers and Digital Techniques, IEE Proceedings -

Publisher :

IET

ISSN :

1350-2387

Type :

jour

DOI :

10.1049/ip-cdt:20040328

Filename :

1274033

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=927300