Regional Information Center for Science and Technology - ${\rm SPICE}^2$: Spatial Processors Interconnected for Concurrent Execution for Accelerating the SPICE Circuit Simulator Using an FPGA

DocumentCode :

1401105

Title :

${\rm SPICE}^2$: Spatial Processors Interconnected for Concurrent Execution for Accelerating the SPICE Circuit Simulator Using an FPGA

Author :

Kapre, Nachiket ; DeHon, André

Author_Institution :

Dept. of Electr. & Electron. Eng., Imperial Coll. London, London, UK

Volume :

Issue :

Year :

2012

Firstpage :

Lastpage :

Abstract :

Spatial processing of sparse, irregular, double-precision floating-point computation on a single field-programmable gate array (FPGA) enables up to an order of magnitude speedup (mean 2.8×) over a conventional microprocessor for the SPICE circuit simulator. We develop a parallel, FPGA-based, heterogeneous architecture customized for accelerating the SPICE simulator to deliver this speedup. To properly parallelize the complete simulator, we decompose SPICE into its three constituent phases (model evaluation, sparse matrix solve, and iteration control) and customize a spatial architecture for each phase independently. Our heterogeneous FPGA organization mixes very long instruction word (VLIW), dataflow, and streaming architectures into a cohesive, unified design to match the parallel patterns exposed by our programming framework. This FPGA architecture outperforms conventional processors due to a combination of factors, including high utilization of statically scheduled resources, low-overhead dataflow scheduling of fine-grained tasks, and streaming, overlapped processing of the control algorithms. We demonstrate that we can independently accelerate model evaluation by a mean factor of 6.5× (1.4-23×) across a range of nonlinear device models and matrix solve by 2.4× (0.6-13×) across various benchmark matrices, while delivering a mean combined speedup of 2.8× (0.2-11×) for the composite design when comparing a Xilinx Virtex-6 LX760 (40 nm) with an Intel Core i7 965 (45 nm). We also estimate mean energy savings of 8.9× (up to 40.9×) for the same comparison.
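
The three phases named in the abstract can be illustrated with a minimal software sketch of a SPICE-style DC operating-point solve. This is not the authors' FPGA architecture or code; it is a generic Newton-Raphson loop written for illustration. The circuit (a voltage source, two resistors, and a diode), all component values, and the function names `eval_diode`, `solve_2x2`, and `dc_operating_point` are assumptions introduced here. A 2×2 Cramer's-rule solve stands in for the sparse matrix solver a real simulator would use.

```python
import math

# Hypothetical circuit: VS -- R1 -- node1 -- R2 -- node2 -- diode -- ground
VS, R1, R2 = 5.0, 1000.0, 1000.0
I_SAT, V_T = 1e-14, 0.02585  # assumed diode saturation current and thermal voltage


def eval_diode(v):
    """Phase 1, model evaluation: diode current and conductance at voltage v."""
    e = math.exp(v / V_T)
    return I_SAT * (e - 1.0), I_SAT / V_T * e


def solve_2x2(a, b):
    """Phase 2, matrix solve: Cramer's rule as a stand-in for sparse LU."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return x0, x1


def dc_operating_point(tol=1e-9, max_iter=100, max_step=0.1):
    """Phase 3, iteration control: damped Newton-Raphson with a convergence test."""
    g1, g2 = 1.0 / R1, 1.0 / R2
    v_d = 0.0  # current guess for the diode voltage
    for iteration in range(1, max_iter + 1):
        i_d, g_d = eval_diode(v_d)
        # Linearize the diode around v_d: conductance g_d plus current source i_eq.
        i_eq = i_d - g_d * v_d
        a = [[g1 + g2, -g2], [-g2, g2 + g_d]]
        b = [VS * g1, -i_eq]
        v1, v2 = solve_2x2(a, b)
        delta = v2 - v_d
        if abs(delta) < tol:
            return v1, v2, iteration
        # Limit the voltage step so the exponential model stays well behaved.
        v_d += max(-max_step, min(max_step, delta))
    raise RuntimeError("Newton iteration did not converge")


if __name__ == "__main__":
    v1, v2, n = dc_operating_point()
    print(f"v1={v1:.6f} V, v2={v2:.6f} V after {n} iterations")
```

Calling `dc_operating_point()` returns the two node voltages and the iteration count. Each pass through the loop runs the phases in sequence, which is the structure the abstract describes accelerating with a separate spatial architecture per phase.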

Keywords :

SPICE; circuit simulation; field programmable gate arrays; floating point arithmetic; integrated circuit interconnections; microprocessor chips; sparse matrices; FPGA; Intel Core i7 965; SPICE circuit simulator acceleration; Xilinx Virtex-6 LX760; benchmark matrix; dataflow architecture; floating-point computation; heterogeneous architecture; iteration control phasing; low-overhead dataflow scheduling; mean energy saving estimation; microprocessor; model evaluation phasing; nonlinear device model; size 40 nm; size 45 nm; sparse matrix-solve phasing; spatial processor interconnected for concurrent execution; statically-scheduled resource utilization; streaming architecture; Computational modeling; Field programmable gate arrays; Integrated circuit modeling; Mathematical model; Runtime; SPICE; Sparse matrices; Parallelism; reconfigurable logic; simulation;

Language :

English

Journal_Title :

Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on

Publisher :

IEEE

ISSN :

0278-0070

Type :

jour

DOI :

10.1109/TCAD.2011.2173199

Filename :

6106733

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1401105