DocumentCode :
1655300
Title :
P2012: Building an ecosystem for a scalable, modular and high-efficiency embedded computing accelerator
Author :
Benini, Luca ; Flamand, Eric ; Fuin, Didier ; Melpignano, Diego
Author_Institution :
STMicroelectronics, Grenoble, France
fYear :
2012
Firstpage :
983
Lastpage :
987
Abstract :
P2012 is an area- and power-efficient many-core computing fabric based on multiple globally asynchronous, locally synchronous (GALS) clusters supporting aggressive fine-grained power, reliability and variability management. Clusters feature up to 16 processors and one control processor with independent instruction streams sharing a multi-banked L1 data memory, a multi-channel DMA engine, and specialized hardware for synchronization and scheduling. P2012 achieves extreme area and energy efficiency by supporting domain-specific acceleration at the processor and cluster level through the addition of dedicated HW IPs. P2012 can run standard OpenCL and OpenMP parallel codes as well as proprietary Native Programming Model (NPM) SW components that provide the highest level of control on application-to-resource mapping. In Q3 2011 the P2012 SW Development Kit (SDK) was made available to a community of R&D users; it includes full OpenCL and NPM development environments. The first P2012 SoC prototype in 28 nm CMOS will sample in Q4 2012, featuring four clusters and delivering 80 GOPS (with single-precision floating-point support) in 15.2 mm² with 2 W power consumption.
Keywords :
CMOS memory circuits; file organisation; integrated circuit reliability; parallel architectures; power aware computing; shared memory systems; synchronisation; system-on-chip; HW IPs; NPM development environments; OpenCL parallel codes; OpenMP parallel codes; P2012 SW development kit; P2012 SoC prototype; Q3 2011; Q4 2012; R&D users; SW components; aggressive fine-grained power management; application-to-resource mapping; area-efficient many-core computing fabric; control processor; domain-specific acceleration; energy efficiency; globally asynchronous locally synchronous clusters; high-efficiency embedded computing accelerator; independent instruction streams; modular embedded computing accelerator; multibanked L1 data memory sharing; multichannel DMA engine; native programming model; power consumption; power-efficient many-core computing fabric; processors scheduling; reliability management; size 28 nm; synchronization; variability management; Computer architecture; Fabrics; Hardware; Program processors; Programming; System-on-a-chip;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Design, Automation & Test in Europe Conference & Exhibition (DATE), 2012
Conference_Location :
Dresden
ISSN :
1530-1591
Print_ISBN :
978-1-4577-2145-8
Type :
conf
DOI :
10.1109/DATE.2012.6176639
Filename :
6176639