Title :
A 275mW heterogeneous multimedia processor for IC-stacking on Si-interposer
Author :
Kim, Hyo-Eun ; Yoon, Jae-Sung ; Hwang, Kyu-Dong ; Kim, Young-Jun ; Park, Jun-Seok ; Kim, Lee-Sup
Author_Institution :
KAIST, Daejeon, South Korea
Abstract :
Most data-intensive operations for multimedia applications such as image processing, vision, and 3D graphics require high external memory bandwidth. In augmented-reality (AR) processors [1], both 3D graphics and vision operations are required, so memory bandwidth becomes even more critical. In [1], however, memory bandwidth is not considered, floating-point processing is not supported, and there is no cache memory for texturing, which is a performance bottleneck of common graphics pipelines. In this work, a heterogeneous multimedia processor is presented to process various mobile multimedia applications in a single chip on Si-interposer for high memory bandwidth. The implemented processor has 4 key features: (1) A transceiver pool (TRx) that reconfigures the strength of its output drivers according to the channel loss for IC-stacking on Si-interposer, (2) A mode-configurable vector processing unit (MCVPU) for frame-level parallelism, (3) An energy-efficient unified filtering unit (UFU) with an adaptive block selection (ABS) algorithm for memory-access-efficient texturing, and (4) A unified shader (US) with floating-point scalar processing elements (SPE) and partial special function units (PSFU) to enhance graphics processing performance and quality. With these techniques, we achieve 1.7× frame rate and 8× memory bandwidth improvement in full AR operation.
Keywords :
augmented reality; cache storage; chip scale packaging; computer graphics; coprocessors; driver circuits; electronic engineering computing; floating point arithmetic; multimedia systems; pipeline processing; silicon; transceivers; vector processor systems; 3D graphics operations; 3D vision operations; ABS algorithm; AR processors; IC-stacking; MCVPU; PSFU; SPE; UFU; adaptive block selection algorithm; augmented-reality processors; cache memory; channel loss; data-intensive operations; energy-efficient unified filtering unit; external memory bandwidth; floating-point processing; floating-point scalar processing elements; frame level parallelism; graphics pipelines; graphics processing performance; graphics processing quality; heterogeneous multimedia processor; memory-access-efficient texturing; mobile multimedia applications; mode-configurable vector processing unit; output drivers; partial special function units; power 275 mW; silicon-interposer; transceiver pool; unified shader; Bandwidth; Energy efficiency; Filtering; Graphics; Multimedia communication; Pipeline processing; Three dimensional displays;
Conference_Title :
Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International
Conference_Location :
San Francisco, CA
Print_ISBN :
978-1-61284-303-2
DOI :
10.1109/ISSCC.2011.5746249