DocumentCode :
1377630
Title :
MRTP: Mobile Ray Tracing Processor With Reconfigurable Stream Multi-Processors for High Datapath Utilization
Author :
Kim, Hong-Yun ; Kim, Young-Jun ; Kim, Lee-Sup
Author_Institution :
Sch. of Electr. Eng. & Comput. Sci., Korea Adv. Inst. of Sci. & Technol. (KAIST), Daejeon, South Korea
Volume :
47
Issue :
2
fYear :
2012
Firstpage :
518
Lastpage :
535
Abstract :
This paper presents a mobile ray tracing processor (MRTP) with reconfigurable stream multi-processors (RSMPs) for high datapath utilization. The MRTP includes three RSMPs that operate asynchronously in multiple instruction multiple data (MIMD) mode to exploit instruction-level parallelism. Each RSMP is based on a single instruction multiple thread (SIMT) architecture to exploit thread-level parallelism. An RSMP consists of twelve scalar processing elements (SPEs) that run multiple threads synchronously in parallel: twelve scalar threads or four vector threads, depending on the operating mode. The low datapath utilization caused by branch divergence in the SIMT architecture is improved by 19.9% on average by reconfiguring the twelve SPEs between scalar SIMT and vector SIMT, with 0.1% area overhead. Special function instructions account for only 2% to 8% of kernel instructions, so a partial special function unit (PSFU) is implemented instead of a large dedicated SFU. Access conflicts at the look-up table (LUT) caused by concurrent accesses from the twelve SPEs are reduced by a table loader (TBLD). The TBLD monitors concurrent requests from the twelve SPEs and reduces the number of LUT accesses by distributing a coefficient to multiple SPEs with only one read access to the LUT. The MRTP, with an area of 4 × 4 mm², has been fabricated in 0.13 μm CMOS technology. It achieves a peak performance of 673 K rays per second while consuming 156 mW at 100 MHz with VDD = 1.2 V.
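The table loader idea in the abstract (one LUT read serving every SPE that asks for the same coefficient) can be illustrated with a small software model. This is only a sketch under stated assumptions: the abstract does not describe the hardware mechanism, so the function name `coalesce_lut_requests`, the request encoding (`None` for an idle SPE), and the example table are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

NUM_SPES = 12  # from the abstract: twelve SPEs per RSMP


def coalesce_lut_requests(requests, lut):
    """Software model of request coalescing (hypothetical, not the paper's design).

    requests: one entry per SPE, either a LUT index or None if the SPE is idle.
    lut:      sequence of coefficients.
    Returns (per-SPE coefficients, number of LUT reads performed).
    """
    # Group the requesting SPEs by the LUT index they ask for.
    spes_by_index = defaultdict(list)
    for spe, index in enumerate(requests):
        if index is not None:
            spes_by_index[index].append(spe)

    results = [None] * len(requests)
    reads = 0
    for index, spes in spes_by_index.items():
        coefficient = lut[index]  # a single read of this LUT entry
        reads += 1
        for spe in spes:          # distribute it to every requester
            results[spe] = coefficient
    return results, reads


# Assumed example data: an 8-entry table and one cycle of requests.
lut = [i * 10 for i in range(8)]
requests = [3, 3, 3, 5, None, 5, 3, 0, 0, None, 3, 5]
results, reads = coalesce_lut_requests(requests, lut)
naive_reads = sum(r is not None for r in requests)
print(f"LUT reads: {reads} coalesced vs {naive_reads} uncoalesced")
```

In this model the number of reads equals the number of distinct indices requested in a cycle, so the saving depends entirely on how often the SPEs ask for the same coefficient.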
Keywords :
CMOS digital integrated circuits; graphics processing units; multi-threading; multiprocessing systems; ray tracing; reconfigurable architectures; table lookup; CMOS technology; LUT; MIMD mode; MRTP; PSFU; RSMP; SIMT architecture; TBLD; high datapath utilization; instruction-level parallelism; lookup table; mobile ray tracing processor; multiple instruction multiple data; multiple threads; partial special function unit; power 156 mW; reconfigurable stream multiprocessors; scalar processing elements; single instruction multiple thread architecture; size 0.13 μm; table loader; thread-level parallelism; voltage 1.2 V; Computer architecture; Instruction sets; Kernel; Parallel processing; Ray tracing; Rendering (computer graphics); Three dimensional displays; 3D graphics; Many-core system; SIMD; mobile processor; ray tracing
fLanguage :
English
Journal_Title :
Solid-State Circuits, IEEE Journal of
Publisher :
IEEE
ISSN :
0018-9200
Type :
jour
DOI :
10.1109/JSSC.2011.2171417
Filename :
6082416