DocumentCode
2475162
Title
Exploring hardware transaction processing for reliable computing in chip-multiprocessors against soft errors
Author
Zheng, Chuanlei ; Shukla, Parijat ; Wang, Shuai ; Hu, Jie
Author_Institution
Dept. of Comput. Sci. & Technol., Nanjing Univ., Nanjing, China
fYear
2012
fDate
3-5 Oct. 2012
Firstpage
92
Lastpage
97
Abstract
As transistor feature sizes shrink and nodal capacitance and supply voltage drop with each new technology generation, microprocessors are becoming more vulnerable to single-event upsets and transients, also known as soft errors. Chip-multiprocessor (CMP) architectures are now used in mainstream microprocessors and the number of on-chip processor cores keeps increasing, but the system-level reliability of chip-multiprocessors degrades in inverse proportion to the number of cores. In this work, we propose to exploit the abundant on-chip processor cores for redundant hardware transaction processing, which provides native support for error detection and recovery against soft errors in transactional chip-multiprocessors (TxCMPs). The proposed transactional processor cores execute everything as transactions, and TxCMPs execute redundant transactions on different cores. To reduce the performance overhead of transaction commits, we further propose two architectural optimizations, namely early partial commit packet transmission and speculative transaction execution in reliable computing mode. Our experimental evaluation confirms the effectiveness of the optimized TxCMPs in achieving low-cost reliable computing against soft errors.
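The redundant-transaction idea in the abstract can be illustrated with a small software sketch. This is not the authors' hardware design: the abstract does not say how results are compared or how recovery is triggered, so the commit-time comparison of write sets, the re-execution on mismatch, and every name below (`run_redundant`, `faults`, `increment`) are assumptions made for illustration only.

```python
import copy


def run_redundant(transaction, state, faults=(), max_retries=3):
    """Run `transaction` on two simulated cores and commit only if they agree.

    `transaction` takes a private copy of `state` and returns a dict of
    integer writes. `faults` is a hypothetical fault-injection hook: a set of
    (attempt, core) pairs whose write set gets one bit flipped.
    Returns the number of re-executions that were needed.
    """
    for attempt in range(max_retries + 1):
        write_sets = []
        for core in (0, 1):
            local = copy.deepcopy(state)       # each core works on a private copy
            writes = transaction(local)
            if (attempt, core) in faults:      # simulate a soft error on this core
                key = next(iter(writes))
                writes[key] ^= 1               # flip the lowest bit of one value
            write_sets.append(writes)
        if write_sets[0] == write_sets[1]:     # detection: compare before commit
            state.update(write_sets[0])        # commit the agreed write set
            return attempt
        # Mismatch: discard both write sets and re-execute (recovery).
    raise RuntimeError("redundant copies never agreed")


def increment(local):
    """Example transaction: add one to a counter."""
    return {"counter": local["counter"] + 1}


state = {"counter": 0}
retries = run_redundant(increment, state, faults={(0, 1)})
print(retries, state)
```

In this sketch the shared state is never modified until both copies agree, so a corrupted write set is simply dropped and the transaction is run again from the last committed state.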
Keywords
computer architecture; fault tolerant computing; multiprocessing systems; system-on-chip; transaction processing; CMP; TxCMP; architectural optimization; chip-multiprocessor architecture; computing reliability; error detection; error recovery; nodal capacitance; on-chip processor cores; partial commit packet transmission; redundant hardware transaction processing; single-event transients; single-event upsets; soft errors; speculative transaction execution; supply voltage; system-level reliability; transistor feature size; Benchmark testing; Buffer storage; Computer architecture; Fault tolerance; Hardware; Registers; Hardware Transaction Processing; Reliable Computing; Soft Error; Transactional Processor;
fLanguage
English
Publisher
IEEE
Conference_Titel
Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT), 2012 IEEE International Symposium on
Conference_Location
Austin, TX
Print_ISBN
978-1-4673-3043-5
Type
conf
DOI
10.1109/DFT.2012.6378206
Filename
6378206