DocumentCode :
3118520
Title :
Experimental evaluation of GPUs radiation sensitivity and algorithm-based fault tolerance efficiency
Author :
Rech, P. ; Carro, Luigi
Author_Institution :
Univ. Fed. do Rio Grande do Sul, Porto Alegre, Brazil
fYear :
2013
fDate :
8-10 July 2013
Firstpage :
244
Lastpage :
247
Abstract :
Experimental results demonstrate that Graphics Processing Units are highly prone to corruption by neutrons. We have performed several experimental campaigns at ISIS, UK, and at LANSCE, Los Alamos, NM, USA, assessing the sensitivity of the GPU internal resources as well as the error rate of common parallel algorithms. The experiments highlight output error patterns and radiation responses that can be used to design optimized Algorithm-Based Fault Tolerance strategies and to provide pragmatic programming guidelines that increase code reliability with low computational overhead.
Keywords :
computational complexity; error correction codes; fault tolerant computing; graphics processing units; parallel algorithms; performance evaluation; radiation effects; GPU internal resource sensitivity; GPU radiation sensitivity; ISIS UK; LANSCE Los Alamos NM USA; algorithm-based fault tolerance efficiency; code reliability; computational overhead; design optimized algorithm-based fault tolerance strategies; error rate; experimental evaluation; graphic processing units; output error patterns; parallel algorithms; radiation responses; Error correction codes; Graphics processing units; Instruction sets; Neutrons; Parallel processing; Reliability; Sensitivity; GPU; multiple errors; neutron sensitivity; software-based hardening;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
On-Line Testing Symposium (IOLTS), 2013 IEEE 19th International
Conference_Location :
Chania
Type :
conf
DOI :
10.1109/IOLTS.2013.6604091
Filename :
6604091