• DocumentCode
    31017
  • Title

    Modern GPUs Radiation Sensitivity Evaluation and Mitigation Through Duplication With Comparison

  • Author

    Oliveira, Daniel A. G. ; Rech, P. ; Quinn, Heather M. ; Fairbanks, Thomas D. ; Monroe, Laura ; Michalak, Sarah E. ; Anderson-Cook, Christine ; Navaux, Philippe Olivier Alexandre ; Carro, Luigi

  • Author_Institution
    Inst. de Inf., Fed. Univ. of Rio Grande do Sul (UFRGS), Porto Alegre, Brazil
  • Volume
    61
  • Issue
    6
  • fYear
    2014
  • fDate
    Dec. 2014
  • Firstpage
    3115
  • Lastpage
    3122
  • Abstract
    Graphics processing units (GPUs) are increasingly common in both safety-critical and high-performance computing (HPC) applications. Some current supercomputers are composed of thousands of GPUs, so the probability of device corruption becomes very high. Moreover, the GPU's parallel capabilities are very attractive for the automotive and aerospace markets, where reliability is a serious concern. In this paper, the neutron sensitivity of modern GPU caches and internal resources is experimentally evaluated. Various Duplication With Comparison strategies to reduce GPU radiation sensitivity are then presented and validated through radiation experiments. Threads should be carefully duplicated to avoid undesired errors on shared resources and to avoid the exacerbation of errors in critical resources such as the scheduler.
  • Keywords
    fault tolerance; graphics processing units; radiation hardening (electronics); GPU; critical resources; high performance computing applications; neutron sensitivity; radiation sensitivity evaluation; safety critical applications; Neutrons; Parallel processing; Radiation effects; Reliability; Sensitivity; graphics processing unit (GPU); parallel processors
  • fLanguage
    English
  • Journal_Title
    Nuclear Science, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9499
  • Type

    jour

  • DOI
    10.1109/TNS.2014.2362014
  • Filename
    6949170