Title :
Detailed radiation fault modeling of the Remote Exploration and Experimentation (REE) first generation testbed architecture
Author :
Beahan, John ; Edmonds, Larry ; Ferraro, Robert D. ; Johnston, Allan ; Katz, Daniel S. ; Some, Raphael R.
Author_Institution :
Jet Propulsion Lab., California Inst. of Technol., Pasadena, CA, USA
Abstract :
The goal of the NASA HPCC Remote Exploration and Experimentation (REE) Project is to transfer commercial supercomputing technology into space. The project will use state-of-the-art, low-power, non-radiation-hardened COTS hardware chips and COTS software to the maximum extent possible, and will rely on software-implemented fault tolerance to provide the required levels of availability and reliability. We outline the methodology used to develop a detailed radiation fault model for the REE Testbed architecture. The model addresses the effects of energetic protons and heavy ions, which cause single event upset and single event multiple upset events in digital logic devices and which are expected to be the primary fault generation mechanism. Unlike previous modeling efforts, this model will address fault rates and types in computer subsystems at a sufficiently fine level of granularity (i.e., the register level) that specific software and operational errors can be derived. We present the current state of the model, model verification activities and results to date, and plans for the future. Finally, we explain the methodology by which this model will be used to derive application-level error effects sets. These error effects sets will be used in conjunction with our Testbed fault injection capabilities and our applications' mission scenarios to replicate the predicted fault environment on our suite of onboard applications.
Keywords :
aerospace computing; fault simulation; radiation hardening (electronics); software architecture; software fault tolerance; software tools; space vehicle electronics; COTS hardware chips; COTS software; Remote Exploration and Experimentation Project; SEU events; application-level error effects; commercial supercomputing technology; computer subsystems; digital logic devices; energetic proton effects; fault generation mechanism; fault injection capabilities; first generation testbed architecture; heavy ion effects; model verification; onboard applications; radiation fault modeling; register level; single event multiple upset events; software-implemented fault tolerance; Availability; Computer architecture; Fault tolerance; Hardware; Logic devices; NASA; Protons; Single event upset; Space technology; Testing;
Conference_Title :
Aerospace Conference Proceedings, 2000 IEEE
Conference_Location :
Big Sky, MT
Print_ISBN :
0-7803-5846-5
DOI :
10.1109/AERO.2000.878499