DocumentCode :
2191016
Title :
Introspection-Based Fault Tolerance for COTS-Based High-Capability Computation in Space
Author :
James, Mark L. ; Shapiro, Andrew A. ; Springer, Paul L. ; Zima, Hans P.
Author_Institution :
Jet Propulsion Lab., California Inst. of Technol., Pasadena, CA, USA
fYear :
2008
fDate :
21-23 Jan. 2008
Firstpage :
74
Lastpage :
83
Abstract :
Future missions of deep space exploration face the challenge of designing, building, and operating progressively more capable autonomous spacecraft and planetary rovers. Given the communication latencies and bandwidth limitations for such missions, the need for increased autonomy becomes mandatory, along with the requirement for enhanced on-board computational capabilities while in deep space or time-critical situations. This will result in dramatic changes in the way missions will be conducted and supported by on-board computing systems. Specifically, the traditional approach of relying exclusively on radiation-hardened hardware and modular redundancy will not be able to deliver the required computational power. As a consequence, such systems are expected to include high-capability low-power components based on emerging Commercial-Off-The-Shelf (COTS) multi-core technology. This paper describes the design of a generic framework for introspection that supports runtime monitoring and analysis of program execution as well as a feedback-oriented recovery from faults. One of the first applications of this framework will be to provide flexible software fault tolerance matched to the requirements and properties of applications by exploiting knowledge that is either contained in an application knowledge base, provided by users, or automatically derived from specifications. A prototype implementation is currently in progress at the Jet Propulsion Laboratory, California Institute of Technology, targeting a cluster of Cell Broadband Engines.
Keywords :
fault tolerance; multiprocessing systems; software packages; space vehicles; COTS; autonomous spacecraft; commercial-off-the-shelf multi-core technology; deep space exploration; feedback-oriented recovery; flexible software fault tolerance; high-capability computation; introspection-based fault tolerance; planetary rovers; runtime monitoring; Application software; Bandwidth; Delay; Fault tolerance; Hardware; Redundancy; Space exploration; Space missions; Space vehicles; Time factors; fault tolerance; high-performance computing; introspection; space-borne computing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA), 2008 International Workshop on
Conference_Location :
Hilo, HI
ISSN :
1537-3223
Print_ISBN :
978-1-4244-6465-4
Electronic_ISBN :
1537-3223
Type :
conf
DOI :
10.1109/IWIA.2008.11
Filename :
5453556