Title :
CSHFt: A Composite Fault-Tolerant Architecture and Self-Adaptable Hierarchical Fault-Tolerant Strategy for Satellite System
Author :
Zhou, Hao ; Jiang, Jingfei
Author_Institution :
Coll. of Comput. Sci., Nat. Univ. of Defense Technol., Changsha, China
Abstract :
Building parallel systems from high-performance commercial off-the-shelf (COTS) chips has become the main way to greatly improve satellite system performance. System reliability is a difficult issue that requires a more effective fault-tolerant scheme. A centralized fault-tolerant scheme carries the risk of a single point of failure (SPOF), while a distributed fault-tolerant scheme is considerably more complex and introduces large overhead. Both traditional methods therefore have drawbacks. This paper proposes a composite self-adaptable hierarchical fault-tolerant (CSHFt) scheme that integrates and extends the ideas of the centralized and distributed fault-tolerant methods. It constructs a composite, symmetrical system architecture that supports both fault-tolerant methods simultaneously and realizes the self-adaptable hierarchical fault-tolerant strategy. The CSHFt scheme performs system fault tolerance in the order "first centralized, then distributed". The system actively switches its fault-tolerant mode according to its operating history. By combining and complementing the two traditional methods, the CSHFt scheme eliminates the risk of SPOF, reduces the overall complexity and overhead of system fault tolerance, and improves the system's real-time behavior. System failure occurs only when all system nodes have failed, which makes the system highly reliable. We verify the CSHFt scheme in practice on a prototype system and analyze and evaluate its performance.
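The "first centralized, then distributed" order described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Cluster` class, the `recover` method, the mode names, and the lowest-id takeover rule are all hypothetical, and the history-based mode-switching policy is only recorded here, not modelled, because the abstract does not specify it.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of a symmetrical node group (hypothetical, for illustration)."""
    alive: dict          # node id -> True while the node is still working
    master: int          # node currently acting as the centralized manager
    history: list = field(default_factory=list)  # modes used by past recoveries

    def recover(self, failed: int) -> str:
        """Handle one node failure and return the mode that handled it."""
        self.alive[failed] = False
        survivors = [n for n, ok in self.alive.items() if ok]
        if not survivors:
            # Assumption from the abstract: the system fails only when
            # every node has failed.
            mode = "system_failure"
        elif self.alive[self.master]:
            # First level: the surviving master handles the fault centrally.
            mode = "centralized"
        else:
            # Second level: the master itself failed, so the survivors agree
            # on a replacement. Picking the lowest id is an assumed stand-in
            # for whatever distributed procedure the paper uses.
            self.master = min(survivors)
            mode = "distributed"
        self.history.append(mode)
        return mode
```

With three nodes and node 0 as master, losing node 2 is handled centrally, losing node 0 triggers the distributed fallback that promotes node 1, and losing node 1 leaves no survivors, which is the only case reported as a system failure.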
Keywords :
aerospace computing; aircraft maintenance; artificial satellites; fault tolerant computing; parallel architectures; performance evaluation; CSHFt; CSHFt scheme; SPOF; centralized fault tolerant method; composite fault tolerant architecture; composite system architecture; distributed fault tolerant method; high performance commercial off the shelf chips; parallel system; satellite system performance; self-adaptable hierarchical fault tolerant strategy; single point of failure; symmetrical system architecture; system reliability; Complexity theory; Computer architecture; Fault tolerance; Fault tolerant systems; Nominations and elections; Satellites; composite architecture; fault-tolerant architecture; fault-tolerant strategy; hierarchical strategy; satellite system; self-adaptable;
Conference_Title :
Distributed Computing and Applications to Business, Engineering and Science (DCABES), 2011 Tenth International Symposium on
Conference_Location :
Wuxi
Print_ISBN :
978-1-4577-0327-0
DOI :
10.1109/DCABES.2011.35