DocumentCode :
2088460
Title :
Research on Equal Symmetric Distributed Fault-tolerant Architecture and Strategy for Parallel Satellite System
Author :
Zhou, Hao ; Jiang, Jingfei
Author_Institution :
Coll. of Comput. Sci., Nat. Univ. of Defense Technol., Changsha, China
fYear :
2011
fDate :
24-26 Aug. 2011
Firstpage :
121
Lastpage :
126
Abstract :
Building parallel systems from commercial off-the-shelf (COTS) components has become the main way to substantially improve satellite system performance. Because the reliability of such a system depends on its structure rather than on its devices, the fault-tolerant architecture and strategy become increasingly important. Centralized fault-tolerant solutions have the inherent drawback of a single point of failure (SPOF) and therefore cannot guarantee system reliability well. Distributed fault-tolerant methods, because of their complexity and large overhead, are not yet in mature use. In this paper, we propose a new distributed fault-tolerant scheme with high reliability and flexibility, built on an equal symmetric distributed fault-tolerant (ESDFt) architecture. Based on this architecture, we design the functional module framework of the system nodes and an autonomous fault-tolerant strategy. The ESDFt scheme creates favorable conditions for flexible implementation of system fault tolerance. It uses a master election mechanism that eliminates the risk of SPOF and chooses the master node intelligently based on the user's configuration, making fault-tolerance complexity and processing overhead flexibly adjustable. The ESDFt scheme has been verified and evaluated on a prototype system. The experimental results show that the scheme not only ensures system reliability but also enhances system flexibility and practicality.
Keywords :
artificial satellites; fault tolerant computing; parallel processing; program verification; ESDFt scheme; commercial off-the-shelf components; equal symmetric distributed fault tolerant architecture; master election mechanism; master node; parallel satellite system performance; single point of failure; system reliability; Computer architecture; Fault tolerance; Fault tolerant systems; Monitoring; Nominations and elections; Satellites;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Science and Engineering (CSE), 2011 IEEE 14th International Conference on
Conference_Location :
Dalian, Liaoning
Print_ISBN :
978-1-4577-0974-6
Type :
conf
DOI :
10.1109/CSE.2011.33
Filename :
6062862