DocumentCode
1816136
Title
A Resource Management System for Fault Tolerance in Grid Computing
Author
Lee, HwaMin ; Park, DooSoon ; Hong, Min ; Yeo, Sang-Soo ; Kim, SooKyun ; Kim, SungHoon
Author_Institution
Div. of Comput. Sci. & Eng., Soonchunhyang Univ., Asan, South Korea
Volume
2
fYear
2009
fDate
29-31 Aug. 2009
Firstpage
609
Lastpage
614
Abstract
In grid computing, resource management and fault tolerance services are important issues. The availability of the resources selected for job execution is a primary factor in determining computing performance. Resource failures occur more often in grid computing than in traditional parallel computing. Because a resource failure can be fatal to job execution, a fault tolerance service is essential in computational grids. Grid services are also often expected to meet minimum levels of quality of service (QoS) for desirable operation. However, the Globus toolkit does not provide a fault tolerance service that supports fault detection and fault management and satisfies QoS requirements. This paper therefore proposes a fault tolerance service that satisfies QoS requirements in computational grids. To provide this service, we expand the definition of failure to cover cases such as process failure, processor failure, and network failure. We propose a resource scheduling service, a fault detection service, and a fault management service, and present implementation and experimental results.
Keywords
fault tolerant computing; grid computing; middleware; quality of service; software development management; Globus toolkit; fault detection service; fault management service; fault tolerance service; grid computing; quality of service; resource management system; resource scheduling service; Computerized monitoring; Condition monitoring; Fault detection; Fault tolerance; Fault tolerant systems; Grid computing; Memory management; Middleware; Quality of service; Resource management;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Science and Engineering, 2009. CSE '09. International Conference on
Conference_Location
Vancouver, BC
Print_ISBN
978-1-4244-5334-4
Electronic_ISBN
978-0-7695-3823-5
Type
conf
DOI
10.1109/CSE.2009.257
Filename
5283868