DocumentCode :
3384532
Title :
Component Based Proactive Fault Tolerant Scheduling in Computational Grid
Author :
Haider, Sajjad ; Imran, Muhammad ; Niaz, Iftikhar Azim ; Ullah, Saeed ; Ansari, M.A.
Author_Institution :
Inf. Technol. Dept., Nat. Univ. of Modern Languages, Islamabad
fYear :
2007
fDate :
12-13 Nov. 2007
Firstpage :
119
Lastpage :
124
Abstract :
Computational grids can serve as the main execution platform for high-performance distributed applications. Grid resources have heterogeneous architectures, are geographically distributed, and are interconnected via unreliable network media, which makes them extremely complex and prone to different kinds of errors, failures and faults. The grid is a layered architecture, and most fault-tolerance techniques developed for grids follow its strict layering approach. In this paper, we propose a cross-layer design for handling faults proactively. In a cross-layer design, the top-down and bottom-up approach is not strictly followed, and a middle layer can communicate with the layer below or above it [1]. At each grid layer, a monitoring component decides, on the basis of predefined factors, whether the reliability of that layer is high, medium or low. Based on the Hardware Reliability Rating (HRR) and the Software Reliability Rating (SRR), the Middleware Monitoring Component / Cross-Layered Component (MMC/CLC) generates a Combined Rating (CR) using CR calculation matrix rules. Each participating grid node has a CR value generated through cross-layered communication using HMC, MMC/CLC and SMC. The CR information of all grid nodes is kept in a CR table, and high-rated machines are selected for job execution on the basis of minimum CPU load, along with different intensities of checkpointing. Handling faults proactively at each grid layer using the cross-communication model would result in improved overall dependability and increased performance with less checkpointing overhead.
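The selection step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the CR calculation matrix rules, so the rule used here (the combined rating equals the weaker of HRR and SRR) is purely an assumption for illustration, as are the names `Node`, `combined_rating`, `build_cr_table` and `select_node`; none of them come from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Ordered reliability levels, as named in the abstract.
LEVELS = {"low": 0, "medium": 1, "high": 2}
NAMES = {rank: name for name, rank in LEVELS.items()}


def combined_rating(hrr: str, srr: str) -> str:
    """Combine hardware and software ratings into a CR.

    ASSUMPTION: the paper's actual matrix rules are not in the abstract;
    this stand-in simply takes the weaker of the two ratings.
    """
    return NAMES[min(LEVELS[hrr], LEVELS[srr])]


@dataclass
class Node:
    """Hypothetical record for one grid node."""
    name: str
    hrr: str          # Hardware Reliability Rating
    srr: str          # Software Reliability Rating
    cpu_load: float   # current CPU load, 0.0 to 1.0


def build_cr_table(nodes: List[Node]) -> Dict[str, str]:
    """Map each node name to its Combined Rating."""
    return {n.name: combined_rating(n.hrr, n.srr) for n in nodes}


def select_node(nodes: List[Node]) -> Optional[Node]:
    """Pick the high-rated node with the minimum CPU load, if any."""
    table = build_cr_table(nodes)
    candidates = [n for n in nodes if table[n.name] == "high"]
    if not candidates:
        return None
    return min(candidates, key=lambda n: n.cpu_load)
```

Under these assumptions, a lightly loaded node with a low software rating is skipped in favour of a busier node rated high on both layers. How the checkpointing intensity varies with CR is left out of the sketch because the abstract does not specify it.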
Keywords :
checkpointing; error handling; fault tolerant computing; grid computing; middleware; object-oriented programming; scheduling; system monitoring; check pointing; combined rating table; computational grid; cross-layered component; hardware reliability rating; high performance distributed application; matrix rule; middleware monitoring component; proactive fault tolerant scheduling; software reliability rating; Chromium; Computer architecture; Cross layer design; Distributed computing; Fault tolerance; Grid computing; High performance computing; Monitoring; Nonhomogeneous media; Processor scheduling;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Emerging Technologies, 2007. ICET 2007. International Conference on
Conference_Location :
Islamabad
Print_ISBN :
978-1-4244-1493-2
Electronic_ISBN :
978-1-4244-1494-9
Type :
conf
DOI :
10.1109/ICET.2007.4516328
Filename :
4516328