Regional Information Center for Science and Technology (RICEST) - Component Based Proactive Fault Tolerant Scheduling in Computational Grid

DocumentCode :

3384532

Title :

Component Based Proactive Fault Tolerant Scheduling in Computational Grid

Author :

Haider, Sajjad ; Imran, Muhammad ; Niaz, Iftikhar Azim ; Ullah, Saeed ; Ansari, M.A.

Author_Institution :

Inf. Technol. Dept., Nat. Univ. of Modern Languages, Islamabad

fYear :

2007

fDate :

12-13 Nov. 2007

Firstpage :

119

Lastpage :

124

Abstract :

Computational Grids can provide the main execution platform for high performance distributed applications. Grid resources have heterogeneous architectures, are geographically distributed, and are interconnected via unreliable network media; as a result they are extremely complex and prone to different kinds of errors, failures and faults. The Grid is a layered architecture, and most of the fault tolerant techniques developed for grids follow its strict layering approach. In this paper, we propose a cross-layer design for handling faults proactively. In a cross-layer design, the top-down and bottom-up approach is not strictly followed, and a middle layer can communicate with the layer below or above it [1]. At each grid layer, a monitoring component decides, on the basis of predefined factors, whether the reliability of that particular layer is high, medium or low. Based on the Hardware Reliability Rating (HRR) and Software Reliability Rating (SRR), the Middleware Monitoring Component / Cross-Layered Component (MMC/CLC) generates a Combined Rating (CR) using CR calculation matrix rules. Each participating grid node has a CR value generated through cross-layered communication using HMC, MMC/CLC and SMC. All grid nodes hold their CR information in a CR table, and high rated machines are selected for job execution on the basis of minimum CPU load, along with different intensities of checkpointing. Handling faults proactively at each grid layer using the cross-communication model is expected to improve overall dependability and performance with less checkpointing overhead.
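
The selection scheme described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual CR calculation matrix rules, so this sketch assumes a simple "weakest layer" rule (the CR is the lower of HRR and SRR). The checkpointing intensities and the names `combined_rating`, `select_node`, `Node` and `CHECKPOINT_INTENSITY` are likewise hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# Reliability levels named in the abstract, ordered from worst to best.
LEVELS = ("low", "medium", "high")

# ASSUMPTION: checkpointing intensity per rating; the abstract only says
# that "different intensities" are used, not which ones.
CHECKPOINT_INTENSITY = {"high": "light", "medium": "moderate", "low": "heavy"}


def combined_rating(hrr: str, srr: str) -> str:
    """Combine hardware and software ratings into a CR.

    ASSUMPTION: the CR is the weaker of the two ratings. The paper's
    real CR calculation matrix is not reproduced in the abstract.
    """
    return min(hrr, srr, key=LEVELS.index)


@dataclass
class Node:
    name: str
    hrr: str         # Hardware Reliability Rating
    srr: str         # Software Reliability Rating
    cpu_load: float  # current CPU load, 0.0 to 1.0


def select_node(nodes: list[Node]) -> tuple[Node, str]:
    """Pick the least-loaded node among those with the best available CR."""
    # Build the CR table: one combined rating per node.
    cr_table = {n.name: combined_rating(n.hrr, n.srr) for n in nodes}
    best = max(cr_table.values(), key=LEVELS.index)
    candidates = [n for n in nodes if cr_table[n.name] == best]
    chosen = min(candidates, key=lambda n: n.cpu_load)
    return chosen, CHECKPOINT_INTENSITY[best]


nodes = [
    Node("a", "high", "high", 0.7),
    Node("b", "high", "high", 0.2),
    Node("c", "high", "low", 0.1),
]
chosen, intensity = select_node(nodes)
```

In this example node `c` has the lowest CPU load but is passed over because its assumed CR is low; node `b` is chosen as the least-loaded of the highly rated nodes.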

Keywords :

checkpointing; error handling; fault tolerant computing; grid computing; middleware; object-oriented programming; scheduling; system monitoring; check pointing; combined rating table; computational grid; cross-layered component; hardware reliability rating; high performance distributed application; matrix rule; middleware monitoring component; proactive fault tolerant scheduling; software reliability rating; Chromium; Computer architecture; Cross layer design; Distributed computing; Fault tolerance; Grid computing; High performance computing; Monitoring; Nonhomogeneous media; Processor scheduling;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Emerging Technologies, 2007. ICET 2007. International Conference on

Conference_Location :

Islamabad

Print_ISBN :

978-1-4244-1493-2

Electronic_ISBN :

978-1-4244-1494-9

Type :

conf

DOI :

10.1109/ICET.2007.4516328

Filename :

4516328

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3384532