Title :
Fault tolerance in a mobile agent based computational grid
Author :
Lopes, Rafael Fernandes ; da Silva e Silva, Francisco José
Author_Institution :
Dept. de Informatica, Univ. Fed. do Maranhao, Sao Luis
Abstract :
In recent years, grid computing has emerged as a promising way to increase processing and storage capacity through the integration and sharing of multi-institutional resources. Fault tolerance is an essential characteristic of grid environments: because the grid acts as a massively parallel system, lost computation time must be avoided. The likelihood of errors may be increased by the fact that many grid applications perform long tasks that may require several days of computation. In this paper, we describe the fault tolerance mechanism of the MAG grid middleware, its components, and how they interact with each other. The components were developed as mobile agents, forming a multi-agent society that provides fault tolerance for node and application crashes.
Keywords :
fault tolerant computing; grid computing; mobile agents; fault tolerance; grid applications; grid middleware; mobile agent-based computational grid; multiinstitutional resource sharing; parallel systems; Application software; Computer crashes; Computer networks; Computer peripherals; Concurrent computing; Middleware; Mobile communication
Conference_Title :
Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID 06), 2006
Conference_Location :
Singapore
Print_ISBN :
0-7695-2585-7
DOI :
10.1109/CCGRID.2006.1630899