DocumentCode
3384633
Title
Towards Optimal Fault Tolerant Scheduling in Computational Grid
Author
Imran, Muhammad ; Niaz, Iftikhar Azim ; Haider, Sajjad ; Hussain, Naveed ; Ansari, M.A.
Author_Institution
Fac. of Comput., Riphah Int. Univ., Islamabad
fYear
2007
fDate
12-13 Nov. 2007
Firstpage
154
Lastpage
159
Abstract
The grid environment poses significant challenges because of the diverse failures encountered during job execution. Computational grids provide the main execution platform for long-running jobs, which require a long commitment of grid resources. Fault tolerance in such an environment therefore cannot be ignored. Most grid middleware has either ignored failure issues or developed ad hoc solutions. Most existing fault tolerance techniques are application dependent and cause a cognitive problem. This paper examines existing fault detection and tolerance techniques in various middleware. We propose a fault-tolerant layered grid architecture with a cross-layered design. In our approach, the Hybrid Particle Swarm Optimization (HPSO) algorithm and the Anycast technique are used in conjunction with the Globus middleware. We adopt a proactive and reactive fault management strategy for centralized and distributed environments. The proposed strategy helps identify the root cause of failures and resolve the cognitive problem. Our strategy minimizes computation and communication, thus achieving higher reliability. Anycast confines the effect of Denial of Service/Distributed Denial of Service ((D)DoS) attacks to the area nearest the source of the attack, thus achieving better security. Significant performance improvement is achieved by applying Anycast before HPSO. Selecting more reliable nodes reduces checkpointing overhead.
Keywords
fault tolerant computing; grid computing; middleware; particle swarm optimisation; computational grid; denial of service; fault detection techniques; fault tolerant scheduling; grid middleware; hybrid particle swarm optimization; Checkpointing; Computer crime; Environmental management; Fault detection; Fault tolerance; Grid computing; Middleware; Particle swarm optimization; Processor scheduling; Security;
fLanguage
English
Publisher
IEEE
Conference_Titel
Emerging Technologies, 2007. ICET 2007. International Conference on
Conference_Location
Islamabad
Print_ISBN
978-1-4244-1493-2
Electronic_ISBN
978-1-4244-1494-9
Type
conf
DOI
10.1109/ICET.2007.4516335
Filename
4516335