DocumentCode :
3464109
Title :
Fault Tolerant Parallel FFT Using Parallel Failure Recovery
Author :
Fu, Hongyi ; Yang, Xuejun
Author_Institution :
National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China
fYear :
2009
fDate :
June 29 2009-July 2 2009
Firstpage :
257
Lastpage :
261
Abstract :
This paper introduces a new method, based on parallel failure recovery, for fault tolerance in parallel programs. When a process fails, the surviving processes compute the failed process's task in parallel, which reduces the overhead of fault tolerance. The paper presents the design and implementation of a parallel FFT using the new approach and investigates how to find the optimum number of processes to participate in parallel failure recovery. Finally, an experiment shows that parallel failure recovery performs better than checkpointing and demonstrates the effectiveness of the proposed solution for the best number of processes participating in parallel failure recovery.
Keywords :
checkpointing; fast Fourier transforms; fault tolerant computing; parallel programming; fault tolerance; fault tolerant parallel FFT; parallel failure recovery; parallel program; Aerospace industry; Application software; Books; Computational geometry; Computer networks; Conferences; Grid computing; High performance computing; Physics computing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Science and Its Applications, 2009. ICCSA '09. International Conference on
Conference_Location :
Yongin
Print_ISBN :
978-0-7695-3701-6
Type :
conf
DOI :
10.1109/ICCSA.2009.36
Filename :
5260908