DocumentCode
3464109
Title
Fault Tolerant Parallel FFT Using Parallel Failure Recovery
Author
Fu, Hongyi ; Yang, Xuejun
Author_Institution
Nat. Lab. for Parallel & Distrib. Process., Nat. Univ. of Defense Tech., Changsha, China
fYear
2009
fDate
June 29 2009-July 2 2009
Firstpage
257
Lastpage
261
Abstract
This paper introduces a new fault tolerance method for parallel programs based on parallel failure recovery. When a process fails, the surviving processes compute the failed process's task in parallel, which reduces the fault tolerance overhead. The paper presents the design and implementation of a parallel FFT using this approach and investigates the optimum number of processes to participate in parallel failure recovery. Finally, an experiment shows that parallel failure recovery performs better than checkpointing and demonstrates the effectiveness of our solution for the best number of processes participating in parallel failure recovery.
Keywords
checkpointing; fast Fourier transforms; fault tolerant computing; parallel programming; fault tolerance; fault tolerant parallel FFT; parallel failure recovery; parallel program; Aerospace industry; Application software; Books; Computational geometry; Computer networks; Conferences; Grid computing; High performance computing; Physics computing
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Science and Its Applications, 2009. ICCSA '09. International Conference on
Conference_Location
Yongin
Print_ISBN
978-0-7695-3701-6
Type
conf
DOI
10.1109/ICCSA.2009.36
Filename
5260908