• DocumentCode
    3464109
  • Title
    Fault Tolerant Parallel FFT Using Parallel Failure Recovery
  • Author
    Fu, Hongyi ; Yang, Xuejun
  • Author_Institution
    Nat. Lab. for Parallel & Distrib. Process., Nat. Univ. of Defense Tech., Changsha, China
  • fYear
    2009
  • fDate
    June 29 2009-July 2 2009
  • Firstpage
    257
  • Lastpage
    261
  • Abstract
    This paper introduces a new method, based on parallel failure recovery, for fault tolerance in parallel programs. When a process fails, the surviving processes compute the failed process's task in parallel, which reduces the overhead of fault tolerance. The paper presents the design and implementation of a parallel FFT using the new approach and seeks the optimum number of processes to participate in parallel failure recovery. Finally, an experiment shows that parallel failure recovery performs better than checkpointing and demonstrates the effectiveness of our solution for the best number of processes participating in parallel failure recovery.
  • Keywords
    checkpointing; fast Fourier transforms; fault tolerant computing; parallel programming; fault tolerance; fault tolerant parallel FFT; parallel failure recovery; parallel program; Aerospace industry; Application software; Books; Computational geometry; Computer networks; Conferences; Grid computing; High performance computing; Physics computing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Science and Its Applications, 2009. ICCSA '09. International Conference on
  • Conference_Location
    Yongin
  • Print_ISBN
    978-0-7695-3701-6
  • Type
    conf
  • DOI
    10.1109/ICCSA.2009.36
  • Filename
    5260908