• DocumentCode
    2573708
  • Title

    Dependable parallel computing with agents based on a task graph model

  • Author

    Chabridon, Sophie ; Gelenbe, Erol

  • Author_Institution
    UFR de Math. et Inf., Univ. Rene Descartes, Paris, France
  • fYear
    1995
  • fDate
    25-27 Jan 1995
  • Firstpage
    350
  • Lastpage
    357
  • Abstract
    We discuss a novel technique for improving the dependability of parallel programs executing on a MIMD shared memory architecture. The idea is to empower certain tasks of each application program to carry out failure detection, and to reschedule the execution of those tasks which are considered to have failed. The technique we propose is based on a task graph representation of the parallel program, in which communications between tasks have been voluntarily isolated to the end of each task which is being considered. We propose and evaluate several algorithms which can detect failures and restart failed tasks. A discrete-event simulator is used to evaluate the performance under the effect of failures, with the use of our detection and restart algorithms, of a specific parallel application: the fast Fourier transform.
  • Keywords
    discrete event simulation; parallel processing; parallel programming; software performance evaluation; MIMD shared memory architecture; agents; application program; dependable parallel computing; discrete-event simulator; failure detection; fast Fourier transform; parallel programs; performance evaluation; task graph model; Algorithm design and analysis; Computational modeling; Computer architecture; Concurrent computing; Discrete event simulation; Fast Fourier transforms; Hardware; Memory architecture; Parallel processing; Very large scale integration
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing, 1995. Proceedings. Euromicro Workshop on
  • Conference_Location
    San Remo
  • Print_ISBN
    0-8186-7031-2
  • Type

    conf

  • DOI
    10.1109/EMPDP.1995.389188
  • Filename
    389188