  • DocumentCode
    2600323
  • Title

    Intelligent Agents for Fault Tolerance: From Multi-agent Simulation to Cluster-Based Implementation

  • Author

    Varghese, Blesson ; McKee, Gerard ; Alexandrov, Vassil

  • Author_Institution
    Sch. of Syst. Eng., Univ. of Reading, Reading, UK
  • fYear
    2010
  • fDate
    20-23 April 2010
  • Firstpage
    985
  • Lastpage
    990
  • Abstract
    Recent research in multi-agent systems incorporates fault tolerance concepts but does not explore the extension and implementation of such ideas for large-scale parallel computing systems. The work reported in this paper investigates a swarm-array computing approach, namely 'Intelligent Agents'. A task to be executed on a parallel computing system is decomposed into sub-tasks and mapped onto agents that traverse an abstracted hardware layer. The agents intercommunicate across processors to share information in the event of a predicted core/processor failure and to complete the task successfully. The feasibility of the approach is validated by simulations on an FPGA using a multi-agent simulator and by an implementation of a parallel reduction algorithm on a computer cluster using the Message Passing Interface.
  • Keywords
    message passing; multi-agent systems; parallel algorithms; pattern clustering; software fault tolerance; FPGA; abstracted hardware layer; cluster-based implementation; fault tolerance; intelligent agents; message passing interface; multiagent simulator; multiagent systems; parallel computing system; parallel reduction algorithm; swarm array computing approach; Computational modeling; Computer simulation; Fault tolerance; Fault tolerant systems; Field programmable gate arrays; Hardware; Intelligent agent; Large-scale systems; Multiagent systems; Parallel processing; cluster-based implementation; fault tolerance; intelligent agents; swarm-array computing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Information Networking and Applications Workshops (WAINA), 2010 IEEE 24th International Conference on
  • Conference_Location
    Perth, WA
  • Print_ISBN
    978-1-4244-6701-3
  • Type

    conf

  • DOI
    10.1109/WAINA.2010.21
  • Filename
    5480939