• DocumentCode
    293646
  • Title
    Process allocation for load distribution in fault-tolerant multicomputers
  • Author
    Jong Kim ; Heejo Lee ; Sunggu Lee
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Pohang Inst. of Sci. & Technol., South Korea
  • fYear
    1995
  • fDate
    27-30 June 1995
  • Firstpage
    174
  • Lastpage
    183
  • Abstract
    In this paper, we consider a load-balancing process allocation method for fault-tolerant multicomputer systems that balances the load before as well as after faults start to degrade the performance of the system. In order to be able to tolerate a single fault, each process (primary process) is duplicated (i.e. has a backup process). The backup process executes on a different processor from the primary, checkpointing the primary process and recovering the process if the primary process fails due to the occurrence of a fault. In this paper, we first formalize the problem of load-balancing process allocation and show that it is an NP-hard problem. Next, we propose a new heuristic process allocation method and analyze the performance of the proposed allocation method. Simulations are used to compare the proposed method with a process allocation method that does not take into account the different load characteristics of the primary and backup processes. While both methods perform well before the occurrence of a fault in a primary process, only the proposed method maintains a balanced load after the occurrence of such a fault.
  • Keywords
    computational complexity; fault tolerant computing; multiprocessor interconnection networks; performance evaluation; resource allocation; NP-hard problem; fault-tolerant multicomputers; heuristic process allocation method; load characteristics; load distribution; load-balancing process allocation method; performance degradation; process allocation; Application software; Checkpointing; Computer applications; Computer science; Degradation; Fault tolerance; Fault tolerant systems; Performance analysis; Radio access networks
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fault-Tolerant Computing, 1995. FTCS-25. Digest of Papers., Twenty-Fifth International Symposium on
  • Conference_Location
    Pasadena, CA, USA
  • Print_ISBN
    0-8186-7079-7
  • Type
    conf
  • DOI
    10.1109/FTCS.1995.466985
  • Filename
    466985