• DocumentCode
    2719677
  • Title

    Disaster tolerant Wolfpack geo-clusters

  • Author

    Wilkins, Richard S. ; Du, Xing ; Cochran, Robert A. ; Popp, Matthias

  • Author_Institution
    Hewlett-Packard Co., Bellevue, WA, USA
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    222
  • Lastpage
    227
  • Abstract
    Clustering of computer systems to increase application availability has become a common industry practice. While it does increase the availability of applications and their data to users, it does not solve the problem of a disaster (flood, tornado, earthquake, terrorism, civil unrest, etc.) making the entire cluster, and the applications and data it is serving, unavailable. Distance mirroring of an application's data store allows for recovery from disaster but may still result in long periods of unacceptable downtime. This paper describes a method for stretching a standard Wolfpack (Microsoft Cluster Service, MSCS) cluster of Intel architecture servers geographically for disaster tolerance. Server nodes and their storage may be placed at two (or more) distant sites to prevent a single disaster from taking down the entire cluster. Standard cluster semantics and ease of use are maintained using the remote mirroring capabilities of Hewlett-Packard's high-end storage arrays. The design of additional software to control data mirroring behavior when moving or failing over applications between server nodes is described. Also described are software that allows "stretching" the cluster quorum disk between sites in a manner that is transparent to the cluster software, and software for an external arbitrator node that provides rapid recovery from total loss of inter-site communications. Flexibility provided by the array's firmware mirroring options (i.e., synchronous or asynchronous I/O mirroring) allows for optimum use of inter-site link bandwidth based on the data safety requirements of individual applications.
  • Keywords
    distributed processing; fault tolerant computing; system recovery; workstation clusters; Wolfpack; cluster quorum disk; cluster semantics; clustering; data mirroring; disaster recovery; disaster tolerance; remote mirroring; Application software; Bandwidth; Computer industry; Earthquakes; Floods; Microprogramming; Safety; Software design; Terrorism; Tornadoes;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing, 2002. Proceedings. 2002 IEEE International Conference on
  • Print_ISBN
    0-7695-2066-9
  • Type

    conf

  • DOI
    10.1109/CLUSTR.2002.1137750
  • Filename
    1137750