• DocumentCode
    446477
  • Title
    Feasibility study and early experimental results towards cluster survivability
  • Author
    Leangsuksun, Chokchai ; Tikotekar, Anand ; Pourzandi, Makan ; Haddad, Ibrahim
  • Author_Institution
    Louisiana Tech Univ., Ruston, LA, USA
  • Volume
    1
  • fYear
    2005
  • fDate
    9-12 May 2005
  • Firstpage
    77
  • Abstract
    This paper presents an investigation, feasibility study, and performance benchmark of vital management elements for critical enterprise and HPC infrastructure. We propose integrating a high-availability cluster mechanism with a secure cluster infrastructure. Our proposed architecture incorporates the Distributed Security Infrastructure (DSI) framework, an open source project providing secure infrastructure for carrier-grade clusters, and HA-OSCAR, an open source cluster framework that meets reliability, availability, and serviceability (RAS) needs. The result is a cluster infrastructure that complies with the reliability, availability, serviceability, and security (RASS) principles. We conducted an initial feasibility study and experiment to gauge the issues and the degree of success in implementing our proposed RASS framework. We verified the integration of HA-OSCAR release 1.0 and DSI release 0.3. Although there was a minimal performance overhead, the benefit of having RASS in mission-critical settings far outweighs the performance impact. We plan to extend our proof-of-concept architecture to meet the needs of production environments.
  • Keywords
    public domain software; security of data; workstation clusters; HA-OSCAR; HPC infrastructure; availability needs; carrier grade clusters; cluster survivability; critical enterprise; distributed security infrastructure; high availability cluster mechanism; open source cluster framework; open source project; performance benchmarking; proof-of-concept architecture; reliability needs; secure cluster infrastructure; serviceability needs; vital management elements; Access control; Authentication; Availability; Engineering management; Laboratories; Linux; Open systems; Packaging; Research and development management; Security
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing and the Grid, 2005. CCGrid 2005. IEEE International Symposium on
  • Print_ISBN
    0-7803-9074-1
  • Type
    conf
  • DOI
    10.1109/CCGRID.2005.1558537
  • Filename
    1558537