• DocumentCode
    3316248
  • Title

    A failure recovery mechanism for distributed metadata servers in DCFS2

  • Author

    Fan, Zhihua ; Xiong, Jin ; Ma, Jie

  • Author_Institution
    Nat. Res. Center for Intelligent Comput. Syst., Chinese Acad. of Sci., China
  • fYear
    2004
  • fDate
    20-22 July 2004
  • Firstpage
    2
  • Lastpage
    8
  • Abstract
    Distributed metadata servers are required for a cluster file system's scalability. However, how to distribute the file system metadata among multiple metadata servers and how to keep the file system reliable in case of server failures are two difficult problems. We present a journal-based failure-recovery mechanism for distributed metadata servers in the Dawning Cluster File System, DCFS2. The DCFS2 metadata protocol exploits a modified two-phase commit protocol, which ensures consistent metadata updates on multiple metadata servers even if one server fails. We focus on the logging policy and concurrent control policy for metadata updates, and on the failure recovery policy. The DCFS2 metadata protocol is compared with the two-phase commit protocol, and its advantages are shown. Some results of performance experiments on our system are also presented.
  • Keywords
    distributed databases; file servers; meta data; network operating systems; protocols; system recovery; DCFS2 metadata protocol; concurrent control policy; dawning cluster file system; distributed metadata servers; failure recovery; logging policy; phase commit protocol; server failures; Collaboration; Computers; Distributed computing; File servers; File systems; Intelligent systems; Power capacitors; Protocols; Robustness; Scalability;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing and Grid in Asia Pacific Region, 2004. Proceedings. Seventh International Conference on
  • Print_ISBN
    0-7695-2138-X
  • Type

    conf

  • DOI
    10.1109/HPCASIA.2004.1324010
  • Filename
    1324010