• DocumentCode
    2055632
  • Title
    Building reliable distributed programs with file operations
  • Author
    Ouyang, Jinsong ; Maheshwari, Piyush
  • Author_Institution
    Dept. of Comput. Sci., Toronto Univ., Ont., Canada
  • fYear
    1997
  • fDate
    18-21 Dec 1997
  • Firstpage
    380
  • Lastpage
    385
  • Abstract
    Describes a new protocol that helps the user build reliable distributed applications with file operations. Our file checkpointing and recovery protocol is designed to consistently checkpoint and recover user files with respect to the volatile state of the distributed program. Based on the protocol, a file I/O interface has been implemented as part of our Libra library for supporting fault tolerance in distributed applications. File operations are done using this interface, while the complexity of checkpointing and recovering user files is hidden from the application level: the checkpointing and recovery of user files are done automatically.
  • Keywords
    distributed algorithms; file organisation; memory protocols; parallel programming; software fault tolerance; software libraries; software reliability; system recovery; Libra library; distributed applications; fault tolerance; file I/O interface; file checkpointing; file operations; file recovery; protocol; reliable distributed program construction; user files; volatile program state; Application software; Buildings; Checkpointing; Computer science; Fault tolerance; Fault tolerant systems; Protocols; Runtime; Shadow mapping; Software libraries;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High-Performance Computing, 1997. Proceedings. Fourth International Conference on
  • Conference_Location
    Bangalore
  • Print_ISBN
    0-8186-8067-9
  • Type
    conf
  • DOI
    10.1109/HIPC.1997.634518
  • Filename
    634518