• DocumentCode
    752287
  • Title

    Automatic recovery from disk failure in continuous-media servers

  • Author

    Lee, Jack Y B ; Lui, John C S

  • Author_Institution
    Dept. of Inf. Eng., Chinese Univ. of Hong Kong, Shatin, China
  • Volume
    13
  • Issue
    5
  • fYear
    2002
  • fDate
    5/1/2002 12:00:00 AM
  • Firstpage
    499
  • Lastpage
    515
  • Abstract
    Continuous-media (CM) servers have been around for some years. Apart from server capacity, another important issue in the deployment of CM servers is reliability. This study investigates rebuild algorithms for automatically rebuilding data stored in a failed disk into a spare disk. Specifically, a block-based rebuild algorithm is studied with the rebuild time and buffer requirement modeled. A buffer-sharing scheme is then proposed to eliminate the additional buffers needed by the rebuild process. To further improve rebuild performance, a track-based rebuild algorithm that rebuilds lost data in tracks is proposed and analyzed. Results show that track-based rebuild, while it substantially outperforms block-based rebuild, requires significantly more buffers (17-135 percent more) even with buffer sharing. To tackle this problem, a novel pipelined rebuild algorithm is proposed to take advantage of the sequential property of track retrievals to pipeline the reading and writing processes. This pipelined rebuild algorithm achieves the same rebuild performance as track-based rebuild, but reduces the extra buffer requirement to insignificant levels (0.7-1.9 percent). Numerical results computed using models of five commercial disk drives demonstrate that automatic rebuild of a failed disk can be done in a reasonable amount of time, even at relatively high server utilization (e.g., less than 1.5 hours at 90 percent utilization).
  • Keywords
    buffer storage; fault tolerant computing; multimedia computing; multimedia servers; pipeline processing; automatic disk failure recovery; block-based rebuild algorithm; buffer requirement; buffer-sharing scheme; continuous media servers; disk drives; pipelined rebuild algorithm; reading processes; rebuild time; reliability; spare disk; track retrievals; track-based rebuild algorithm; writing processes; Algorithm design and analysis; Degradation; Disk drives; Fault tolerance; Information retrieval; Lifting equipment; Performance analysis; Pipelines; Streaming media; Writing;
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/TPDS.2002.1003860
  • Filename
    1003860