• DocumentCode
    3722573
  • Title

    PSG-Codes: An Erasure Codes Family with High Fault Tolerance and Fast Recovery

  • Author

    Shiyi Li;Cao Qiang;Lei Tian;Shenggang Wan;Lu Qian;Changsheng Xie

  • Author_Institution
    Wuhan Nat. Lab. for Optoelectron., Huazhong Univ. of Sci. &
  • fYear
    2015
  • Firstpage
    47
  • Lastpage
    57
  • Abstract
    As hard disk failure rates have barely improved and the reconstruction time for TB-level disks typically amounts to days, multiple concurrent disk/storage node failures in datacenter storage systems have become common and frequent. As a result, the erasure coding schemes used in datacenters must meet the critical requirements of high fault tolerance, high storage efficiency, and fast fault recovery. In this paper, we introduce PSG-Codes, a new XOR-based non-MDS erasure code family capable of tolerating up to 12 disk/node failures. The basic idea behind PSG-Codes is to partition disks into groups and exploit short parity chains to generate parity units. The parity chains are then further shortened by varying the number of parity elements for each strip. We conduct a simulation-based study to search the configuration parameter space of PSG-Codes, and prove that PSG-Codes can tolerate up to 12 disk/node failures. Compared with WEAVER codes, a well-known XOR-based non-MDS code, PSG-Codes have higher storage efficiency and lower reconstruction cost. Moreover, the storage efficiency and performance of PSG-Codes are also competitive with LRC codes, another state-of-the-art GF-based non-MDS code.
  • Keywords
    "Fault tolerance","Fault tolerant systems","Strips","Encoding","Complexity theory","Acceleration"
  • Publisher
    IEEE
  • Conference_Titel
    Reliable Distributed Systems (SRDS), 2015 IEEE 34th Symposium on
  • Electronic_ISBN
    1060-9857
  • Type

    conf

  • DOI
    10.1109/SRDS.2015.39
  • Filename
    7371567