• DocumentCode
    4889
  • Title
    Redistribute Data to Regain Load Balance during RAID-4 Scaling
  • Author
    Guangyan Zhang; Jigang Wang; Keqin Li; Jiwu Shu; Weimin Zheng
  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
  • Volume
    26
  • Issue
    1
  • fYear
    2015
  • fDate
    Jan. 2015
  • Firstpage
    219
  • Lastpage
    229
  • Abstract
    Disk additions to a RAID-4 storage system can increase the I/O parallelism and expand the storage capacity simultaneously. To regain load balance among all disks including old and new, RAID-4 scaling requires moving certain data blocks onto newly added disks. Existing data redistribution approaches to RAID-4 scaling, restricted by preserving a round-robin data distribution, require migrating all the data, which results in an expensive cost for RAID-4 scaling. In this paper, we propose McPod, a new data redistribution approach to accelerating RAID-4 scaling. McPod minimizes the number of data blocks to be moved while maintaining a uniform data distribution across all data disks. McPod also optimizes data migration with four techniques. First, it coalesces multiple accesses to physically successive blocks into a single I/O. Second, it piggybacks parity updates during data migration to reduce the cost of maintaining consistent parities. Third, it outsources all parity updates brought by RAID scaling to a surrogate disk. Fourth, it delays recording data migration on disks to minimize the number of metadata writes without compromising data reliability. We implement McPod in Linux Kernel 2.6.32.9, and evaluate its performance by replaying three real-system traces. The results demonstrate that McPod outperforms the existing “moving-everything” approach by 67.78-79.64 percent in redistribution time and by 14.24-27.16 percent in user response time. The experiments also illustrate that the performance of the RAID scaled using McPod is almost identical to that of the round-robin RAID.
  • Keywords
    Linux; RAID; operating system kernels; resource allocation; Linux Kernel; McPod approach; RAID-4 scaling; RAID-4 storage system; data blocks; data migration; data redistribution approach; data reliability; disk addition; input-output parallelism; load balancing; meta data; moving-everything approach; parity updates; round-robin RAID; round-robin data distribution; user response time; Acceleration; Distributed databases; Layout; Nickel; Outsourcing; Parallel processing; Reliability; Access coalescing; I/O parallelism; RAID-4 scaling; data migration; metadata update; parity update
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type
    jour
  • DOI
    10.1109/TPDS.2014.2308219
  • Filename
    6748089
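
The first optimization named in the abstract, coalescing accesses to physically successive blocks into a single I/O, can be illustrated with a minimal sketch. This is not McPod's kernel implementation: the function name `coalesce_blocks` and the (start, length) representation of an I/O request are assumptions made here purely for illustration.

```python
from typing import Iterable, List, Tuple


def coalesce_blocks(blocks: Iterable[int]) -> List[Tuple[int, int]]:
    """Merge block numbers into (start, length) runs.

    Hypothetical helper, not taken from the paper: each returned run
    stands for one I/O request covering physically successive blocks.
    """
    runs: List[Tuple[int, int]] = []
    # Sort and deduplicate so that successive blocks become adjacent.
    for block in sorted(set(blocks)):
        if runs and runs[-1][0] + runs[-1][1] == block:
            # The block directly follows the last run, so extend that run.
            start, length = runs[-1]
            runs[-1] = (start, length + 1)
        else:
            # A gap was found, so a new run (a separate I/O) begins here.
            runs.append((block, 1))
    return runs


if __name__ == "__main__":
    # Six block accesses collapse into three I/O requests.
    print(coalesce_blocks([7, 3, 4, 5, 9, 10]))  # [(3, 3), (7, 1), (9, 2)]
```

Under these assumptions, six single-block accesses are issued as three requests; the benefit grows with the length of each contiguous run.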