Title :
Stochastic Analysis on RAID Reliability for Solid-State Drives
Author :
Li, Yongkun ; Lee, Patrick P. C. ; Lui, John C. S.
Author_Institution :
Sch. of Comput. Sci. & Technol., Univ. of Sci. & Technol. of China, Hefei, China
Date :
Sept. 30 2013-Oct. 3 2013
Abstract :
Solid-state drives (SSDs) have been widely deployed in desktops and data centers. However, SSDs suffer from bit errors, and the bit error rate is time dependent since it increases as an SSD wears down. Traditional storage systems mainly use parity-based RAID to provide reliability guarantees by striping redundancy across multiple devices, but the effectiveness of RAID for SSDs remains debatable, as parity updates aggravate the wear and bit error rates of SSDs. In particular, an open problem is how different parity distributions over multiple devices, such as the even distribution suggested by conventional wisdom or the uneven distributions proposed in recent RAID schemes for SSDs, may influence the reliability of an SSD RAID array. To address this fundamental problem, we propose the first analytical model to quantify the reliability dynamics of an SSD RAID array. Specifically, we develop a "non-homogeneous" continuous-time Markov chain model and derive the transient reliability solution. We validate our model via trace-driven simulations and conduct numerical analysis to provide insights into the reliability dynamics of SSD RAID arrays under different parity distributions and subject to different bit error rates and array configurations. Designers can use our model to decide the appropriate parity distribution based on their reliability requirements.
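As a rough illustration of what a transient solution of a non-homogeneous continuous-time Markov chain involves, the sketch below numerically integrates the Kolmogorov forward equations for a toy chain whose failure rate grows with time. It is not the model from the paper: the three-state chain (all devices healthy, one device failed, data loss), the power-law failure rate, the constant repair rate, the RK4 integrator, and every function name and parameter value are assumptions chosen only for illustration.

```python
def failure_rate(t, scale=1e-6, shape=2.0):
    """Hypothetical per-device failure rate that increases with time (wear)."""
    return scale * shape * t ** (shape - 1.0)


def derivative(t, p, n_devices, repair_rate):
    """Right-hand side of the forward equations dp/dt = p Q(t).

    States: 0 = all devices healthy, 1 = one device failed (rebuilding),
    2 = data loss (absorbing). Rates depend on t, so the chain is
    non-homogeneous.
    """
    lam = failure_rate(t)
    p0, p1, _ = p
    d0 = -n_devices * lam * p0 + repair_rate * p1
    d1 = n_devices * lam * p0 - ((n_devices - 1) * lam + repair_rate) * p1
    d2 = (n_devices - 1) * lam * p1
    return (d0, d1, d2)


def transient_distribution(t_end, n_devices=5, repair_rate=1.0, steps=10000):
    """Integrate the state probabilities from time 0 to t_end with classic RK4."""
    h = t_end / steps
    p = (1.0, 0.0, 0.0)  # assumed initial condition: array starts fully healthy
    t = 0.0
    for _ in range(steps):
        k1 = derivative(t, p, n_devices, repair_rate)
        p_mid1 = tuple(pi + 0.5 * h * ki for pi, ki in zip(p, k1))
        k2 = derivative(t + 0.5 * h, p_mid1, n_devices, repair_rate)
        p_mid2 = tuple(pi + 0.5 * h * ki for pi, ki in zip(p, k2))
        k3 = derivative(t + 0.5 * h, p_mid2, n_devices, repair_rate)
        p_end = tuple(pi + h * ki for pi, ki in zip(p, k3))
        k4 = derivative(t + h, p_end, n_devices, repair_rate)
        p = tuple(
            pi + (h / 6.0) * (a + 2.0 * b + 2.0 * c + d)
            for pi, a, b, c, d in zip(p, k1, k2, k3, k4)
        )
        t += h
    return p


def reliability(t_end, **kwargs):
    """Probability that no data loss has occurred by time t_end."""
    p = transient_distribution(t_end, **kwargs)
    return p[0] + p[1]


if __name__ == "__main__":
    for t_end in (250.0, 500.0, 1000.0):
        print(t_end, reliability(t_end))
```

Because the rates change with time, there is no single stationary generator matrix to exponentiate, which is why the sketch steps through time instead. The time unit and the rate values are arbitrary here; a model of a real array would replace `failure_rate` with a rate tied to device wear and parity placement.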
Keywords :
Markov processes; RAID; disc drives; redundancy; SSD RAID array reliability dynamics; array configurations; bit error rate; data centers; desktops; nonhomogeneous continuous time Markov chain model; numerical analysis; parity distribution; reliability requirements; solid-state drives; stochastic analysis; trace-driven simulation; transient reliability solution; Aging; Arrays; Error analysis; Markov processes; Numerical models; Reliability; Transient analysis; CTMC; RAID; Reliability; Solid-state Drives; Transient Analysis;
Conference_Title :
Reliable Distributed Systems (SRDS), 2013 IEEE 32nd International Symposium on
Conference_Location :
Braga
DOI :
10.1109/SRDS.2013.16