DocumentCode :
3577198
Title :
Enhancing Availability and Reliability of Cloud Data through Syncopy
Author :
Tsozen Yeh ; Huichen Lee
Author_Institution :
Dept. of CSIE, Fu Jen Catholic Univ., Taipei, Taiwan
fYear :
2014
Firstpage :
125
Lastpage :
131
Abstract :
The Internet of Things (IoT) has recently shown a promising future. Cloud computing can provide the infrastructure for storing and handling the potentially enormous volume of data that IoT generates. Consequently, the availability and reliability of cloud data will largely affect the success of IoT. Hadoop is a very popular platform in the cloud computing community, and the Hadoop Distributed File System (HDFS) is its default file system. HDFS keeps multiple copies of data files within a Hadoop cluster to avoid losing data. However, this approach cannot guarantee the availability and reliability of data when a fatal disaster, such as a fire or an earthquake, destroys the entire Hadoop cluster. As a result, maintaining data backups across different Hadoop clusters is a must to achieve high availability and reliability of cloud data. Currently, distcp is the only tool HDFS provides to duplicate data files among Hadoop clusters installed at different locations. Unfortunately, users need to execute distcp manually, so it cannot guarantee timely synchronization of duplicated data files among Hadoop clusters. In addition, distcp always transfers the entire contents of data files between Hadoop clusters, regardless of how little new data has been added. This can waste considerable time and network bandwidth in practice. We designed and implemented an efficient scheme, namely syncopy (synchronous copy), in HDFS to automatically conduct real-time synchronization of data files duplicated among different Hadoop clusters. Compared with distcp, our experimental results show that syncopy can reduce the required time by up to 99.20%.
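The contrast the abstract draws between copying whole files and transferring only new data can be illustrated with a minimal sketch. This is not the syncopy implementation, which the abstract does not describe; it assumes, purely for illustration, that a file only grows by appending, and the function names `full_copy` and `append_sync` are hypothetical.

```python
def full_copy(source: bytes, replica: bytearray) -> int:
    """Overwrite the replica with the whole source; return bytes transferred."""
    replica[:] = source
    return len(source)


def append_sync(source: bytes, replica: bytearray) -> int:
    """Send only the bytes past the replica's current length.

    Assumption (not from the paper): the file only grows by appending,
    so an up-to-date replica is a prefix of the source. If the replica
    is not a prefix, fall back to a full copy.
    """
    if bytes(replica) != source[:len(replica)]:
        return full_copy(source, replica)
    delta = source[len(replica):]
    replica.extend(delta)
    return len(delta)


if __name__ == "__main__":
    old = b"a" * 1000
    new = old + b"new"  # three bytes appended at the source cluster

    replica_full = bytearray(old)
    replica_incr = bytearray(old)

    print("full copy bytes:", full_copy(new, replica_full))
    print("append sync bytes:", append_sync(new, replica_incr))
```

Under this assumption the incremental path moves only the appended bytes, while the full copy moves the entire file each time; the actual savings reported in the abstract come from the authors' experiments, not from this sketch.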
Keywords :
Internet of Things; cloud computing; data handling; parallel processing; HDFS; Hadoop distributed file system; IoT; cloud data availability; cloud data reliability; synchronous copy; syncopy; Availability; Computers; Real-time systems; Synchronization; Testing; Hadoop
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Internet of Things (iThings), 2014 IEEE International Conference on, and Green Computing and Communications (GreenCom), IEEE and Cyber, Physical and Social Computing (CPSCom), IEEE
Print_ISBN :
978-1-4799-5967-9
Type :
conf
DOI :
10.1109/iThings.2014.27
Filename :
7059652