Title :
MyStore: A High Available Distributed Storage System for Unstructured Data
Author :
Jiang, Wenbin ; Zhang, Lei ; Qiang, Weizhong ; Jin, Hai ; Peng, Yaqiong
Author_Institution :
Services Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China
Abstract :
Although NoSQL systems such as Dynamo, Cassandra, and MongoDB each offer distinct advantages for unstructured data management, none of them provides flexible query functions like those of MongoDB while also guaranteeing availability and scalability comparable to Cassandra. This paper introduces a new methodology and implementation for improving the availability of unstructured data by presenting a new distributed storage system called MyStore, which combines MongoDB with advantages drawn from other NoSQL systems. Consistent hashing is used to distribute data across multiple MongoDB nodes, and the NWR mode is applied to provide automatic backup and guarantee data consistency. A gossip protocol is used to exchange failure information within the system. Based on these strategies, a highly available and scalable system for storing unstructured data is realized, which can also provide complex query functions like those of a relational database. Moreover, the system is applied in a multi-discipline virtual experiment platform named VeePalms, which requires high availability and high access efficiency for its unstructured data, such as XML scenes and video guidelines. Experimental evaluation shows that the methodology is powerful enough not only to enhance data availability but also to improve the server's scalability.
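The two placement ideas named in the abstract, consistent hashing and the NWR mode, can be illustrated with a minimal sketch. This is not MyStore's implementation: the class `HashRing`, the helper `quorums_overlap`, the node names, the use of MD5, and the virtual-node count are all assumptions made for illustration. The only fact relied on is the usual Dynamo-style quorum condition, under which a read of R replicas is guaranteed to overlap a write acknowledged by W replicas when R + W > N.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring over a set of storage nodes."""

    def __init__(self, nodes, vnodes=64):
        # Place every physical node on the ring at `vnodes` positions
        # so that keys spread more evenly across nodes.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]
        self._node_count = len(set(nodes))

    def replicas(self, key: str, n: int):
        """Return the first n distinct nodes clockwise from the key's position."""
        n = min(n, self._node_count)
        i = bisect.bisect_right(self._points, _hash(key))
        found = []
        while len(found) < n:
            node = self._ring[i % len(self._ring)][1]
            if node not in found:
                found.append(node)
            i += 1
        return found


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True when every read quorum intersects every write quorum (R + W > N)."""
    return r + w > n


if __name__ == "__main__":
    ring = HashRing(["mongo-a", "mongo-b", "mongo-c", "mongo-d"])  # assumed node names
    print(ring.replicas("scene-42.xml", 3))  # N = 3 replica nodes for this key
    print(quorums_overlap(n=3, w=2, r=2))
```

With N = 3, W = 2, R = 2 the quorums overlap, so a read contacts at least one replica that acknowledged the latest write; with W = 1 and R = 1 they need not. Removing a node from the ring only reassigns the keys that node held, which is the property that makes consistent hashing attractive for scaling a cluster.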
Keywords :
SQL; data integrity; distributed processing; failure analysis; information storage; protocols; query processing; Cassandra; MyStore; NWR mode; NoSQL systems; VeePalms; automatic backup operation; complex query functions; data consistency; data distribution; flexible query functions; gossip protocol; high access efficiency; high available distributed storage system; information exchange; multidiscipline virtual experiment platform; multiple MongoDB nodes; server scalability; unstructured data storage; Availability; Distributed databases; Generators; Memory; Protocols; Scalability; Writing; distributed data storage; high availability; unstructured data;
Conference_Title :
2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS)
Conference_Location :
Liverpool
Print_ISBN :
978-1-4673-2164-8
DOI :
10.1109/HPCC.2012.39