Title :
Quality-of-data for consistency levels in geo-replicated cloud data stores
Author :
Garcia-Recuero, Alvaro ; Esteves, Sergio ; Veiga, Luis
Author_Institution :
Inst. Super. Tecnico (IST), INESC-ID Lisboa, Lisbon, Portugal
Abstract :
Cloud computing has recently emerged as a key technology to provide individuals and companies with access to remote computing and storage infrastructures. To achieve highly available yet high-performing services, cloud data stores rely on data replication. However, replication brings with it the issue of consistency. Because data are replicated across multiple geographically distributed data centers, and to meet the increasing requirements of distributed applications, many cloud data stores adopt eventual consistency and can therefore run data-intensive operations with low latency. This comes at the cost of data staleness. In this paper, we prioritize data replication based on a set of flexible data semantics that can best suit all types of Big Data applications, avoiding overloading both the network and the systems during long periods of disconnection or network partitions. To this end, we integrated these data semantics into the core architecture of a well-known NoSQL data store (e.g., HBase), which then leverages a three-dimensional vector-field model (i.e., regarding timeliness, number of pending updates, and divergence bounds) to provision data to applications selectively and on demand. This enhances the eventual consistency model by providing a number of required levels of consistency to different applications, such as social networks or e-commerce sites, where the priority of updates also differs. In addition, our implementation of the model in HBase allows updates to be tagged and grouped atomically in logical batches, akin to transactions, ensuring atomic changes and correctness of updates as they are propagated.
Keywords :
SQL; cloud computing; computer centres; replicated databases; solid modelling; storage management; 3D vector-field model; HBase; NoSQL data store; big data applications; cloud computing; consistency levels; data replication; data semantics; e-commerce sites; geo-replicated cloud data stores; high-performing services; multiple geographically distributed data centers; overloading avoidance; quality-of-data; remote computing; social networks; storage infrastructures; Adaptation models; Bandwidth; Containers; Distributed databases; Semantics; Social network services; Vectors; Geo-replication; HBase; NoSQL; YCSB;
Conference_Title :
Cloud Computing Technology and Science (CloudCom), 2013 IEEE 5th International Conference on
Conference_Location :
Bristol
DOI :
10.1109/CloudCom.2013.29