DocumentCode :
3772330
Title :
A Data Placement Strategy for Big Data Based on DCC in Cloud Computing Systems
Author :
Tao Wang;Shihong Yao;Zhengquan Xu;Shan Jia;Qiang Xu
Author_Institution :
Collaborative Innovation Center for Geospatial Technol., Wuhan, China
fYear :
2015
Firstpage :
623
Lastpage :
630
Abstract :
In complex, data-intensive applications, data must be scheduled between data centers whenever a single computation processes multiple datasets stored in different data centers. To store massive datasets effectively and to reduce data scheduling between data centers while computations execute, a mathematical model of data scheduling between data centers in cloud computing is built, and the dynamic computation correlation (DCC) between datasets is defined. A data placement strategy for big data based on DCC is then proposed. Datasets with high DCC are placed in the same data center, and new datasets are dynamically assigned to the most appropriate data center. Comprehensive experiments show that the proposed strategy effectively reduces the number of data scheduling operations between data centers and has low, almost constant computational complexity as the number of data centers increases and the datasets become massive. The proposed strategy is expected to be applicable to practical large-scale distributed storage systems for big data management.
Keywords :
"Processor scheduling","Distributed databases","Correlation","Cloud computing","Big data","Dynamic scheduling","Mathematical model"
Publisher :
IEEE
Conference_Titel :
Smart City/SocialCom/SustainCom (SmartCity), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/SmartCity.2015.139
Filename :
7463793