DocumentCode :
3599692
Title :
Bandwidth Modeling in Large Distributed Systems for Big Data Applications
Author :
Javadi, Bahman ; Zhang, Boyu ; Taufer, Michela
Author_Institution :
Sch. of Comput., Eng. & Math., Univ. of Western Sydney, Sydney, NSW, Australia
fYear :
2014
Firstpage :
21
Lastpage :
27
Abstract :
The emergence of Big Data applications poses new challenges in data management, such as the processing and movement of massive amounts of data. Volunteer computing has proven itself as a distributed paradigm that can fully support Big Data generation. This paradigm uses a large number of heterogeneous and unreliable Internet-connected hosts to provide petascale computing power for scientific projects. As data sizes and the number of devices that can potentially join a volunteer computing project increase, host bandwidth can become a main hindrance to the analysis of the data generated by these projects, especially if the analysis is performed concurrently using either in-situ or in-transit processing. In this paper, we propose a bandwidth model for volunteer computing projects based on real trace data taken from the Docking@Home project, covering more than 280,000 hosts over a 5-year period. We validate the proposed statistical model using model-based and simulation-based techniques. Our modeling provides valuable insights into the concurrent integration of data generation with in-situ and in-transit analysis in the volunteer computing paradigm.
Keywords :
Big Data; Internet; concurrent engineering; distributed databases; statistical analysis; volunteer computing; Big Data applications; Big Data generation; Internet connected hosts; Peta scale computing power; bandwidth model; concurrent generated data integration; data analysis; data management; in-situ analysis; in-transit analysis; large distributed system; model-based techniques; scientific projects; simulation-based techniques; statistical model; volunteer computing; Bandwidth; Computational modeling; Computer applications; Data models; Distributed processing; Predictive models; Servers; Big Data; Internet Bandwidth; Statistical Modeling; Volunteer Computing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Computing, Applications and Technologies (PDCAT), 2014 15th International Conference on
Type :
conf
DOI :
10.1109/PDCAT.2014.12
Filename :
7174761