DocumentCode :
1361138
Title :
An Online Data Access Prediction and Optimization Approach for Distributed Systems
Author :
Ishii, Renato Porfirio ; De Mello, Rodrigo Fernandes
Author_Institution :
Fac. of Comput. Sci., Fed. Univ. of Mato Grosso do Sul-UFMS, Campo Grande, Brazil
Volume :
23
Issue :
6
fYear :
2012
fDate :
6/1/2012 12:00:00 AM
Firstpage :
1017
Lastpage :
1029
Abstract :
Current scientific applications produce large amounts of data. Processing, handling, and analyzing such data requires large-scale computing infrastructures such as clusters and grids. Studies in this area aim to improve the performance of data-intensive applications by optimizing data accesses. To achieve this goal, distributed storage systems have considered techniques such as data replication, migration, distribution, and access parallelism. However, the main drawback of those studies is that they do not take application behavior into account when optimizing data access. This limitation motivated this paper, which applies strategies to support the online prediction of application behavior in order to optimize data access operations on distributed systems, without requiring any information on past executions. To accomplish this, the approach organizes application behaviors as time series and then analyzes and classifies those series according to their properties. Based on those properties, the approach selects modeling techniques to represent the series and perform predictions, which are later used to optimize data access operations. The approach was implemented and evaluated using the OptorSim simulator, sponsored by the LHC-CERN project and widely employed by the scientific community. Experiments confirm that the approach reduces application execution time by about 50 percent, especially when handling large amounts of data.
Keywords :
data handling; distributed processing; optimisation; LHC-CERN project; OptorSim simulator; current scientific applications; data intensive applications; distributed storage systems; distributed systems; large scale computing infrastructures; online data access prediction; optimization approach; scientific community; Analytical models; Autoregressive processes; Distributed databases; Optimization; Predictive models; Stochastic processes; Time series analysis; Distributed computing; data access optimization; distributed file system; prediction; time series analysis
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/TPDS.2011.256
Filename :
6060803