DocumentCode :
2262765
Title :
Improving grid monitoring with data quality assessment
Author :
Liu, Wei ; Luo, Tiejian ; Song, Jinliang ; Chen, Su
Author_Institution :
Graduate Univ. of Chinese Acad. of Sci., Beijing
fYear :
2007
fDate :
17-19 Oct. 2007
Firstpage :
1534
Lastpage :
1539
Abstract :
As the Grid emerges as a cyber-infrastructure for the next generation of e-Science applications, monitoring the Grid becomes a significant task. A typical Grid application is composed of a large number of resources that can fail, including network, hardware, and software components. Even when monitoring information from all these components is accessible, it is hard to determine whether anomalies and failures during execution are related to a particular job. However, receiving intermediate results and interacting with applications play a key role for users in practice. Given the complexity of implementation and the large scope that the monitoring system covers, incomplete and duplicate data are inevitable in many applications. Overcoming data heterogeneity is a long-standing problem in the Grid research community. Handling large amounts of inaccurate information is disastrous when data quality is very poor. Fortunately, a wide spectrum of applications exhibits strong dependencies among data samples: the readings of nearby sensors are generally correlated, and components are connected through interactions. Such relations can be used to improve the quality of the recorded data. This paper proposes a data-cleaning-oriented Grid monitoring model that models data dependencies with an entity relation graph. We bring an effective data quality preprocessing approach into the Grid application monitoring model, which is critical because many real-world Grid datasets are imperfect: they contain missing, erroneous, and duplicate data, along with other data quality problems.
Keywords :
data analysis; data flow graphs; data mining; data models; entity-relationship modelling; grid computing; monitoring; cyber-infrastructure; data cleaning approach; data dependency modelling; data heterogeneity; data quality assessment; data quality preprocessing approach; e-science application; entity relation graph; grid monitoring model; Information technology; Monitoring; Quality assessment;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Communications and Information Technologies, 2007. ISCIT '07. International Symposium on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-1-4244-0976-1
Electronic_ISBN :
978-1-4244-0977-8
Type :
conf
DOI :
10.1109/ISCIT.2007.4392260
Filename :
4392260