DocumentCode
2262765
Title
Improving grid monitoring with data quality assessment
Author
Liu, Wei ; Luo, Tiejian ; Song, Jinliang ; Chen, Su
Author_Institution
Graduate Univ. of Chinese Acad. of Sci., Beijing
fYear
2007
fDate
17-19 Oct. 2007
Firstpage
1534
Lastpage
1539
Abstract
As the Grid emerges as a cyber-infrastructure for the next generation of e-Science applications, monitoring the Grid becomes a significant task. A typical Grid application is composed of a large number of resources that can fail, including network, hardware, and software components. Even when monitoring information from all these components is accessible, it is hard to determine whether anomalies and failures during execution are related to a particular job. In practice, however, receiving intermediate results and interacting with applications play a key role for users. Given the complexity of implementation and the large scope that a monitoring system covers, incomplete and duplicate data are bound to appear in many applications. Overcoming data heterogeneity is a long-standing problem in the Grid research community. Handling a large amount of inaccurate information is disastrous when the quality of the data is very poor. Fortunately, a wide spectrum of applications exhibit strong dependencies among data samples: the readings of nearby sensors are generally correlated, and components are connected through their interactions. Such relations can be used to improve the quality of the recorded data. This paper proposes a Grid monitoring model oriented toward data cleaning, which models data dependencies using an entity relation graph. We bring an effective data quality preprocessing approach into the Grid application monitoring model, which is critical because many real-world Grid datasets are imperfect: they contain missing, erroneous, and duplicate data, among other data quality problems.
Keywords
data analysis; data flow graphs; data mining; data models; entity-relationship modelling; grid computing; monitoring; cyber-infrastructure; data cleaning approach; data dependency modelling; data heterogeneity; data quality assessment; data quality preprocessing approach; e-science application; entity relation graph; grid monitoring model; Information technology; Monitoring; Quality assessment;
fLanguage
English
Publisher
IEEE
Conference_Titel
Communications and Information Technologies, 2007. ISCIT '07. International Symposium on
Conference_Location
Sydney, NSW
Print_ISBN
978-1-4244-0976-1
Electronic_ISBN
978-1-4244-0977-8
Type
conf
DOI
10.1109/ISCIT.2007.4392260
Filename
4392260