Title :
A knowledge-based approach for quality-aware ETL process
Author :
Imen Hamed;Faiza Ghozzi
Author_Institution :
Higher Institute of Computing and Multimedia, MIR@CL Laboratory, University of Sfax, Tunisia
Abstract :
ETL (Extract, Transform, and Load) processes are a focal component of data warehousing projects. They supply the warehouse with the necessary integrated and reconciled data. However, they are the first to be blamed when wrong business decisions are made, since they can deliver incorrect or misleading data. A correct design of this process is therefore required at the early stages of a data warehouse (DW) project. This calls for specific knowledge on how to design an ETL process that delivers good-quality data. One way to achieve this is to provide the ETL worker (designer, monitor, developer) with the necessary knowledge. Accordingly, we propose to anticipate the exceptions most likely to occur during the ETL process and then to resolve them. To that end, we provide a set of best practices and methodologies, modeled as knowledge, for the benefit of the ETL worker throughout the process lifecycle. Finally, we implement a prototype as an initial validation of this approach.
Keywords :
"Unified modeling language","Business","Data mining","Data warehouses","Process modeling","Load modeling","Process control"
Conference_Titel :
Information Systems and Economic Intelligence (SIIE), 2015 6th International Conference on
DOI :
10.1109/ISEI.2015.7358731