DocumentCode :
651743
Title :
Performance Tuning in Distributed Processing of ETL
Author :
Ping Yang ; Zaiying Liu ; Jun Ni
Author_Institution :
Sch. of Inf. Sci. & Technol., Shanghai Sanda Univ., Shanghai, China
fYear :
2013
fDate :
20-22 Sept. 2013
Firstpage :
85
Lastpage :
88
Abstract :
Extract, transform, and load (ETL) is a common and important technology for building data warehouses, including those used for business intelligence. When a very complex SQL query is issued to acquire data from a transaction system into a data warehouse, it involves many operations, including table joins, sorting, and aggregation. Such operations require significant retrieval steps and large data transfers from tables. This intensive querying very often causes performance problems. Moreover, it commonly has a negative impact on data instance resources. Improving ETL performance therefore becomes critical and challenging. This paper presents a parallel processing solution that splits a big, complex SQL query into small pieces in a distributed computing manner. The proposed method aims to reduce the cost of computation while ensuring data integrity among joined tables. The innovative idea can be verified through selected test-beds of performance tuning.
Keywords :
SQL; competitive intelligence; data warehouses; distributed processing; query processing; ETL; SQL query; business intelligence; data warehouse; distributed computing; distributed processing; extract transform and load; performance tuning; transaction system; Customer relationship management; Data mining; Data warehouses; Educational institutions; Transforms; Tuning; Capture data changes; Data extraction; ETL; Performance tuning; load;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Internet Computing for Engineering and Science (ICICSE), 2013 Seventh International Conference on
Conference_Location :
Shanghai
Type :
Conference paper
DOI :
10.1109/ICICSE.2013.24
Filename :
6680060