DocumentCode
651743
Title
Performance Tuning in Distributed Processing of ETL
Author
Ping Yang ; Zaiying Liu ; Jun Ni
Author_Institution
Sch. of Inf. Sci. & Technol., Shanghai Sanda Univ., Shanghai, China
fYear
2013
fDate
20-22 Sept. 2013
Firstpage
85
Lastpage
88
Abstract
Extract, transform, and load (ETL) is a common and important technology for building data warehouses, including those that support business intelligence. When a very complex SQL query is issued to acquire data from a transaction system and move it into a data warehouse, it involves many operations, including table joins, sorting, and aggregation. These operations require substantial retrieval work and transfer large volumes of data from the tables. Such intensive querying often causes performance problems and commonly has a negative impact on data instance resources. Improving ETL performance has therefore become both critical and challenging. This paper presents a parallel processing solution that splits a large, complex SQL query into small pieces that are processed in a distributed computing manner. The proposed method aims to reduce the cost of computation while ensuring data integrity among the joined tables. The idea can be verified through selected performance-tuning test beds.
Keywords
SQL; competitive intelligence; data warehouses; distributed processing; query processing; ETL; SQL query; business intelligence; data warehouse; distributed computing; distributed processing; extract transform and load; performance tuning; transaction system; Customer relationship management; Data mining; Data warehouses; Educational institutions; Transforms; Tuning; Capture data changes; Data extraction; ETL; Performance tuning; load;
fLanguage
English
Publisher
IEEE
Conference_Titel
Internet Computing for Engineering and Science (ICICSE), 2013 Seventh International Conference on
Conference_Location
Shanghai
Type
conf
DOI
10.1109/ICICSE.2013.24
Filename
6680060