DocumentCode :
2011609
Title :
Efficient, Chunk-Replicated Node Partitioned Data Warehouses
Author :
Furtado, Pedro
fYear :
2008
fDate :
10-12 Dec. 2008
Firstpage :
578
Lastpage :
583
Abstract :
Much has been said about processing data efficiently in parallel database servers, and some data warehouse applications must efficiently process on the order of tens to hundreds of gigabytes. Yet there is no effective approach for using non-dedicated, low-cost platforms efficiently in this context. Imagine bringing together 10 or 1000 commodity PCs and setting up a data-crunching platform for large database-resident data with acceptable performance. There are significant interrelated data layout and processing challenges when the computational, storage and network hardware is heterogeneous and slow. We propose how to place, replicate and load-balance the data efficiently in this context. This work innovates in several respects: it is practically as fast as full mirroring without its overhead, and it explores schema, chunk-wise placement, replication and load-balanced processing to be faster and more flexible than previous efforts. Our findings are complemented by an evaluation using TPC-H performance benchmark queries.
Keywords :
data warehouses; parallel databases; TPC-H performance benchmark queries; chunk-replicated node partitioned data warehouses; data layout; data processing; load-balanced processing; parallel database servers; Computer networks; Distributed databases; Distributed processing; Hardware; Image databases; Parallel processing; Personal communication networks; Relational databases; Switches; performance
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing with Applications, 2008. ISPA '08. International Symposium on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-0-7695-3471-8
Type :
conf
DOI :
10.1109/ISPA.2008.86
Filename :
4725197