DocumentCode
2011609
Title
Efficient, Chunk-Replicated Node Partitioned Data Warehouses
Author
Furtado, Pedro
fYear
2008
fDate
10-12 Dec. 2008
Firstpage
578
Lastpage
583
Abstract
Much has been said about efficiently processing data in parallel database servers, and some data warehouse applications must efficiently process on the order of tens to hundreds of gigabytes. Yet there is no effective approach targeted at using non-dedicated low-cost platforms efficiently in this context. Imagine putting together 10 or 1000 commodity PCs and setting up a data-crunching platform for large database-resident data with acceptable performance. There are significant interrelated data layout and processing challenges when the computational, storage and network hardware is heterogeneous and slow. We propose how to place, replicate and load-balance the data efficiently in this context. This work innovates in several respects: it is practically as fast as full mirroring without its overhead, and it explores schema, chunk-wise placement, replication and load-balanced processing to be faster and more flexible than previous efforts. Our findings are complemented by an evaluation using TPC-H performance benchmark queries.
Keywords
data warehouses; parallel databases; TPC-H performance benchmark queries; chunk-replicated node partitioned data warehouses; data layout; data processing; load-balanced processing; parallel database servers; Computer networks; Data warehouses; Distributed databases; Distributed processing; Hardware; Image databases; Parallel processing; Personal communication networks; Relational databases; Switches; performance
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing with Applications, 2008. ISPA '08. International Symposium on
Conference_Location
Sydney, NSW
Print_ISBN
978-0-7695-3471-8
Type
conf
DOI
10.1109/ISPA.2008.86
Filename
4725197