DocumentCode
2492407
Title
ParaCube: A Scalable OLAP Model Based on Distributed Aggregate Computing with Sibling Cubes
Author
Zhang, Yansong ; Wang, Shan ; Huang, Wei
Author_Institution
Key Lab. of the Minist. of Educ. for Data Eng. & Knowledge Eng., Beijing, China
fYear
2010
fDate
6-8 April 2010
Firstpage
323
Lastpage
329
Abstract
The demands on OLAP applications grow rapidly with increasing data volume, user counts, query volume and query complexity. The need for shorter update periods in the data warehouse is another crucial factor for a scalable OLAP application. In this paper, we propose a scalable OLAP prototype that supports query processing over growing data volumes by distributing the fact tuples across multiple servers to construct a set of sibling cubes, which can be merged to obtain the whole cube. We employ a lightweight distribution policy that fully duplicates the dimension tables on each sibling server, based on the observation that dimension tables account for a very low proportion of the space cost. An OLAP query with distributed aggregate functions can be transformed into queries that run in parallel on the sibling servers. For non-distributed aggregate functions, such as median, we propose an optimized median computing algorithm that reduces the transmission volume between servers while computing the global median values. We also present a three-level data warehouse framework to meet the requirement of shorter update periods in "operational business intelligence". An asynchronous tunnel model is proposed to reduce update latency by pre-fetching updated tuples to the OLAP processing server. Finally, we build the prototype system ParaCube to evaluate performance on shared-nothing (SN) and multi-core platforms.
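The sibling-cube idea for mergeable aggregates can be illustrated with a minimal Python sketch. It is not the ParaCube implementation: the function names `local_cube` and `merge_cubes`, the `region` and `amount` columns, and the in-memory dictionaries are all assumptions made for illustration. Each server aggregates its own fact partition into partial sums and counts, and a coordinator merges the partial cells into the whole cube. A median cannot be merged this way, which is why the abstract treats it separately.

```python
from collections import defaultdict

def local_cube(fact_rows, group_key, measure="amount"):
    """Aggregate one server's fact partition into partial [sum, count] cells.

    Hypothetical helper: column names are illustrative, not from the paper.
    """
    cells = defaultdict(lambda: [0, 0])
    for row in fact_rows:
        cell = cells[row[group_key]]
        cell[0] += row[measure]  # partial SUM
        cell[1] += 1             # partial COUNT
    return cells

def merge_cubes(cubes):
    """Merge sibling cubes cell by cell; AVG is derived from merged SUM and COUNT."""
    merged = defaultdict(lambda: [0, 0])
    for cube in cubes:
        for key, (total, count) in cube.items():
            merged[key][0] += total
            merged[key][1] += count
    return {
        key: {"sum": total, "count": count, "avg": total / count}
        for key, (total, count) in merged.items()
    }

# Two fact partitions, as if held by two sibling servers (made-up rows).
server_a = [{"region": "east", "amount": 10}, {"region": "west", "amount": 5}]
server_b = [{"region": "east", "amount": 30}]

result = merge_cubes([local_cube(server_a, "region"), local_cube(server_b, "region")])
```

Only the partial cells travel between servers here, so the transfer cost depends on the number of cube cells rather than the number of fact tuples.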
Keywords
data mining; data warehouses; distributed processing; query processing; ParaCube; asynchronous tunnel model; data volume; data warehouse; distributed aggregate computing; multicore platforms; nondistributed computing aggregate functions; operational business intelligence; optimized median aggregate computing algorithm; query complexity; query processing; query volume; scalable OLAP model; shared-nothing system; sibling cubes; Acceleration; Aggregates; Application software; Concurrent computing; Data warehouses; Distributed computing; Material storage; Merging; Prototypes; Query processing; ParaCube; distributed aggregate; median; sibling cube;
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Conference (APWEB), 2010 12th International Asia-Pacific
Conference_Location
Busan
Print_ISBN
978-1-7695-4012-2
Electronic_ISBN
978-1-4244-6600-9
Type
conf
DOI
10.1109/APWeb.2010.31
Filename
5474121