DocumentCode
2266955
Title
Parallel Aggregation Queries over Star Schema: A Hierarchical Encoding Scheme and Efficient Percentile Computing as a Case
Author
Qin, Xiongpai ; Wang, Huiju ; Du, Xiaoyong ; Wang, Shan
Author_Institution
Sch. of Inf., Renmin Univ. of China, Beijing, China
fYear
2011
fDate
26-28 May 2011
Firstpage
329
Lastpage
334
Abstract
Big data analysis has recently become a major challenge. Cloud computing is attracting more and more big data analysis applications because of its good scalability and fault tolerance. Some aggregation functions, such as SUM, can be computed in parallel because they satisfy the distributive law of addition. Unfortunately, some statistical functions are not naturally parallelizable; that is, they do not satisfy the distributive law of addition. In this paper, we focus on the percentile computing problem. We propose an iterative, prediction-based parallel algorithm for a distributed system. Prediction is done through a sampling technique. Experimental results verify the efficiency of our algorithm.
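The contrast the abstract draws between SUM and percentile functions can be illustrated with a small sketch. This is not the authors' algorithm; it only shows, on made-up values, that per-partition sums combine into the global sum while per-partition medians (the 50th percentile) do not combine into the global median. The `partitions` data and all variable names are assumptions chosen for illustration.

```python
from statistics import median

# Hypothetical data split across three workers (illustrative values only).
partitions = [[1, 2, 3], [4, 5, 100], [6, 7, 8, 9]]
all_values = [v for part in partitions for v in part]

# SUM is distributive: adding the per-partition sums gives the global sum.
partial_sums = [sum(part) for part in partitions]
sum_matches = sum(partial_sums) == sum(all_values)

# The median (50th percentile) is not: the median of the per-partition
# medians generally differs from the median of the full data set.
partial_medians = [median(part) for part in partitions]
median_of_medians = median(partial_medians)
global_median = median(all_values)

print("sum matches:", sum_matches)
print("median of medians:", median_of_medians)
print("global median:", global_median)
```

Because the two median values disagree, a percentile cannot be obtained by simply merging local results, which is why a different strategy (in the paper, an iterative prediction-based one) is needed.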
Keywords
cloud computing; data analysis; fault tolerant computing; parallel algorithms; query processing; sampling methods; big data analysis; cloud computing; distributed system; efficient percentile computing; fault-tolerance; hierarchical encoding scheme; iterative-style prediction-based parallel algorithm; parallel aggregation queries; sampling technique; star schema; statistical functions; Algorithm design and analysis; Convergence; Encoding; Histograms; Indexes; Prediction algorithms; Query processing; Hierarchical Encoding; Iterative; Percentile;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing with Applications (ISPA), 2011 IEEE 9th International Symposium on
Conference_Location
Busan
Print_ISBN
978-1-4577-0391-1
Electronic_ISBN
978-0-7695-4428-1
Type
conf
DOI
10.1109/ISPA.2011.34
Filename
5951927