DocumentCode :
3525851
Title :
An Evolutionary Algorithm for Column Family Schema Optimization in HBase
Author :
Fangzhou Yang; Jian Cao; Dragan Milosevic
Author_Institution :
Dept. of CSE, Shanghai Jiaotong Univ., Shanghai, China
fYear :
2015
fDate :
March 30 2015-April 2 2015
Firstpage :
439
Lastpage :
445
Abstract :
Apache HBase is a column-oriented NoSQL key-value store built on top of the Hadoop Distributed File System. Logically, columns in HBase are grouped into column families; physically, all columns in one column family are stored in the same set of files. The division of columns into families is therefore closely related to the response time of a given row query. In this paper, a new evolutionary algorithm is designed and applied to find the optimal column family schema for a given set of user queries. The read performance of the optimized column family schema is evaluated on a real dataset provided by ZANOX AG, which contains 2.6 million rows of aggregated tracking data and 1.3 million user queries. The optimized schema improves the read performance of HBase with statistical significance: for user queries from a test set, the average response time is reduced by up to 72% compared with unoptimized column family schemas.
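The abstract does not describe the algorithm's encoding, operators, or fitness function, so the following is only a minimal sketch of how an evolutionary search over column family schemas could look, not the authors' method. Everything in it is an assumption made for illustration: the cost model (each query reads every touched family in full, plus a fixed per-family overhead `SEEK_COST`), the integer-list encoding, the single-column mutation, and the names `schema_cost`, `mutate`, and `evolve`.

```python
import random

# Hypothetical fixed overhead for each column family a query has to open.
SEEK_COST = 5


def schema_cost(assignment, queries, col_sizes):
    """Assumed cost model: a query reads every family it touches in full.

    assignment[i] is the family id of column i; queries is a list of
    sets of column indices; col_sizes[i] is the size of column i.
    """
    family_size = {}
    for col, fam in enumerate(assignment):
        family_size[fam] = family_size.get(fam, 0) + col_sizes[col]
    total = 0
    for query in queries:
        touched = {assignment[col] for col in query}
        total += sum(SEEK_COST + family_size[fam] for fam in touched)
    return total


def mutate(assignment, rng):
    """Move one randomly chosen column to a randomly chosen family."""
    child = list(assignment)
    col = rng.randrange(len(child))
    child[col] = rng.randrange(len(child))
    return child


def evolve(queries, col_sizes, generations=200, pop_size=20, seed=0):
    """Elitist evolutionary search over column-to-family assignments."""
    rng = random.Random(seed)
    n = len(col_sizes)
    population = [[rng.randrange(n) for _ in range(n)] for _ in range(pop_size)]
    population[0] = [0] * n  # baseline: all columns in a single family
    for _ in range(generations):
        population.sort(key=lambda a: schema_cost(a, queries, col_sizes))
        survivors = population[: pop_size // 2]  # keep the better half
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    return min(population, key=lambda a: schema_cost(a, queries, col_sizes))


# Toy workload: columns 0 and 1 are read together, as are columns 2 and 3.
col_sizes = [10, 10, 10, 10]
queries = [{0, 1}, {0, 1}, {2, 3}]
best = evolve(queries, col_sizes)
baseline_cost = schema_cost([0] * len(col_sizes), queries, col_sizes)
best_cost = schema_cost(best, queries, col_sizes)
```

Because the single-family baseline is seeded into the initial population and the better half always survives, `best_cost` can never exceed `baseline_cost` under this toy model. A realistic fitness function would instead measure or estimate actual HBase read latency for the query workload.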
Keywords :
SQL; data handling; evolutionary computation; parallel processing; query processing; statistical analysis; Apache HBase; Hadoop distributed file-system; ZANOX AG; column family schema optimization; column-oriented NoSQL key-value store; evolutionary algorithm; statistical significance; user queries; Algorithm design and analysis; Big data; Conferences; Evolutionary computation; Genetic algorithms; Layout; Optimization; Column Family; Column Layout; Evolutionary Algorithm; HBase; NoSQL; Schema Optimization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data Computing Service and Applications (BigDataService), 2015 IEEE First International Conference on
Conference_Location :
Redwood City, CA
Type :
conf
DOI :
10.1109/BigDataService.2015.20
Filename :
7184913