Title :
Scalable Query Optimization for Efficient Data Processing Using MapReduce
Author :
Yi Shan ; Yi Chen
Author_Institution :
Sch. of Comput., Inf. & Decision Syst. Eng., Arizona State Univ., Tempe, AZ, USA
Abstract :
MapReduce is widely acknowledged by both industry and academia as an effective programming model for query processing on big data. It is therefore crucial to design an optimizer that finds the most efficient way to execute an SQL query using MapReduce. However, existing work in parallel query processing either falls short of optimizing an SQL query for MapReduce or relies on an optimizer whose time complexity is exponential. In addition, industry solutions such as HIVE and YSmart do not optimize the join sequence of an SQL query and therefore cannot guarantee an optimal execution plan. In this paper, we propose SOSQL, a scalable optimizer for SQL queries using MapReduce. Experiments performed on Google Cloud Platform confirm the scalability and efficiency of SOSQL compared with existing work.
Keywords :
Big Data; SQL; parallel processing; query processing; Google cloud platform; MapReduce; SOSQL; SQL query; data processing; parallel query processing; programming model; scalable optimizer; scalable query optimization; Google; Industries; Optimization; Partitioning algorithms; Time complexity
Conference_Title :
Big Data (BigData Congress), 2015 IEEE International Congress on
Conference_Location :
New York, NY
Print_ISBN :
978-1-4673-7277-0
DOI :
10.1109/BigDataCongress.2015.100