DocumentCode
2875158
Title
Group-by Query Process in Middleware of Large Scale Data Intensive Systems
Author
Song Huaiming ; An Mingyuan ; Wang Yang ; Wang Weiping ; Sun Ninghui
Author_Institution
Key Lab. of Comput. Syst. & Archit., Chinese Acad. of Sci., Beijing, China
fYear
2009
fDate
9-11 July 2009
Firstpage
82
Lastpage
89
Abstract
Large-scale data-intensive systems have become available in many fields in recent years, and group-by queries over large volumes of data in a cluster based on a shared-nothing architecture pose a severe challenge. This paper proposes a design of a parallel query engine (PQE) and its asynchronous improvement (APQE) for group-by queries. PQE and APQE support pipelined query processing and exploit the maximum degree of pipeline parallelism. APQE further eliminates the synchronization overhead of multi-node parallelism and returns part of the final result as early as possible if no data dependency exists. Experimental results demonstrate that, compared to the previous 2-step query engine, PQE and APQE achieve a significant performance improvement for group-by queries over large data sets in a shared-nothing cluster system, as well as clearly better scalability.
Keywords
middleware; pipeline processing; query processing; asynchronous improvement; cluster system; group-by query process; large scale data intensive system; middleware; multinode parallelism; parallel query engine; pipeline parallelism; pipelined query processing; synchronous overhead; Computer architecture; Databases; Design optimization; Engines; Large-scale systems; Middleware; Parallel processing; Pipeline processing; Query processing; Scalability; asynchronous pipeline; group-by query; parallel query engine; result merge;
fLanguage
English
Publisher
IEEE
Conference_Titel
Networking, Architecture, and Storage, 2009. NAS 2009. IEEE International Conference on
Conference_Location
Hunan
Print_ISBN
978-0-7695-3741-2
Type
conf
DOI
10.1109/NAS.2009.19
Filename
5197303