Title :
SINCA: Scalable in-memory event aggregation using clustered operators
Author :
Behera, Mahesh Kumar ; Kalyan, S. ; Venkatesh, Prasanna ; Wolski, Antoni
Author_Institution :
Huawei Technologies India Private Ltd., Bangalore, India
Abstract :
Analytical processing of the information generated by social media operations requires queries that group and aggregate large volumes of detail data. Any advanced query processing method should take into account two dominant hardware trends: increasing main memory capacities and increasing parallel processing capacity, exposed as a growing number of cores per processor chip. We introduce a scalable in-memory method for data aggregation (SINCA), using clustered operators, which profits from these hardware trends. The method uses the concept of a microengine, a set of resources that can be utilized in parallel with great efficiency. The resulting parallelized aggregation algorithm is characterized by low overhead and high volume, and is suitable for both real-time and extract-transform-load scenarios. The core idea of the method is to use real-time histograms to partition the data for grouping. Because the data is already grouped during the partitioning phase, the group aggregation can be done very efficiently. Additionally, some of the grouped data can be cached for reuse in subsequent queries.
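The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general histogram-then-partition idea, not SINCA itself. It assumes events arrive as (key, value) pairs, that partitions are chosen by hashing the key, and that the aggregate is a sum; the function name `partitioned_group_sum` and the parameter `num_partitions` are hypothetical. The microengine concept and the caching of grouped data are not modeled.

```python
from collections import defaultdict
from itertools import accumulate


def partitioned_group_sum(events, num_partitions=4):
    """Group-and-sum (key, value) events via histogram-based partitioning.

    Illustrative only: hash partitioning and summation are assumptions,
    not details taken from the paper.
    """
    # Pass 1: assign each event to a partition and count rows per partition.
    part_of = [hash(key) % num_partitions for key, _ in events]
    histogram = [0] * num_partitions
    for p in part_of:
        histogram[p] += 1

    # Exclusive prefix sum of the histogram gives each partition's start offset.
    offsets = [0] + list(accumulate(histogram))[:-1]

    # Pass 2: scatter events into one contiguous buffer, partition by partition.
    buffer = [None] * len(events)
    cursor = offsets[:]
    for event, p in zip(events, part_of):
        buffer[cursor[p]] = event
        cursor[p] += 1

    # Pass 3: aggregate each partition on its own. All rows for a given key
    # live in exactly one partition, so partitions never share a group and
    # this loop could be handed to independent workers.
    result = {}
    for p in range(num_partitions):
        local = defaultdict(int)
        for key, value in buffer[offsets[p]:offsets[p] + histogram[p]]:
            local[key] += value
        result.update(local)
    return result
```

Because every key maps to exactly one partition, the per-partition results can be merged without resolving conflicts, which is the property that makes the final aggregation step cheap in this kind of scheme.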
Keywords :
data handling; parallel processing; query processing; real-time systems; social networking (online); SINCA; analytical processing; clustered operators; data aggregation; extract-transform-load scenarios; scalable in-memory event aggregation; social media; Aggregates; Clustering algorithms; Hardware; Instruction sets; Market research; Partitioning algorithms; Sockets
Conference_Title :
2015 31st IEEE International Conference on Data Engineering Workshops (ICDEW)
Conference_Location :
Seoul
DOI :
10.1109/ICDEW.2015.7129578