Regional Information Center for Science and Technology - AJIRA: A Lightweight Distributed Middleware for MapReduce and Stream Processing

DocumentCode :

1796848

Title :

AJIRA: A Lightweight Distributed Middleware for MapReduce and Stream Processing

Author :

Urbani, Jacopo ; Margara, Alessandro ; Jacobs, Ceriel ; Voulgaris, Spyros ; Bal, Henri

Author_Institution :

Dept. of Comput. Sci., Vrije Univ. Amsterdam, Amsterdam, Netherlands

fYear :

2014

fDate :

June 30 2014-July 3 2014

Firstpage :

545

Lastpage :

554

Abstract :

Currently, MapReduce is the most popular programming model for large-scale data processing, and this has motivated the research community to improve its efficiency with new extensions, algorithmic optimizations, or hardware. In this paper we address two main limitations of MapReduce. One relates to the model's limited expressiveness, which prevents the implementation of complex programs that require multiple steps or iterations. The other relates to the efficiency of its most popular implementations (e.g., Hadoop), which provide good resource utilization only for massive volumes of input and operate suboptimally for smaller or rapidly changing input. To address these limitations, we present AJIRA, a new middleware designed for efficient and generic data processing. At a conceptual level, AJIRA replaces the traditional map/reduce primitives with generic operators that can be dynamically allocated, allowing the execution of more complex batch and stream processing jobs. At a more technical level, AJIRA adopts a distributed, multi-threaded architecture that strives to minimize overhead for non-critical functionality. These characteristics allow AJIRA to be used as a single programming model for both batch and stream processing. To this end, we evaluated its performance against Hadoop, Spark, Esper, and Storm, which are state-of-the-art systems for batch and stream processing. Our evaluation shows that AJIRA is competitive in a wide range of scenarios in terms of both processing time and scalability, making it an ideal choice where flexibility, extensibility, and the processing of both large and dynamic data with a single programming model are desirable or even mandatory requirements.

Keywords :

middleware; multi-threading; resource allocation; AJIRA; Esper; Hadoop; MapReduce; Spark; Storm; algorithmic optimizations; complex batch; dynamic data; generic data processing; generic operators; lightweight distributed middleware; multithreaded architecture; noncritical functionality; resource utilization; single programming model; stream processing jobs; Computational modeling; Computer architecture; Data models; Fault tolerance; Fault tolerant systems; Programming; Scalability;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Distributed Computing Systems (ICDCS), 2014 IEEE 34th International Conference on

Conference_Location :

Madrid

ISSN :

1063-6927

Print_ISBN :

978-1-4799-5168-0

Type :

conf

DOI :

10.1109/ICDCS.2014.62

Filename :

6888930

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1796848