Title :
Semantic approximation of data stream joins
Author :
Das, Abhinandan ; Gehrke, Johannes ; Riedewald, Mirek
Author_Institution :
Dept. of Comput. Sci., Cornell Univ., Ithaca, NY, USA
Abstract :
We consider the problem of approximating sliding window joins over data streams in a data stream processing system with limited resources. In our model, we deal with resource constraints by shedding load in the form of dropping tuples from the data streams. We make two main contributions. First, we define the problem space by discussing architectural models for data stream join processing and surveying suitable measures for the quality of an approximation of a set-valued query result. Second, we examine in detail a large part of this problem space. More precisely, we consider the number of generated result tuples as the quality measure and we propose optimal offline and fast online algorithms for it. In a thorough experimental study with synthetic and real data, we show the efficacy of our solutions.
Keywords :
approximation theory; computational complexity; constraint handling; error analysis; programming language semantics; query processing; relational databases; resource allocation; set theory; architectural model; data stream processing system; dropping tuples; offline algorithm; online algorithm; resource constraints; semantic approximation algorithm; semantic load shedding; set approximation error metrics; set-valued query; sliding window joins approximation; Approximation algorithms; Approximation error; Availability; IP networks; Military computing; Monitoring; Process design; Proposals; Real time systems; Data streams; join processing
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2005.17