DocumentCode :
1362401
Title :
Similarity Join Processing on Uncertain Data Streams
Author :
Lian, Xiang ; Chen, Lei
Author_Institution :
Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Kowloon, China
Volume :
23
Issue :
11
fYear :
2011
Firstpage :
1718
Lastpage :
1734
Abstract :
Similarity join processing in the streaming environment has many practical applications, such as sensor networks and object tracking and monitoring. Previous work usually assumes that stream processing is conducted over precise data. In this paper, we study the important problem of similarity join processing on stream data that inherently contain uncertainty (called uncertain data streams), where the incoming data at each time stamp are uncertain and imprecise. Specifically, we formalize this problem as join on uncertain data streams (USJ), which can guarantee the accuracy of USJ answers over uncertain data. To tackle the challenges of efficiency and effectiveness, such as limited memory and the need for short response times, we propose effective pruning methods on both the object and sample levels to filter out false alarms. We integrate the proposed pruning methods into an efficient query procedure that can incrementally maintain the USJ answers. Most importantly, we further design a novel strategy, namely adaptive superset prejoin (ASP), to maintain a superset of USJ candidate pairs. ASP is designed in light of our proposed formal cost model so that the average USJ processing cost is minimized. We have conducted extensive experiments to demonstrate the efficiency and effectiveness of our proposed approaches.
Keywords :
data analysis; query processing; uncertainty handling; ASP; adaptive superset prejoin; formal cost model; pruning methods; query procedure; similarity join processing; time stamp; uncertain data streams; Accuracy; Databases; Global Positioning System; Monitoring; Probabilistic logic; Temperature sensors; Uncertainty; Join on uncertain data streams; adaptive superset prejoin
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2010.208
Filename :
5611520