• DocumentCode
    1186180
  • Title

    Filtering Data Streams for Entity-Based Continuous Queries

  • Author

    Cheng, Reynold ; Kao, Ben C M ; Kwan, Alan ; Prabhakar, Sunil ; Tu, Yi-Cheng

  • Author_Institution
    Univ. of Hong Kong, Hong Kong, China
  • Volume
    22
  • Issue
    2
  • fYear
    2010
  • Firstpage
    234
  • Lastpage
    248
  • Abstract
    The idea of allowing query users to relax their correctness requirements in order to improve the performance of a data stream management system (e.g., location-based services and sensor networks) has been recently studied. By exploiting the maximum error (or tolerance) allowed in query answers, algorithms for reducing the use of system resources have been developed. In most of these works, however, query tolerance is expressed as a numerical value, which may be difficult to specify. We observe that in many situations, users may not be concerned with the actual value of an answer, but rather with which object satisfies a query (e.g., "who is my nearest neighbor?"). In particular, an entity-based query returns only the names of objects that satisfy the query. For these queries, it is possible to specify a tolerance that is "nonvalue-based." In this paper, we study fraction-based tolerance, a type of nonvalue-based tolerance, where a user specifies the maximum fractions of a query answer that can be false positives and false negatives. We develop fraction-based tolerance for two major classes of entity-based queries: 1) nonrank-based queries (e.g., range queries) and 2) rank-based queries (e.g., k-nearest-neighbor queries). These definitions provide users with an alternative way to specify the maximum tolerance allowed in their answers. We further investigate how these definitions can be exploited in a distributed stream environment. We design adaptive filter algorithms that allow updates to be dropped conditionally at the data stream sources without affecting the overall query correctness. Extensive experimental results show that our protocols reduce the use of network and energy resources significantly.
  • Keywords
    adaptive filters; information filtering; query processing; adaptive filter algorithms; data stream filtering; data stream management system; distributed stream environment; energy resources; entity-based continuous queries; fraction-based tolerance; network resources; nonrank-based query; query tolerance; rank-based query; data streams; continuous queries
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2009.63
  • Filename
    4798165