• DocumentCode
    1191332
  • Title
    Duplicate-Insensitive Order Statistics Computation over Data Streams
  • Author
    Zhang, Ying ; Lin, Xuemin ; Yuan, Yidong ; Kitsuregawa, Masaru ; Zhou, Xiaofang ; Yu, Jeffrey Xu
  • Author_Institution
    Sch. of Comput. Sci. & Eng., Univ. of New South Wales, Sydney, NSW, Australia
  • Volume
    22
  • Issue
    4
  • fYear
    2010
  • fDate
    4/1/2010 12:00:00 AM
  • Firstpage
    493
  • Lastpage
    507
  • Abstract
    Duplicates in data streams are often introduced by projection onto a subspace and/or by multiple recordings of the same objects. Without the uniqueness assumption on observed data elements, many conventional aggregate computation problems need to be further investigated because of their duplicate-sensitive nature. In this paper, we present novel, space-efficient, one-scan algorithms to continuously maintain duplicate-insensitive order sketches so that rank-based queries can be approximately processed with a relative rank error guarantee ε in the presence of data duplicates. Besides being space-efficient, the proposed algorithms are time-efficient and highly accurate. Moreover, our techniques may be immediately applied to the heavy hitter problem over distinct elements and to existing fault-tolerant distributed communication techniques. A comprehensive performance study demonstrates that our algorithms can support real-time computation over high-speed data streams.
  • Keywords
    query processing; real-time systems; software fault tolerance; statistical analysis; data streams; duplicate-insensitive order statistics computation; fault-tolerant distributed communication techniques; high-speed data streams; one-scan algorithms; real-time computation; space-efficient algorithms; Order statistic; data stream; duplicate insensitive; relative error
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2009.68
  • Filename
    4799782