• DocumentCode
    610374
  • Title
    AFFINITY: Efficiently querying statistical measures on time-series data
  • Author
    Sathe, Saket ; Aberer, Karl
  • Author_Institution
    EPFL, Lausanne, Switzerland
  • fYear
    2013
  • fDate
    8-12 April 2013
  • Firstpage
    841
  • Lastpage
    852
  • Abstract
    Computing statistical measures for large databases of time series is a fundamental primitive for querying and mining time-series data [1]-[6]. This primitive is gaining importance with the increasing number and rapid growth of time-series databases. In this paper, we introduce a framework for efficient computation of statistical measures by exploiting the concept of affine relationships. Affine relationships can be used to infer statistical measures for time series from other related time series instead of computing them directly, thus significantly reducing the overall computational cost. The resulting methods exhibit at least an order-of-magnitude improvement over the best known methods. To the best of our knowledge, this is the first work that presents a unified approach for computing and querying several statistical measures at once. Our approach exploits affine relationships using three key components. First, the AFCLST algorithm clusters the time-series data so that high-quality affine relationships can be easily found. Second, the SYMEX algorithm uses the clustered time series and efficiently computes the desired affine relationships. Third, the SCAPE index structure produces a many-fold improvement in the performance of processing several statistical queries by seamlessly indexing the affine relationships. Finally, we establish the effectiveness of our approaches through a comprehensive experimental evaluation on real datasets.
  • Keywords
    data mining; database management systems; pattern clustering; query processing; time series; AFCLST algorithm; SCAPE index structure; SYMEX algorithm; affine relationship concept; data clustering; statistical measure query; time series database; time-series data; Clustering algorithms; Correlation; Covariance matrices; Indexes; Measurement; Time series analysis; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2013 IEEE 29th International Conference on
  • Conference_Location
    Brisbane, QLD
  • ISSN
    1063-6382
  • Print_ISBN
    978-1-4673-4909-3
  • Electronic_ISBN
    1063-6382
  • Type
    conf
  • DOI
    10.1109/ICDE.2013.6544879
  • Filename
    6544879