• DocumentCode
    3012693
  • Title

    Duality-based subsequence matching in time-series databases

  • Author

    Moon, Yang-Sae ; Whang, Kyu-Young ; Loh, Woong-Kee

  • Author_Institution
    Dept. of Comput. Sci., Korea Adv. Inst. of Sci. & Technol., Taejon, South Korea
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    263
  • Lastpage
    272
  • Abstract
    The authors propose a subsequence matching method, Dual Match, which exploits duality in constructing windows and significantly improves performance. Dual Match divides data sequences into disjoint windows and the query sequence into sliding windows, and is thus the dual of the approach by C. Faloutsos et al. (1994), referred to here as FRM, which divides data sequences into sliding windows and the query sequence into disjoint windows. We formally prove that our dual approach is correct, i.e., it incurs no false dismissal. We also prove that, given the minimum query length, there is an upper bound on the window size that guarantees the correctness of Dual Match, and we discuss the effect of the window size on performance. FRM causes many false alarms because, to avoid the excessive storage space the index would otherwise require, it stores minimum bounding rectangles rather than the individual points that represent windows. Dual Match solves this problem by storing the points directly, without incurring excessive storage overhead. Experimental results show that, in most cases, Dual Match provides a large improvement over FRM in both false alarms and performance, given the same amount of storage space. In particular, for low selectivities (less than 10^-4), Dual Match improves performance by up to 430-fold. On the other hand, for high selectivities (more than 10^-2), it shows a very minor degradation (less than 29%). For selectivities in between (10^-4 to 10^-2), Dual Match performs slightly better than FRM. Dual Match is also 4.10 to 25.6 times faster than FRM in building indexes of approximately the same size. Overall, these results indicate that our approach provides a new paradigm in subsequence matching that improves performance significantly in large database applications.
  • Keywords
    pattern matching; query processing; sequences; temporal databases; time series; Dual Match; FRM; data sequences; disjoint windows; duality based subsequence matching; false dismissal; large database applications; maximum bound; minimum bounding rectangles; minimum query length; query sequence; sliding windows; subsequence matching method; time series databases; window size; Biomedical measurements; Computer science; Data mining; Databases; Degradation; Euclidean distance; Exchange rates; Information technology; Moon; Tin;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2001. Proceedings. 17th International Conference on
  • Conference_Location
    Heidelberg
  • ISSN
    1063-6382
  • Print_ISBN
    0-7695-1001-9
  • Type

    conf

  • DOI
    10.1109/ICDE.2001.914837
  • Filename
    914837
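
  • Illustration
    The windowing scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it compares raw windows by brute force, with no index and no dimensionality reduction, and it uses the full tolerance eps as the per-window threshold. The function names (max_window_size, dual_match_candidates, dual_match, naive_scan) are hypothetical. The window-size bound rests on the observation that a subsequence of length L always contains a whole disjoint window of size w when L >= 2w - 1.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def max_window_size(min_query_len):
    """Largest w such that any subsequence of length min_query_len
    fully contains at least one disjoint window (needs L >= 2w - 1)."""
    return (min_query_len + 1) // 2


def dual_match_candidates(data, query, w, eps):
    """Candidate offsets found by comparing disjoint data windows
    against sliding query windows (brute force, no index)."""
    n, m = len(data), len(query)
    if 2 * w - 1 > m:
        raise ValueError("window too large for this query length")
    candidates = set()
    for k in range(n // w):                  # disjoint windows of the data
        start = k * w
        data_win = data[start:start + w]
        for j in range(m - w + 1):           # sliding windows of the query
            if euclidean(data_win, query[j:j + w]) <= eps:
                offset = start - j           # implied start of the match
                if 0 <= offset <= n - m:
                    candidates.add(offset)
    return candidates


def dual_match(data, query, w, eps):
    """Filter candidates with the full distance to drop false alarms."""
    m = len(query)
    return sorted(o for o in dual_match_candidates(data, query, w, eps)
                  if euclidean(data[o:o + m], query) <= eps)


def naive_scan(data, query, eps):
    """Reference answer: test every offset directly."""
    m = len(query)
    return [o for o in range(len(data) - m + 1)
            if euclidean(data[o:o + m], query) <= eps]
```

    Because the distance between two aligned windows never exceeds the distance between the full sequences that contain them, every true match produces at least one candidate, so in this sketch dual_match returns the same offsets as naive_scan.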