• DocumentCode
    1960424
  • Title
    Efficient searches for similar subsequences of different lengths in sequence databases
  • Author
    Park, Sanghyun ; Chu, Wesley W. ; Yoon, Jeehee ; Hsu, Chihcheng
  • Author_Institution
    California Univ., Los Angeles, CA, USA
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    23
  • Lastpage
    32
  • Abstract
    We propose an indexing technique for fast retrieval of similar subsequences using time warping distances. A time warping distance is a more suitable similarity measure than the Euclidean distance in many applications where sequences may have different lengths or different sampling rates. Our indexing technique uses a disk-based suffix tree as an index structure and employs lower-bound distance functions to filter out dissimilar subsequences without false dismissals. To make the index structure compact and thus accelerate query processing, we convert sequences of continuous values to sequences of discrete values via a categorization method and store only the subset of suffixes whose first values differ from their preceding values. The experimental results reveal that our proposed technique can be a few orders of magnitude faster than sequential scanning.
  • Keywords
    database indexing; query processing; tree data structures; Euclidean distance; continuous values; discrete values; disk-based suffix tree; experimental results; lower-bound distance functions; sampling rates; sequence databases; sequential scanning; similar subsequence searching; similarity measure; time warping distances; Acceleration; Databases; Filters; Indexing; Information retrieval; Length measurement; Sampling methods; Time measurement
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2000. Proceedings. 16th International Conference on
  • Conference_Location
    San Diego, CA
  • ISSN
    1063-6382
  • Print_ISBN
    0-7695-0506-6
  • Type
    conf
  • DOI
    10.1109/ICDE.2000.839384
  • Filename
    839384
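
The abstract names three ingredients: a time warping distance, a categorization step that turns continuous values into discrete symbols, and a rule that keeps only suffixes whose first value differs from the preceding value. The following is a minimal Python sketch of those three ideas, not the authors' implementation. It assumes the absolute difference as the per-element base distance and assumes that category boundaries are supplied by the caller; the function names `time_warping_distance`, `categorize`, and `sparse_suffix_starts` are hypothetical. The disk-based suffix tree and the lower-bound filtering functions are not shown.

```python
import bisect
from typing import List, Sequence


def time_warping_distance(s: Sequence[float], q: Sequence[float]) -> float:
    """Classic dynamic-programming time warping distance.

    Assumption: the base distance between two elements is |a - b|.
    Sequences may have different lengths.
    """
    n, m = len(s), len(q)
    if n == 0 or m == 0:
        return 0.0 if n == m else float("inf")
    inf = float("inf")
    # prev[j] holds the best cost of aligning the first i-1 elements of s
    # with the first j elements of q; only two rows are kept in memory.
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - q[j - 1])
            # An element may be matched after stretching s, stretching q,
            # or advancing both sequences together.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def categorize(values: Sequence[float], boundaries: Sequence[float]) -> List[int]:
    """Map each continuous value to a category index.

    Assumption: `boundaries` is a sorted list of cut points chosen by the
    caller; category k covers values between boundaries k-1 and k.
    """
    return [bisect.bisect_right(boundaries, v) for v in values]


def sparse_suffix_starts(symbols: Sequence[int]) -> List[int]:
    """Return start positions of suffixes whose first symbol differs from
    the symbol just before it (position 0 is always kept)."""
    return [i for i, c in enumerate(symbols) if i == 0 or c != symbols[i - 1]]


if __name__ == "__main__":
    # The second sequence repeats one value, as if sampled at a higher rate.
    print(time_warping_distance([1, 2, 3], [1, 2, 2, 3]))
    symbols = categorize([0.1, 0.2, 1.5, 1.6, 0.3], [1.0])
    print(symbols, sparse_suffix_starts(symbols))
```

In this sketch, runs of identical symbols contribute a single suffix start, which illustrates why storing only those suffixes can make an index more compact.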