  • DocumentCode
    2407944
  • Title
    String parsing-based similarity detection
  • Author
    Yang, Jia ; Speidel, Ulrich
  • Author_Institution
    Dept. of Comput. Sci., Auckland Univ., New Zealand
  • fYear
    2005
  • fDate
    29 Aug.-1 Sept. 2005
  • Abstract
    This paper compares the similarity-detection abilities of two string parsing algorithms from the Lempel-Ziv family and the T-decomposition algorithm proposed by Titchener against the Hamming and Levenshtein measures. Our results show that LZ- and T-decomposition-based measures work in a wider range of contexts. We also argue that T-decomposition-based measures represent a good compromise between accuracy and time complexity.
  • Keywords
    Hamming codes; computational complexity; data compression; program compilers; Hamming measure; Lempel-Ziv family; Levenshtein measure; T-decomposition algorithm; context wider range; similarity-detection ability; string parsing algorithms; Area measurement; Automata; Compression algorithms; Computer science; Data compression; Data mining; History; Length measurement; Production; Time measurement
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Information Theory Workshop, 2005 IEEE
  • Print_ISBN
    0-7803-9480-1
  • Type
    conf
  • DOI
    10.1109/ITW.2005.1531901
  • Filename
    1531901