DocumentCode
2407944
Title
String parsing-based similarity detection
Author
Yang, Jia ; Speidel, Ulrich
Author_Institution
Dept. of Comput. Sci., Auckland Univ., New Zealand
fYear
2005
fDate
29 Aug.-1 Sept. 2005
Abstract
This paper compares the similarity-detection abilities of two string parsing algorithms from the Lempel-Ziv (LZ) family and the T-decomposition algorithm proposed by Titchener against the Hamming and Levenshtein measures. Our results show that the LZ-based and T-decomposition-based measures work in a wider range of contexts than the Hamming and Levenshtein measures. We also argue that T-decomposition-based measures represent a good compromise between accuracy and time complexity.
Keywords
Hamming codes; computational complexity; data compression; program compilers; Hamming measure; Lempel-Ziv family; Levenshtein measure; T-decomposition algorithm; context wider range; similarity-detection ability; string parsing algorithms; Area measurement; Automata; Compression algorithms; Computer science; Data compression; Data mining; History; Length measurement; Production; Time measurement
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Theory Workshop, 2005 IEEE
Print_ISBN
0-7803-9480-1
Type
conf
DOI
10.1109/ITW.2005.1531901
Filename
1531901