DocumentCode
1837244
Title
An Edit Distance Algorithm with Block Swap
Author
Xia, Tian
Author_Institution
Key Lab. of Data Eng. & Knowledge Eng., Renmin Univ. of China, Beijing
fYear
2008
fDate
18-21 Nov. 2008
Firstpage
54
Lastpage
59
Abstract
The edit distance between two given strings X and Y is the minimum number of edit operations that transform X into Y. Ordinarily, string editing is based on character insert, delete, and substitute operations. It has been suggested that extending this model with block edits would be useful in applications such as DNA sequence comparison and sentence similarity computation. However, existing algorithms have generally focused on the normalized edit distance, and few of them consider block swap operations at a higher level. In this paper, we introduce an extended edit distance algorithm that permits insertions, deletions, and substitutions at the character level and also permits block swap operations. Experimental results on randomly generated strings verify the algorithm's rationality and efficiency. The main contributions of this paper are an algorithm that computes the lowest edit cost for string transformation with block swap in polynomial time, and a breaking points selection algorithm that improves the computation speed.
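As a point of reference for the baseline model the abstract starts from, the following is a minimal Python sketch of the classic character-level edit distance with insert, delete, and substitute operations. It is the textbook dynamic program, not the paper's block swap algorithm or its breaking points selection; the function name `edit_distance` and the unit cost per operation are assumptions made for illustration.

```python
def edit_distance(x: str, y: str) -> int:
    """Classic character-level edit distance; unit costs are an assumption."""
    # prev[j] is the distance between the already-processed prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        # Turning x[:i] into the empty string takes i deletions.
        curr = [i]
        for j, cy in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,                # delete cx from x
                curr[j - 1] + 1,            # insert cy into x
                prev[j - 1] + (cx != cy),   # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]
```

Under this baseline, exchanging two adjacent blocks, as in "ab" versus "ba", is charged as separate character edits, which is the kind of case a block swap operation is meant to address.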
Keywords
string matching; block swap operation; breaking points selection algorithm; character insert; normalized edit distance algorithm; string editing; string transformation; Conference management; Data engineering; Engineering management; Information management; Information resources; Knowledge engineering; Knowledge management; Laboratories; Resource management; Sequences; block swap; edit distance; edit operation
fLanguage
English
Publisher
IEEE
Conference_Titel
Young Computer Scientists, 2008. ICYCS 2008. The 9th International Conference for
Conference_Location
Hunan
Print_ISBN
978-0-7695-3398-8
Electronic_ISBN
978-0-7695-3398-8
Type
conf
DOI
10.1109/ICYCS.2008.14
Filename
4708948