DocumentCode :
1048179
Title :
An efficient algorithm to compute differences between structured documents
Author :
Lee, Kyong-Ho ; Choy, Yoon-Chul ; Cho, Sung-Bae
Author_Institution :
Dept. of Comput. Sci., Yonsei Univ., South Korea
Volume :
16
Issue :
8
fYear :
2004
Firstpage :
965
Lastpage :
979
Abstract :
SGML and XML are having a profound impact on data modeling and processing. We present an efficient algorithm to compute differences between old and new versions of an SGML/XML document. The difference between the two versions can be considered to be an edit script that transforms one document tree into another. The proposed algorithm is based on a hybridization of bottom-up and top-down methods: the matching relationships between nodes in the two versions are produced in a bottom-up manner, and then a top-down breadth-first search computes an edit script. Faster matching is achieved because the algorithm does not need to investigate the possible existence of matchings for all nodes. Furthermore, it can detect structurally meaningful changes, such as moving or copying a subtree, as well as simple changes to a node itself, such as insertion, deletion, and update.
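The two-phase idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simplified stand-in that assumes a hypothetical `Node` type, matches subtrees by a bottom-up structural signature, and then walks matched pairs breadth-first from the root to emit insert, delete, and update operations. Move and copy detection, which the paper supports, is omitted here. The names `Node`, `signatures`, and `diff` are illustrative assumptions.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity-based comparison for nodes
class Node:
    """Hypothetical document-tree node (not taken from the paper)."""
    label: str
    text: str = ""
    children: list = field(default_factory=list)


def signatures(root):
    """Bottom-up pass: map id(node) to a signature of its whole subtree."""
    sig = {}

    def visit(node):
        s = (node.label, node.text, tuple(visit(c) for c in node.children))
        sig[id(node)] = s
        return s

    visit(root)
    return sig


def diff(old, new):
    """Top-down breadth-first pass that emits a simple edit script.

    Assumes both roots carry the same label. Identical subtrees are
    skipped as a unit, so their descendants are never inspected.
    """
    sig_old, sig_new = signatures(old), signatures(new)
    script = []
    queue = deque([(old, new)])
    while queue:
        o, n = queue.popleft()
        if sig_old[id(o)] == sig_new[id(n)]:
            continue  # identical subtrees: nothing to report below here
        if o.text != n.text:
            script.append(("update", n.label, o.text, n.text))
        unused = list(o.children)
        for c in n.children:
            # Prefer an identical old subtree, then any old child with the same label.
            m = next((x for x in unused if sig_old[id(x)] == sig_new[id(c)]), None)
            if m is None:
                m = next((x for x in unused if x.label == c.label), None)
            if m is None:
                script.append(("insert", c.label, n.label))
            else:
                unused.remove(m)
                queue.append((m, c))
        for x in unused:
            script.append(("delete", x.label, o.label))
    return script


old_doc = Node("doc", children=[Node("title", "A"), Node("para", "x"), Node("para", "y")])
new_doc = Node("doc", children=[Node("title", "B"), Node("para", "x"), Node("note", "z")])
print(diff(old_doc, new_doc))
```

Each tuple in the result names the operation, the affected node label, and either the parent label (for insert and delete) or the old and new text (for update). Skipping identical subtrees as a unit mirrors, in a much reduced form, the abstract's point that not every node has to be examined for a possible match.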
Keywords :
XML; computational complexity; data models; tree data structures; tree searching; SGML; bottom-up methods; breadth-first search; change detection; data modeling; data processing; edit operation; edit script; structured documents; subtree; top-down methods; Business; Computer Society; Data mining; Databases; Design methodology; Electronic commerce; Software libraries; Warehousing; difference computation
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2004.19
Filename :
1318581