DocumentCode
2534211
Title
Differencing data streams
Author
Chawathe, Sudarshan S.
Author_Institution
Dept. of Comput. Sci., Maryland Univ., College Park, MD, USA
fYear
2005
fDate
25-27 July 2005
Firstpage
273
Lastpage
284
Abstract
We present external-memory algorithms for differencing large hierarchical datasets. Our methods are especially suited to streaming data with bounded differences. For input sizes m and n and maximum output (difference) size e, the I/O, RAM, and CPU costs of our algorithm rdiff are, respectively, m + n, 4e + 8, and O(MN). That is, given 4e + 8 blocks of RAM, our algorithm performs no I/O operations other than those required to read both inputs. We also present a variant of the algorithm that uses only four blocks of RAM, with I/O cost 8me + 18m + n + 6e + 5 and CPU cost O(MN).
Keywords
computational complexity; data handling; random-access storage; storage management; very large databases; RAM; data streams; external-memory algorithms; large hierarchical dataset differencing; Change detection algorithms; Computer science; Costs; Database systems; Educational institutions; Random access memory; Read-write memory; Spatial databases; Testing; Warehousing;
fLanguage
English
Publisher
ieee
Conference_Titel
Database Engineering and Application Symposium, 2005. IDEAS 2005. 9th International
ISSN
1098-8068
Print_ISBN
0-7695-2404-4
Type
conf
DOI
10.1109/IDEAS.2005.21
Filename
1540917
Link To Document