DocumentCode
2227091
Title
The Computational Hardness of Estimating Edit Distance [Extended Abstract]
Author
Andoni, Alexandr ; Krauthgamer, Robert
Author_Institution
Massachusetts Inst. of Technol., Cambridge
fYear
2007
fDate
21-23 Oct. 2007
Firstpage
724
Lastpage
734
Abstract
We prove the first non-trivial communication complexity lower bound for the problem of estimating the edit distance (aka Levenshtein distance) between two strings. A major feature of our result is that it provides the first setting in which the complexity of computing the edit distance is provably larger than that of Hamming distance. Our lower bound exhibits a trade-off between approximation and communication, asserting, for example, that protocols with O(1) bits of communication can only obtain approximation alpha >= Omega(log d/log log d), where d is the length of the input strings. This case of O(1) communication is of particular importance, since it captures constant-size sketches as well as embeddings into spaces like L1 and squared-L2, two prevailing algorithmic approaches for dealing with edit distance. Furthermore, the bound holds not only for strings over alphabet Sigma = {0, 1}, but also for strings that are permutations (called the Ulam metric). Besides being applicable to a much richer class of algorithms than all previous results, our bounds are near-tight in at least one case, namely of embedding permutations into L1. The proof uses a new technique that relies on Fourier analysis in a rather elementary way.
Keywords
computational complexity; string matching; Levenshtein distance; nontrivial communication complexity lower bound; string edit distance estimation; Algorithm design and analysis; Approximation algorithms; Biology computing; Complexity theory; Computational biology; Computational modeling; Computer science; Hamming distance; Nearest neighbor searches; Polynomials
fLanguage
English
Publisher
ieee
Conference_Titel
Foundations of Computer Science, 2007. FOCS '07. 48th Annual IEEE Symposium on
Conference_Location
Providence, RI
ISSN
0272-5428
Print_ISBN
978-0-7695-3010-9
Type
conf
DOI
10.1109/FOCS.2007.60
Filename
4389540