• DocumentCode
    1708463
  • Title

    Polylogarithmic Approximation for Edit Distance and the Asymmetric Query Complexity

  • Author

    Andoni, Alexandr ; Krauthgamer, Robert ; Onak, Krzysztof

  • Author_Institution
    CCI, Princeton Univ., Princeton, NJ, USA
  • fYear
    2010
  • Firstpage
    377
  • Lastpage
    386
  • Abstract
    We present a near-linear time algorithm that approximates the edit distance between two strings within a polylogarithmic factor. For strings of length n and every fixed ε > 0, the algorithm computes a (log n)^{O(1/ε)} approximation in n^{1+ε} time. This is an exponential improvement over the previously known approximation factor, 2^{Õ(√(log n))}, with a comparable running time [Ostrovsky and Rabani, J. ACM 2007; Andoni and Onak, STOC 2009]. This result arises naturally in the study of a new asymmetric query model. In this model, the input consists of two strings x and y, and an algorithm can access y in an unrestricted manner, while being charged for querying every symbol of x. Indeed, we obtain our main result by designing an algorithm that makes a small number of queries in this model. We then provide a nearly-matching lower bound on the number of queries. Our lower bound is the first to expose hardness of edit distance stemming from the input strings being “repetitive”, which means that many of their substrings are approximately identical. Consequently, our lower bound provides the first rigorous separation between edit distance and Ulam distance.
  • Keywords
    approximation theory; computational complexity; query processing; string matching; Ulam distance; asymmetric query complexity; asymmetric query model; edit distance stemming; near-linear time algorithm; nearly-matching lower bound; polylogarithmic approximation factor; rigorous separation; symbol querying; edit distance; linear-time algorithms; query complexity; sampling; sublinear algorithms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on
  • Conference_Location
    Las Vegas, NV
  • ISSN
    0272-5428
  • Print_ISBN
    978-1-4244-8525-3
  • Type

    conf

  • DOI
    10.1109/FOCS.2010.43
  • Filename
    5671209