• DocumentCode
    610071
  • Title
    Computing Convolution on Grammar-Compressed Text
  • Author
    Tanaka, T. ; Tomohiro, I. ; Inenaga, S. ; Bannai, H. ; Takeda, Masayuki
  • Author_Institution
    Dept. of Inf., Kyushu Univ., Fukuoka, Japan
  • fYear
    2013
  • fDate
    20-22 March 2013
  • Firstpage
    451
  • Lastpage
    460
  • Abstract
    The convolution between a text string S of length N and a pattern string P of length m can be computed in O(N log m) time by FFT. It is known that various types of approximate string matching problems are reducible to convolution. In this paper, we assume that the input text string is given in a compressed form, as a straight-line program (SLP), which is a context-free grammar in Chomsky normal form that derives a single string. Given an SLP of size n describing a text S of length N, and an uncompressed pattern P of length m, we present a simple O(nm log m)-time algorithm to compute the convolution between S and P. We then show that this can be improved to O(min{nm, N - α} log m) time, where α ≥ 0 is a value that represents the amount of redundancy that the SLP captures with respect to the length-m substrings. The key to the improvement is our new algorithm that computes the convolution between a trie of size r and a pattern string P of length m in O(r log m) time.
  • Keywords
    computational complexity; context-free grammars; string matching; text analysis; Chomsky normal form; O(min{nm, N - α} log m) time; O(nm log m)-time algorithm; SLP; approximate string matching problems; context free grammar; convolution computation; grammar-compressed text; input text string; straight-line program; Computers; Context; Convolution; Data compression; Grammar; Pattern matching; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Compression Conference (DCC), 2013
  • Conference_Location
    Snowbird, UT
  • ISSN
    1068-0314
  • Print_ISBN
    978-1-4673-6037-1
  • Type
    conf
  • DOI
    10.1109/DCC.2013.53
  • Filename
    6543081
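
The abstract's baseline, convolution of an uncompressed text and pattern in O(N log m) time by FFT, and its remark that approximate string matching reduces to convolution, can be illustrated with a minimal sketch. This is not the paper's SLP or trie algorithm. It assumes the convolution is the sliding dot product c[i] = sum over j of a[i+j]*b[j], and it obtains the O(N log m) bound by cutting the text into overlapping blocks of length at most 2m - 1. The names `fft`, `multiply`, `sliding_dot`, and `hamming_mismatches` are hypothetical and chosen only for this example.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def multiply(x, y):
    """Product of two integer polynomials, computed via FFT."""
    size = 1
    while size < len(x) + len(y) - 1:
        size *= 2
    fx = fft([complex(v) for v in x] + [0j] * (size - len(x)))
    fy = fft([complex(v) for v in y] + [0j] * (size - len(y)))
    prod = fft([p * q for p, q in zip(fx, fy)], invert=True)
    # The inverse transform above is unscaled, so divide by size here.
    return [round(v.real / size) for v in prod[:len(x) + len(y) - 1]]


def sliding_dot(a, b):
    """Return c with c[i] = sum_j a[i+j]*b[j] for 0 <= i <= len(a)-len(b).

    Each block of length <= 2m-1 costs O(m log m) and yields up to m
    alignments, so about N/m blocks give O(N log m) overall.
    """
    n, m = len(a), len(b)
    rev = b[::-1]
    out = []
    for start in range(0, n - m + 1, m):
        block = a[start:start + 2 * m - 1]
        full = multiply(block, rev)
        # full[m-1+i] is the dot product of b with block[i:i+m].
        out.extend(full[m - 1:len(block)])
    return out


def hamming_mismatches(text, pattern):
    """Mismatch count at every alignment, one convolution per pattern symbol."""
    m = len(pattern)
    matches = [0] * (len(text) - m + 1)
    for c in set(pattern):
        t = [1 if ch == c else 0 for ch in text]
        p = [1 if ch == c else 0 for ch in pattern]
        for i, v in enumerate(sliding_dot(t, p)):
            matches[i] += v
    return [m - v for v in matches]
```

For example, `sliding_dot([1, 2, 3, 4, 5], [1, 1])` sums each pair of adjacent entries, and `hamming_mismatches("abracadabra", "abra")` reports zero mismatches at the first and last alignments. The per-symbol loop multiplies the cost by the number of distinct pattern symbols, which is a simplification made for this sketch.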