• DocumentCode
    3026912
  • Title

    Multiple pattern matching in LZW compressed text

  • Author

    Kida, Takuya ; Takeda, Masayuki ; Shinohara, Ayumi ; Miyazaki, Masamichi ; Arikawa, Setsuo

  • Author_Institution
    Dept. of Inf., Kyushu Univ., Fukuoka, Japan
  • fYear
    1998
  • fDate
    30 Mar-1 Apr 1998
  • Firstpage
    103
  • Lastpage
    112
  • Abstract
    We address the problem of searching in LZW compressed text directly, and present a new algorithm for finding multiple patterns by simulating the move of the Aho-Corasick (1975) pattern matching machine. The new algorithm finds all occurrences of multiple patterns, whereas the algorithm proposed by Amir, Benson, and Farach (see Journal of Computer and System Sciences, vol. 52, pp. 299-307, 1996) finds only the first occurrence of a single pattern. The new algorithm runs in O(n + m^2 + r) time using O(n + m^2) space, where n is the length of the compressed text, m is the total length of the patterns, and r is the number of occurrences of the patterns. We implemented a simple version of the algorithm and showed that it is approximately twice as fast as decompression followed by a search using the Aho-Corasick machine.
  • Keywords
    computational complexity; data compression; pattern matching; search problems; word processing; Aho-Corasick pattern matching machine; LZW compressed text; algorithm; compressed text length; decompression; multiple pattern matching; searching; Pattern matching;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Compression Conference, 1998. DCC '98. Proceedings
  • Conference_Location
    Snowbird, UT
  • ISSN
    1068-0314
  • Print_ISBN
    0-8186-8406-2
  • Type

    conf

  • DOI
    10.1109/DCC.1998.672136
  • Filename
    672136
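
The abstract compares the new algorithm against a baseline: decompression followed by a search using the Aho-Corasick machine. The following is a minimal Python sketch of that baseline only, not of the paper's compressed-domain algorithm. It assumes a textbook LZW variant with a 256-entry initial dictionary and unbounded code width, text restricted to characters with code points below 256, and non-empty patterns. All function names (lzw_compress, lzw_decompress, build_ac, ac_search, search_compressed) are hypothetical and chosen for illustration.

```python
from collections import deque


def lzw_compress(text):
    """Encode text as a list of LZW codes (assumed variant: 256 initial entries)."""
    table = {chr(i): i for i in range(256)}
    codes, w = [], ""
    for ch in text:
        if w + ch in table:
            w += ch
        else:
            codes.append(table[w])
            table[w + ch] = len(table)  # next free code
            w = ch
    if w:
        codes.append(table[w])
    return codes


def lzw_decompress(codes):
    """Rebuild the text from LZW codes, reconstructing the dictionary on the fly."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        else:
            # The code refers to the entry that is about to be created.
            entry = prev + prev[0]
        out.append(entry)
        table[len(table)] = prev + entry[0]
        prev = entry
    return "".join(out)


def build_ac(patterns):
    """Build goto, failure, and output tables of an Aho-Corasick machine."""
    goto, fail, out = [{}], [0], [[]]
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]  # inherit matches ending here
    return goto, fail, out


def ac_search(text, patterns):
    """Return (start_index, pattern) for every occurrence of every pattern."""
    goto, fail, out = build_ac(patterns)
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits


def search_compressed(codes, patterns):
    """Baseline: fully decompress first, then run the Aho-Corasick search."""
    return ac_search(lzw_decompress(codes), patterns)
```

For example, search_compressed(lzw_compress(text), ["ab", "bc"]) reports every occurrence of both patterns in text. This baseline materializes the whole decompressed text before searching, which is the step the abstract's algorithm avoids by simulating the Aho-Corasick machine on the compressed text directly.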