• DocumentCode
    1762371
  • Title
    Bypassing Space Explosion in High-Speed Regular Expression Matching
  • Author
    Patel, Jatin ; Liu, Alex X. ; Torng, Eric
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Michigan State Univ., East Lansing, MI, USA
  • Volume
    22
  • Issue
    6
  • fYear
    2014
  • fDate
    Dec. 2014
  • Firstpage
    1701
  • Lastpage
    1714
  • Abstract
    Network intrusion detection and prevention systems commonly use regular expression (RE) signatures to represent individual security threats. While the deterministic finite state automaton (DFA) for any one RE is typically small, the DFA that corresponds to the entire set of REs is usually too large to be constructed or deployed. To address this issue, a variety of alternative automata implementations that compress the size of the final automaton have been proposed, such as extended finite automata (XFA) and delayed input DFA (D²FA). The resulting final automata are typically much smaller than the corresponding DFA. However, the previously proposed automata construction algorithms suffer from some drawbacks. First, most employ a “Union then Minimize” framework where the automata for each RE are first joined before minimization occurs. This leads to an expensive nondeterministic finite automaton (NFA) to DFA subset construction on a relatively large NFA. Second, most construct the corresponding large DFA as an intermediate step. In some cases, this DFA is so large that the final automaton cannot be constructed even though the final automaton is small enough to be deployed. In this paper, we propose a “Minimize then Union” framework for constructing compact alternative automata, focusing on the D²FA. We show that we can construct an almost optimal final D²FA with small intermediate parsers. The key to our approach is a space- and time-efficient routine for merging two compact D²FAs into a compact D²FA. In our experiments, our algorithm runs on average 155 times faster and uses 1500 times less memory than previous algorithms. For example, we are able to construct a D²FA with over 80 000 000 states using only 1 GB of main memory in only 77 min.
  • Keywords
    deterministic automata; finite state machines; pattern matching; telecommunication security; D2FA; NFA; RE; XFA; bypassing space explosion; delayed input DFA; deterministic finite state automata; expensive nondeterministic finite automata; extended finite automata; high-speed regular expression matching; minimize then union framework; network intrusion detection; network prevention systems; regular expression signatures; security threats; union then minimize framework; Automata; Explosions; Intrusion detection; Memory management; Minimization; Standards; Deep packet inspection; information security; intrusion detection and prevention; network security; regular expression matching;
  • fLanguage
    English
  • Journal_Title
    Networking, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6692
  • Type
    jour
  • DOI
    10.1109/TNET.2014.2309014
  • Filename
    6807837