Title :
Accelerating regular expression matching over compressed HTTP
Author :
Becchi, Michela ; Bremler-Barr, Anat ; Hay, David ; Kochba, Omer ; Koral, Yaron
Author_Institution :
Univ. of Missouri, Columbia, MO, USA
fDate :
April 26 - May 1, 2015
Abstract :
This paper focuses on regular expression matching over compressed traffic. The need for such matching arises from two independent trends. First, the volume and share of compressed HTTP traffic are constantly increasing. Second, because of their superior expressiveness, regular expressions are used more and more frequently by current Deep Packet Inspection engines. We present an algorithmic framework that accelerates such matching by taking advantage of information gathered when the traffic was initially compressed. HTTP compression is typically performed through the GZIP protocol, which uses back-references to repeated strings. Our algorithm is based on calculating, for every byte, the minimum number of previous bytes that can be part of a future regular expression match. When inspecting a back-reference, only these bytes need to be taken into account, which makes it possible to skip repeated strings almost entirely without missing a match. We show that our generic framework works with either NFA-based or DFA-based implementations and yields performance gains of more than 70%. Moreover, it can be readily adapted to most existing regular expression matching algorithms, which are usually based on NFAs, DFAs, or combinations of the two. Finally, we discuss other applications in which calculating the number of relevant bytes is useful, even when the traffic is not compressed.
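Illustrative note (not part of the original record): the abstract relies on the idea of a back-reference, so a minimal sketch of how such references expand may help. It is not the paper's algorithm. It only shows the LZ77-style copy step that GZIP-compressed data uses, with a hypothetical token format (a literal byte as an `int`, or a `(distance, length)` pair) and a hypothetical function name `expand`, both chosen here for illustration.

```python
from typing import Iterable, Tuple, Union

# Hypothetical token format, assumed for this sketch:
# an int is a literal byte; a (distance, length) tuple is a back-reference.
Token = Union[int, Tuple[int, int]]


def expand(tokens: Iterable[Token]) -> bytes:
    """Expand literals and back-references into the uncompressed byte stream."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            distance, length = tok
            if not 0 < distance <= len(out):
                raise ValueError("back-reference points before start of output")
            start = len(out) - distance
            # Copy byte by byte so that overlapping references
            # (length > distance) repeat the bytes just written.
            for i in range(length):
                out.append(out[start + i])
        else:
            out.append(tok)
    return bytes(out)


# "abcd", then copy 6 bytes starting 4 bytes back, then "!".
tokens = list(b"abcd") + [(4, 6)] + list(b"!")
decoded = expand(tokens)
print(decoded)
```

A scanner that ignores the compression structure would rescan all six copied bytes. The framework described in the abstract instead aims to inspect only the bytes of a back-reference that can still contribute to a match.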
Keywords :
data compression; deterministic automata; finite automata; hypermedia; pattern matching; transport protocols; DFA-based implementation; GZIP protocol; NFA-based implementation; compressed HTTP traffic; deterministic finite automata; hypertext transfer protocol; nondeterministic finite automata; regular expression matching; Acceleration; Automata; Computers; Conferences; Estimation; Inspection; Pattern matching
Conference_Title :
Computer Communications (INFOCOM), 2015 IEEE Conference on
Conference_Location :
Kowloon
DOI :
10.1109/INFOCOM.2015.7218421