DocumentCode :
2497516
Title :
Comparison of stringmatching algorithms: an aid to information content security
Author :
Du, A-Ning ; Fang, Bin-Xing ; Yun, Xiao-chun ; Hu, Ming-Zeng ; Zheng, Xiu-rong
Author_Institution :
Nat. Comput. Inf. Content Security Key Lab., Harbin Inst. of Technol., China
Volume :
5
fYear :
2003
fDate :
2-5 Nov. 2003
Firstpage :
2996
Abstract :
We analyzed the core ideas of three basic string matching algorithms (KMP, BM, DFA), described the principles of five advanced online multi-pattern matching algorithms (AC, RAC, AQR, SBOM, Mgrep), and compared the matching efficiency of the five algorithms in terms of searching speed, preprocessing time, and memory usage on three web information string sets (Chinese phrases, URL strings, email address strings), focusing especially on how pattern set size and minimum pattern length influence efficiency. The comparison shows that the AQR algorithm is rather efficient for string matching on Chinese text and URL strings, while SBOM does better on email address matching. The skipping matching algorithms (such as Mgrep) are much more efficient for small pattern sets. A combination of efficient matching algorithms therefore seems likely to improve the performance and efficiency of information content security systems.
Keywords :
deterministic automata; finite automata; security of data; string matching; AC; AC automata combined with quick search algorithm; AQR; Aho-Corasick automata; BM; Boyer-Moore algorithm; Chinese text; DFA; KMP; Knuth-Morris-Pratt algorithm; Mgrep algorithm; RAC; SBOM; URL strings; deterministic finite automata; email address matching; information content security systems; matching efficiency; online multipattern matching algorithm; pattern length; preprocessing time; reverse AC automata; searching speed; set backward Oracle matching; string matching algorithms; web information string sets; Algorithm design and analysis; Computer security; Doped fiber amplifiers; Electronic mail; Information security; Intrusion detection; Laboratories; National security; Pattern matching; Uniform resource locators;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN :
0-7803-8131-9
Type :
conf
DOI :
10.1109/ICMLC.2003.1260090
Filename :
1260090